sql - SELECT fixed number of rows by evenly skipping rows -


i trying write query returns arbitrary sized representative sample of data. selecting nth rows n such entire result set close possible arbitrary size.

i want work in cases result set less arbitrary size. in such case, entire result set should returned.

i found this question shows how select every nth row.


here have far:

select * (    select *, ((row_number() on (order "time"))                % ceil(count(*)::float / 500::float)::bigint) rn    data_raw) sa sa.rn=0; 

this results in following error:

error: column "data_raw.serial" must appear in group clause or used in aggregate function position: 23


removing calculation n works:

select * (    select *, (row_number() on (order "time"))               % 50 rn data_raw) sa limit 500; 


tried moving calculation clause:

select * (    select *, (row_number() on (order "time")) rn    data_raw) sa (sa.rn % ceil(count(*)::float / 500::float)::bigint)=0; 

that results in error:

error: aggregate functions not allowed in position: 108


have ideas on either how fix query or better way this?

i have thought using random numbers , probability select rows, rather deterministic without possibility of clumping.

the mistake in first attempt can't mix aggregate function count(*) un-aggregated selection of rows. can fix using count() window-aggregate function instead:

select * (    select *, ((row_number() on (order "time"))                % ceil(count(*) on () / 500.0)::int) rn      data_raw    ) sub sub.rn = 0;

detailed explanation here:

@alexander has fix last attempt.


Comments

Popular posts from this blog

java - Oracle EBS .ClassNotFoundException: oracle.apps.fnd.formsClient.FormsLauncher.class ERROR -

c# - how to use buttonedit in devexpress gridcontrol -

How do you convert a timestamp into a datetime in python with the correct timezone? -