Re: Selecting K random rows - efficiently!

From: Scott Marlowe
Subject: Re: Selecting K random rows - efficiently!
Msg-id: dcc563d10710242001o1f008a06u8ae684a67691e68b@mail.gmail.com
In response to: Selecting K random rows - efficiently!  (cluster <skrald@amossen.dk>)
Responses: Re: Selecting K random rows - efficiently!
List: pgsql-general
Here's how I would do it.  This assumes a static table that doesn't
change a lot.

1: Find the row count n of the table.
2: Randomly assign the numbers 1 through n to the rows, one each.  How
to do this is a whole other post; one way is sketched after the
example query below.
3: Create a sequence.  If you always need 10 or 100 random rows, set
the increment to that number.  Set it to cycle at the size of the
table.
4: Select nextval('sequence') => nv and use it in a select:

select * from myrandomtable where id between nv and nv+100; --  or
whatever your increment is.
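
Here's a minimal sketch of steps 1 through 4 to make that concrete.
It assumes a PostgreSQL new enough to have window functions and CTEs,
and a hypothetical source table called mytable; the table, index, and
sequence names are just illustrative.

-- Steps 1 and 2: copy the data, giving each row a random id from 1 to n.
create table myrandomtable as
  select row_number() over (order by random()) as id, t.*
  from mytable t;
create index myrandomtable_id_idx on myrandomtable (id);

-- Step 3: a sequence that steps by the batch size (100 here) and
-- cycles at the row count n of the table (set maxvalue to your actual n).
create sequence myrandom_seq increment by 100 maxvalue 100000 cycle;

-- Step 4: pull the next starting point and fetch a batch in one query.
with s as (select nextval('myrandom_seq') as nv)
select m.*
from myrandomtable m, s
where m.id between s.nv and s.nv + 99;  -- +99 so batches of 100 don't overlap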

There are refinements to this.  The advantages, with a static data
set, are that you can cluster on the randomized id and get chunks of
the random dataset VERY quickly, and you won't repeat any results
until the sequence wraps around.  You can re-randomize the table every
x hours, days, or weeks to meet your needs.  If you don't want to
re-randomize it during the day, just load the randomized data set into
the table enough times that the sequence won't roll over until the
next day/week, etc.
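
Continuing with the hypothetical names from the sketch above, the
clustering and re-randomizing steps could look roughly like this:

-- Cluster on the randomized id so each batch is a contiguous chunk on disk.
cluster myrandomtable using myrandomtable_id_idx;
analyze myrandomtable;

-- Re-randomize (say, nightly): deal out a fresh random permutation of the
-- ids, then re-cluster and restart the sequence.
update myrandomtable m
   set id = r.new_id
  from (select id as old_id,
               row_number() over (order by random()) as new_id
          from myrandomtable) r
 where m.id = r.old_id;
cluster myrandomtable;
alter sequence myrandom_seq restart with 1;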

Does that make sense?

If your data changes all the time, you've got a more difficult problem
to deal with.
