Re: tablesample performance

From Tom Lane
Subject Re: tablesample performance
Date
Msg-id 24207.1476812072@sss.pgh.pa.us
In reply to Re: tablesample performance  (Andy Colson <andy@squeakycode.net>)
Responses Re: tablesample performance  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-general
Andy Colson <andy@squeakycode.net> writes:
> On 10/18/2016 11:44 AM, Francisco Olarte wrote:
>> This should be faster, but to me it seems it does a different thing.

> Ah, yes, you're right, there is a bit of a difference there.

If you don't want to have an implicit bias towards earlier blocks,
I don't think that either standard tablesample method is really what
you want.

The contrib/tsm_system_rows tablesample method is a lot closer, in
that it will start at a randomly chosen block, but if you just do
"tablesample system_rows(1)" then you will always get the first row
in whichever block it lands on, so it's still not exactly unbiased.
Maybe you could select "tablesample system_rows(100)" or so and then
do the order-by-random trick on that sample.  This would be a lot
faster than selecting 100 random rows with either built-in sample
method, since the rows it grabs will be consecutive.
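
A minimal sketch of that suggestion, for illustration only. It assumes
the tsm_system_rows extension can be installed in the database, and
"my_table" is a hypothetical table name that does not come from this
thread:

```sql
-- One-time setup: tsm_system_rows ships in contrib and must be
-- installed before the system_rows method can be used.
CREATE EXTENSION IF NOT EXISTS tsm_system_rows;

-- Grab up to 100 rows starting from a randomly chosen block, then
-- pick one of them at random.  "my_table" is a placeholder name.
SELECT *
FROM my_table TABLESAMPLE system_rows(100)
ORDER BY random()
LIMIT 1;
```

The sort only has to shuffle the sampled rows rather than the whole
table, so the order-by-random step stays cheap.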

            regards, tom lane

