Re: Re: Abbreviated keys for Datum tuplesort

From Tomas Vondra
Subject Re: Re: Abbreviated keys for Datum tuplesort
Date
Msg-id 54E79F9C.4090208@2ndquadrant.com
In response to Re: Re: Abbreviated keys for Datum tuplesort  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: Re: Abbreviated keys for Datum tuplesort
List pgsql-hackers
On 25.1.2015 12:15, Andrew Gierth wrote:
>
> So given some suitable test data, such as
> 
> create table stuff as select random()::text as randtext
>   from generate_series(1,1000000);  -- or however many rows
> 
> you can do
> 
> select percentile_disc(0) within group (order by randtext) from stuff;
> 
> or
> 
> select count(distinct randtext) from stuff;
> 
> The performance improvements I saw were pretty much exactly as
> expected from the improvement in the ORDER BY and CREATE INDEX cases.

I've spent a fair amount of time testing this today, and when using the
simple percentile_disc example mentioned above, I see this pattern:

                                master   patched   speedup
---------------------------------------------------------
generate_series(1,1000000)       4.2       0.7      6
generate_series(1,2000000)       9.2       9.8      0.93
generate_series(1,3000000)      14.5      15.3      0.95

so for a small dataset the speedup is very nice, but for the larger sets
there's a 5-7% slowdown. Is this expected?
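
For anyone who wants to repeat the comparison, here is a minimal psql
sketch. It assumes the table definition from the quoted message; the use
of \timing and the DROP/CREATE cycle are illustrative assumptions, not
necessarily the exact procedure behind the numbers above.

```sql
-- Hypothetical reproduction script: run it once against a master build
-- and once against a patched build, then compare the reported timings.
\timing on

DROP TABLE IF EXISTS stuff;

-- Vary the upper bound to match the rows in the table above:
-- 1000000, 2000000, 3000000.
CREATE TABLE stuff AS
  SELECT random()::text AS randtext
  FROM generate_series(1, 2000000);

-- Ordered-set aggregate that exercises the Datum tuplesort path.
SELECT percentile_disc(0) WITHIN GROUP (ORDER BY randtext) FROM stuff;
```

Running the final SELECT several times per build and comparing the
typical timing, rather than a single run, should help separate a real
regression from noise.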


-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


