Re: PoC: Partial sort

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: PoC: Partial sort
Дата
Msg-id CAPpHfdtqfqm+WR7cQWK-f0+qD_F4VNuvPuyvjmR7xCxQODAXrw@mail.gmail.com
обсуждение исходный текст
Ответ на PoC: Partial sort  (Alexander Korotkov <aekorotkov@gmail.com>)
Ответы Re: PoC: Partial sort
Re: PoC: Partial sort
Список pgsql-hackers
Hi!

Next revision. It expected to do better work with optimizer. It introduces presorted_keys argument of cost_sort function which represent number of keys already sorted in Path. Then this function uses estimate_num_groups to estimate number of groups with different values of presorted keys and assumes that dataset is uniformly divided by groups. get_cheapest_fractional_path_for_pathkeys tries to select the path matching most part of path keys.
You can see it's working pretty good on single table queries.

create table test as (select id, (random()*5)::int as v1, (random()*1000)::int as v2 from generate_series(1,1000000) id);
create index test_v1_idx on test (v1);
create index test_v1_v2_idx on test (v1, v2);
create index test_v2_idx on test (v2);
vacuum analyze;

postgres=# explain analyze select * from test order by v1, id;
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Sort  (cost=149244.84..151744.84 rows=1000000 width=12) (actual time=2111.476..2586.493 rows=1000000 loops=1)
   Sort Key: v1, id
   Sort Method: external merge  Disk: 21512kB
   ->  Seq Scan on test  (cost=0.00..15406.00 rows=1000000 width=12) (actual time=0.012..113.815 rows=1000000 loops=1)
 Total runtime: 2683.011 ms
(5 rows)

postgres=# explain analyze select * from test order by v1, id limit 10;
                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11441.77..11442.18 rows=10 width=12) (actual time=79.980..79.982 rows=10 loops=1)
   ->  Partial sort  (cost=11441.77..53140.44 rows=1000000 width=12) (actual time=79.978..79.978 rows=10 loops=1)
         Sort Key: v1, id
         Presorted Key: v1
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using test_v1_idx on test  (cost=0.42..47038.83 rows=1000000 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
 Total runtime: 81.786 ms
(7 rows)

postgres=# explain analyze select * from test order by v1, v2 limit 10;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.90 rows=10 width=12) (actual time=0.031..0.047 rows=10 loops=1)
   ->  Index Scan using test_v1_v2_idx on test  (cost=0.42..47286.28 rows=1000000 width=12) (actual time=0.029..0.043 rows=10 loops=1)
 Total runtime: 0.083 ms
(3 rows)

postgres=# explain analyze select * from test order by v2, id;
                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Partial sort  (cost=97.75..99925.50 rows=1000000 width=12) (actual time=1.069..1299.481 rows=1000000 loops=1)
   Sort Key: v2, id
   Presorted Key: v2
   Sort Method: quicksort  Memory: 52kB
   ->  Index Scan using test_v2_idx on test  (cost=0.42..47603.79 rows=1000000 width=12) (actual time=0.030..812.083 rows=1000000 loops=1)
 Total runtime: 1393.850 ms
(6 rows)

However, work with joins needs more improvements.

------
With best regards,
Alexander Korotkov.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: ALTER SYSTEM SET command to change postgresql.conf parameters
Следующее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: PoC: Partial sort