Re: PoC: Partial sort
От | Alexander Korotkov |
---|---|
Тема | Re: PoC: Partial sort |
Дата | |
Msg-id | CAPpHfdvHBXX5CVsbt8a8ao3RomHe+WZ3Ph06_gHSr93pQo1aXw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: PoC: Partial sort (Andreas Karlsson <andreas@proxel.se>) |
Ответы |
Re: PoC: Partial sort
|
Список | pgsql-hackers |
On Tue, Dec 24, 2013 at 6:02 AM, Andreas Karlsson <andreas@proxel.se> wrote:
------
With best regards,
Alexander Korotkov.
On 12/22/2013 04:38 PM, Alexander Korotkov wrote:Have you thought about how do you plan to print which sort method and how much memory was used? Several different sort methods may have been use in the query. Should the largest amount of memory/disk be printed?postgres=# explain analyze select * from test order by v1, id limit 10;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11441.77..11442.18 rows=10 width=12) (actual
time=79.980..79.982 rows=10 loops=1)
-> Partial sort (cost=11441.77..53140.44 rows=1000000 width=12)
(actual time=79.978..79.978 rows=10 loops=1)
Sort Key: v1, id
Presorted Key: v1
Sort Method: top-N heapsort Memory: 25kB
-> Index Scan using test_v1_idx on test (cost=0.42..47038.83
rows=1000000 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
Total runtime: 81.786 ms
(7 rows)
Apparently, now amount of memory for sorted last group is printed. Your proposal makes sense: largest amount of memory/disk should be printed.
That would be really nice to have, but the patch seems useful even without the improvements to joins.However, work with joins needs more improvements.
Attached revision of patch implements partial sort usage in merge joins.
create table test1 as (
select id,
(random()*100)::int as v1,
(random()*10000)::int as v2
from generate_series(1,1000000) id);
create table test2 as (
select id,
(random()*100)::int as v1,
(random()*10000)::int as v2
from generate_series(1,1000000) id);
create index test1_v1_idx on test1 (v1);
create index test2_v1_idx on test2 (v1);
create index test1_v1_idx on test1 (v1);
create index test2_v1_idx on test2 (v1);
# explain select * from test1 t1 join test2 t2 on t1.v1 = t2.v1 and t1.v2 = t2.v2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Merge Join (cost=2257.67..255273.39 rows=983360 width=24)
Merge Cond: ((t1.v1 = t2.v1) AND (t1.v2 = t2.v2))
-> Partial sort (cost=1128.84..116470.79 rows=1000000 width=12)
Sort Key: t1.v1, t1.v2
Presorted Key: t1.v1
-> Index Scan using test1_v1_idx on test1 t1 (cost=0.42..47604.01 rows=1000000 width=12)
-> Materialize (cost=1128.83..118969.00 rows=1000000 width=12)
-> Partial sort (cost=1128.83..116469.00 rows=1000000 width=12)
Sort Key: t2.v1, t2.v2
Presorted Key: t2.v1
-> Index Scan using test2_v1_idx on test2 t2 (cost=0.42..47602.22 rows=1000000 width=12)
I believe now patch covers desired functionality. I'm going to focus on nailing down details, refactoring and documenting.
With best regards,
Alexander Korotkov.
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "MauMau"Дата:
Сообщение: Re: [bug fix] connection service file doesn't take effect with ECPG apps