Re: Why don't we consider explicit Incremental Sort?
От | Richard Guo |
---|---|
Тема | Re: Why don't we consider explicit Incremental Sort? |
Дата | |
Msg-id | CAMbWs4-3Z-pcHCWq3gBVHp=43kqaF3Quhwe5cUg-x9O54Lv2CA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Why don't we consider explicit Incremental Sort? (Tomas Vondra <tomas@vondra.me>) |
Список | pgsql-hackers |
On Mon, Sep 23, 2024 at 10:01 PM Tomas Vondra <tomas@vondra.me> wrote: > You don't need any skew. Consider this perfectly uniform dataset: > > create table t (a int, b int); > insert into t select 100000 * random(), 100 * random() > from generate_series(1,1000000) s(i); > create index on t (a); > vacuum analyze; > checkpoint; > > explain analyze select * from t order by a, b; I think if we want to compare the performance of incremental sort vs. full sort on this dataset, we need to ensure that other variables are kept constant, ie we need to ensure that both methods use the same subpath, whether it's Index Scan or Seq Scan. This is especially true in the scenario addressed by this patch, as for a mergejoin, we only consider outerrel's cheapest_total_path when we assume that an explicit sort atop is needed. That is to say, the subpath has already been chosen when it comes to determine whether to use incremental sort or full sort. According to this theory, here is what I got on this same dataset: -- incremental sort explain analyze select * from t order by a, b; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Incremental Sort (cost=1.02..68838.65 rows=1000000 width=8) (actual time=1.292..1564.265 rows=1000000 loops=1) Sort Key: a, b Presorted Key: a Full-sort Groups: 27022 Sort Method: quicksort Average Memory: 25kB Peak Memory: 25kB -> Index Scan using t_a_idx on t (cost=0.42..37244.20 rows=1000000 width=8) (actual time=0.408..1018.785 rows=1000000 loops=1) Planning Time: 0.998 ms Execution Time: 1602.861 ms (7 rows) -- full sort set enable_incremental_sort to off; set enable_seqscan to off; explain analyze select * from t order by a, b; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Sort (cost=150576.54..153076.54 rows=1000000 width=8) (actual time=1760.090..1927.598 rows=1000000 loops=1) Sort Key: a, b Sort Method: external merge Disk: 17704kB -> Index Scan using t_a_idx on t (cost=0.42..37244.20 rows=1000000 width=8) (actual time=0.531..1010.931 rows=1000000 loops=1) Planning Time: 0.980 ms Execution Time: 1970.287 ms (6 rows) So incremental sort outperforms full sort on this dataset. Thanks Richard
В списке pgsql-hackers по дате отправления: