Is postgres able to share sorts required by common partition window functions?

Поиск
Список
Период
Сортировка
От Sebastien Arod
Тема Is postgres able to share sorts required by common partition window functions?
Дата
Msg-id CADd42iFZWwYNsXjEM_3HWK3QnfiCrMNmpOkZqyBQCabnVxOPtw@mail.gmail.com
обсуждение исходный текст
Ответы Re: Is postgres able to share sorts required by common partition window functions?  (Michael Lewis <mlewis@entrata.com>)
Re: Is postgres able to share sorts required by common partition window functions?  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
Hi all,

I'm trying to optimize the following query on postgres 11.6 (running on Aurora)
select distinct
  c1,
  first_value(c2) OVER (PARTITION BY c1 order by c2) AS c2,
  first_value(c3) OVER (PARTITION BY c1 order by c3) AS c3,
  first_value(c4) OVER (PARTITION BY c1 order by c4) AS c4
from  
  t;

 
From the explain plan (attached at the end of the email) I see that postgresql is doing several sorts one per window function and one for the distinct that seems ok.
However all the window functions being on the same partition I would have expected postgresql to "share" a preliminary sort on c1 that would then be useful to reduce the work on all window functions but it doesn't.
I even created an index on c1 hoping that postgresql would be able to use it in order to minimize the cost of the sorts but I couldn't make it use it.

Is there something I am missing?

You can find below a script to set up a table and data to reproduce as well as the explain plan.

Setup Script
create table t(
pk varchar(200) PRIMARY key,
c1 varchar(200),
c2 varchar(200),
c3 varchar(200),
c4 varchar(200)
);
create index i1 on t (c1);

insert into t
(pk, c1, c2, c3, c4 )
select
 generate_series::text pk,
 'Grp' ||(generate_series / 4)::text c1,
 generate_series::text c2,
 generate_series::text c3,
 generate_series::text c4
from generate_series(0, 1000000);

Explain Plan
Unique  (cost=808480.87..820980.88 rows=1000001 width=123) (actual time=7131.675..7781.082 rows=250001 loops=1)
  ->  Sort  (cost=808480.87..810980.87 rows=1000001 width=123) (actual time=7131.673..7603.926 rows=1000001 loops=1)
        Sort Key: c1, (first_value(c2) OVER (?)), (first_value(c3) OVER (?)), (first_value(c4) OVER (?))
        Sort Method: external merge  Disk: 59640kB
        ->  WindowAgg  (cost=558937.90..578937.92 rows=1000001 width=123) (actual time=5179.374..6268.937 rows=1000001 loops=1)
              ->  Sort  (cost=558937.90..561437.90 rows=1000001 width=91) (actual time=5179.355..5679.136 rows=1000001 loops=1)
                    Sort Key: c1, c4
                    Sort Method: external merge  Disk: 52912kB
                    ->  WindowAgg  (cost=336736.93..356736.95 rows=1000001 width=91) (actual time=3260.950..4389.116 rows=1000001 loops=1)
                          ->  Sort  (cost=336736.93..339236.93 rows=1000001 width=59) (actual time=3260.934..3778.385 rows=1000001 loops=1)
                                Sort Key: c1, c3
                                Sort Method: external merge  Disk: 46176kB
                                ->  WindowAgg  (cost=141877.96..161877.98 rows=1000001 width=59) (actual time=1444.692..2477.284 rows=1000001 loops=1)
                                      ->  Sort  (cost=141877.96..144377.96 rows=1000001 width=27) (actual time=1444.669..1906.993 rows=1000001 loops=1)
                                            Sort Key: c1, c2
                                            Sort Method: external merge  Disk: 39424kB
                                            ->  Seq Scan on t  (cost=0.00..18294.01 rows=1000001 width=27) (actual time=0.011..177.815 rows=1000001 loops=1)
Planning Time: 0.214 ms
Execution Time: 7839.646 ms

В списке pgsql-general по дате отправления:

Предыдущее
От: Christophe Courtois
Дата:
Сообщение: Pgbench : vacuum default behaviour history
Следующее
От: Michael Lewis
Дата:
Сообщение: Re: Apply LIMIT when computation is logically irrelevant