Re: Window Function "Run Conditions"

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Window Function "Run Conditions"
Дата
Msg-id CAKU4AWpmDGMxQ3nJfj=awQAgcL-jP2p67MtpiHFZX77NaZ2hxQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Window Function "Run Conditions"  (Andy Fan <zhihui.fan1213@gmail.com>)
Ответы Re: Window Function "Run Conditions"
Re: Window Function "Run Conditions"
Список pgsql-hackers
Hi David: 


I just came to the office today to double check this patch.  I probably can
finish it very soon. 

I would share my current review result first and more review is still in progress.
There is a lot of amazing stuff there but I'd save the simple +1 and just share
something I'm not fully understand now.  I just focused on the execution part and
only 1 WindowAgg node situation right now. 

1. We can do more on PASSTHROUGH, we just bypass the window function
currently,  but IIUC we can ignore all of the following tuples in current partition 
once we go into this mode.  patch 0001 shows what I mean. 

--- without patch 0001,  we need 1653 ms for the below query, with the patch 0001, 
--- we need 629ms.   This is not a serious performance comparison since I
--- build software with -O0 and --enable_cassert.  but it can show some improvement.
postgres=# explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn < 2;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on xy  (cost=0.42..55980.43 rows=5000 width=16) (actual time=0.072..1653.631 rows=1000 loops=1)
   Filter: (xy.rn = 1)
   Rows Removed by Filter: 999000
   ->  WindowAgg  (cost=0.42..43480.43 rows=1000000 width=16) (actual time=0.069..1494.553 rows=1000000 loops=1)
         Run Condition: (row_number() OVER (?) < 2)
         ->  Index Only Scan using xy_x_y_idx on xy xy_1  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.047..330.283 rows=1000000 loops=1)
               Heap Fetches: 0
 Planning Time: 0.240 ms
 Execution Time: 1653.913 ms
(9 rows)


postgres=# explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn < 2;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on xy  (cost=0.42..55980.43 rows=5000 width=16) (actual time=0.103..629.428 rows=1000 loops=1)
   Filter: (xy.rn < 2)
   Rows Removed by Filter: 1000
   ->  WindowAgg  (cost=0.42..43480.43 rows=1000000 width=16) (actual time=0.101..628.821 rows=2000 loops=1)
         Run Condition: (row_number() OVER (?) < 2)
         ->  Index Only Scan using xy_x_y_idx on xy xy_1  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.063..281.715 rows=1000000 loops=1)
               Heap Fetches: 0
 Planning Time: 1.119 ms
 Execution Time: 629.781 ms
(9 rows)

Time: 633.241 ms

 
2. the "Rows Removed by Filter: 1000" is strange to me for the above example. 

 Subquery Scan on xy  (cost=0.42..55980.43 rows=5000 width=16) (actual time=0.103..629.428 rows=1000 loops=1)
   Filter: (xy.rn < 2)
   Rows Removed by Filter: 1000

The root cause is even ExecQual(winstate->runcondition, econtext) return false, we
still return the slot to the upper node.  A simple hack can avoid it.

3.  With the changes in 2,  I think we can avoid the subquery node totally for the above query.

4. If all the above are correct, looks the enum WindowAggStatus addition is not a
must since we can do what WINDOWAGG_PASSTHROUGH does just when we find it is, like
patch 3 shows.  (I leave WINDOWAGG_DONE only, but it can be replaced with
previous all_done field).

Finally, Thanks for the patch, it is a good material to study the knowledge
in this area. 

--
Best Regards
Andy Fan
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Следующее
От: Andy Fan
Дата:
Сообщение: Re: Window Function "Run Conditions"