Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

Поиск
Список
Период
Сортировка
От Thomas Mayer
Тема Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
Дата
Msg-id 534AA430.9020105@student.kit.edu
обсуждение исходный текст
Ответ на Window function optimisation, allow pushdowns of items matching PARTITION BY clauses  (David Rowley <dgrowley@gmail.com>)
Ответы Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses  (David Rowley <dgrowley@gmail.com>)
Список pgsql-hackers
Hello David,

thanks for your work. The results look promising.

What I'm missing is a test case with multiple fields in the partition by 
clauses:

-- should push down, because partid is part of all PARTITION BY clauses
explain analyze select partid,n,m from (  select partid,  count(*) over (partition by partid) n,  count(*) over
(partitionby partid, partid+0) m  from winagg
 
) winagg
where partid=1;

current production 9.3.4 is returning                          QUERY PLAN 


----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SubqueryScan on winagg  (cost=350955.11..420955.11 rows=20 width=20) 
 
(actual time=2564.360..3802.413 rows=20 loops=1)   Filter: (winagg.partid = 1)   Rows Removed by Filter: 1999980   ->
WindowAgg (cost=350955.11..395955.11 rows=2000000 width=4) 
 
(actual time=2564.332..3657.051 rows=2000000 loops=1)         ->  Sort  (cost=350955.11..355955.11 rows=2000000
width=4)
 
(actual time=2564.320..2802.444 rows=2000000 loops=1)               Sort Key: winagg_1.partid, ((winagg_1.partid + 0))
            Sort Method: external sort  Disk: 50840kB               ->  WindowAgg  (cost=0.43..86948.43 rows=2000000 
 
width=4) (actual time=0.084..1335.081 rows=2000000 loops=1)                     ->  Index Only Scan using
winagg_partid_idxon 
 
winagg winagg_1  (cost=0.43..51948.43 rows=2000000 width=4) (actual 
time=0.051..378.232 rows=2000000 loops=1)                           Heap Fetches: 0

"Index Only Scan" currently returns all rows (without pushdown) on 
current production 9.3.4. What happens with the patch you provided?

-- Already Part of your tests:
-- should NOT push down, because partid is NOT part of all PARTITION BY 
clauses
explain analyze select partid,n,m from (  select partid,  count(*) over (partition by partid) n,  count(*) over
(partitionby partid+0) m  from winagg
 
) winagg
where partid=1;

Reordering the fields should also be tested:
-- should push down, because partid is part of all PARTITION BY clauses
-- here: partid at the end
explain analyze select partid,n,m from (  select partid,  count(*) over (partition by partid) n,  count(*) over
(partitionby partid+0, partid) m  from winagg
 
) winagg
where partid=1;

-- should push down, because partid is part of all PARTITION BY clauses
-- here: partid in the middle
explain analyze select partid,n,m from (  select partid,  count(*) over (partition by partid) n,  count(*) over
(partitionby partid+0, partid, partid+1) m  from winagg
 
) winagg
where partid=1;


Best regards
Thomas


Am 13.04.2014 13:32, schrieb David Rowley:
> On this thread
> http://www.postgresql.org/message-id/52C6F712.6040804@student.kit.edu
> there was some discussion around allowing push downs of quals that
> happen to be in every window clause of the sub query. I've quickly put
> together a patch which does this (see attached)
>
> I'm posting this just mainly to let Thomas know that I'm working on it,
> per his request on the other thread.
>
> The patch seems to work with all my test cases, and I've not quite
> gotten around to thinking of any more good cases to throw at it.
>
> Oh and I know that my
> function var_exists_in_all_query_partition_by_clauses has no business in
> allpaths.c, I'll move it out as soon as I find a better home for it.
>
> Here's my test case:
>
> drop table if exists winagg;
>
> create table winagg (
>    id serial not null primary key,
>    partid int not null
> );
>
> insert into winagg (partid) select x.x % 100000 from
> generate_series(1,2000000) x(x);
>
>
> create index winagg_partid_idx on winagg(partid);
>
>
> -- Should push: this should push WHERE partid=1 to the inner query as
> partid is in the only parition by clause in the query.
> explain analyze select partid,n from (select partid,count(*) over
> (partition by partid) n from winagg) winagg where partid=1;
>                                                              QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------
>   WindowAgg  (cost=4.58..82.23 rows=20 width=4) (actual
> time=0.196..0.207 rows=20 loops=1)
>     ->  Bitmap Heap Scan on winagg  (cost=4.58..81.98 rows=20 width=4)
> (actual time=0.102..0.170 rows=20 loops=1)
>           Recheck Cond: (partid = 1)
>           Heap Blocks: exact=20
>           ->  Bitmap Index Scan on winagg_partid_idx  (cost=0.00..4.58
> rows=20 width=0) (actual time=0.084..0.084 rows=20 loops=1)
>                 Index Cond: (partid = 1)
>   Planning time: 0.208 ms
>   Total runtime: 0.276 ms
> (8 rows)
>
> -- Should not push: Added a +0 to partition by clause.
> explain analyze select partid,n from (select partid,count(*) over
> (partition by partid + 0) n from winagg) winagg where partid=1;
>                                                                   QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------
>   Subquery Scan on winagg  (cost=265511.19..330511.19 rows=20 width=12)
> (actual time=2146.642..4257.267 rows=20 loops=1)
>     Filter: (winagg.partid = 1)
>     Rows Removed by Filter: 1999980
>     ->  WindowAgg  (cost=265511.19..305511.19 rows=2000000 width=4)
> (actual time=2146.614..4099.169 rows=2000000 loops=1)
>           ->  Sort  (cost=265511.19..270511.19 rows=2000000 width=4)
> (actual time=2146.587..2994.993 rows=2000000 loops=1)
>                 Sort Key: ((winagg_1.partid + 0))
>                 Sort Method: external merge  Disk: 35136kB
>                 ->  Seq Scan on winagg winagg_1  (cost=0.00..28850.00
> rows=2000000 width=4) (actual time=0.025..418.306 rows=2000000 loops=1)
>   Planning time: 0.249 ms
>   Total runtime: 4263.933 ms
> (10 rows)
>
>
> -- Should not push: Add a window clause (which is not used) that has a
> partition by clause that does not have partid
> explain analyze select partid,n from (select partid,count(*) over
> (partition by partid) n from winagg window stopPushDown as (partition by
> id)) winagg where partid=1;
>
> -- Should not push: 1 window clause does not have partid
> explain analyze select partid,n from (select partid,count(*) over
> (partition by partid) n from winagg window stopPushDown as (order id))
> winagg where partid=1;
>
> -- Should not push: 1 window clause does not have partid
> explain analyze select partid,n from (select partid,count(*) over
> (partition by partid) n from winagg window stopPushDown as ()) winagg
> where partid=1;
>
> As of now the patch is a couple of hours old, I've not even bothered to
> run the regression tests yet, let alone add any new ones.
>
> Comments are welcome...
>
> Regards
>
> David Rowley
>

-- 
======================================
Thomas Mayer
Durlacher Allee 61
D-76131 Karlsruhe
Telefon: +49-721-2081661
Fax:     +49-721-72380001
Mobil:   +49-174-2152332
E-Mail:  thomas.mayer@student.kit.edu
=======================================




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Patch to fix a couple of compiler warnings from 80a5cf64