No optimization with a partition window in a view
| От | Pierre |
|---|---|
| Тема | No optimization with a partition window in a view |
| Дата | |
| Msg-id | 6539554.0HoVISsQxY@peanuts2 обсуждение исходный текст |
| Ответы |
Re: No optimization with a partition window in a view
|
| Список | pgsql-bugs |
Hello
I tried to avoid implementing window functions inside my ORM by using a=
view,=20
but it seems the optimizer is missing an obvious optimization and thus =
doing a=20
full table scan.
Affected versions : 9.2.4 and 9.3.2 (9.4 not tested yet)
How to reproduce :
=3D> create table test_history (i serial, piece integer not null, date =
timestamp=20
with time zone default now(), location integer not null);
=3D> create table test_history (i serial, piece integer not null, date =
timestamp=20
with time zone default now(), location integer not null);
CREATE TABLE
=3D> insert into test_history (piece, location) select i, 1 from=20
generate_series(1, 1000000) i;
INSERT 0 1000000
=3D> insert into test_history (piece, location) select i, 2 from=20
generate_series(1, 1000000) i;
INSERT 0 1000000
=3D> insert into test_history (piece, location) select i, 3 from=20
generate_series(1, 1000000) i;
^[[AINSERT 0 1000000
=3D> insert into test_history (piece, location) select i, 4 from=20
generate_series(1, 1000000, 2) i;
INSERT 0 500000
=3D> alter table test_history add primary key(i);
ALTER TABLE
=3D> create index on test_history(piece);
CREATE INDEX
=3D> select * from test_history where piece =3D 42;
i | piece | date | location=20
=2D--------+-------+-------------------------------+----------
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3
(3 rows)
Time: 0.158 ms
=3D> select *, lag(location, 1) over w, lead(location, 1) over w from t=
est_history=20
where piece =3D 42 window w as (partition by piece order by date);
i | piece | date | location | lag | lea=
d=20
=2D--------+-------+-------------------------------+----------+-----+----=
=2D-
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1 | | =
2
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2 | 1 | =
3
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3 | 2 | =
=20
(3 rows)
Time: 0.203 ms
=3D> create view test_history_lag_lead as select *, lag(location, 1) ov=
er w,=20
lead(location, 1) over w from test_history window w as (partition by pi=
ece order=20
by date);
CREATE VIEW
=3D> select * from test_history_lag_lead where piece =3D 42;
i | piece | date | location | lag | lea=
d=20
=2D--------+-------+-------------------------------+----------+-----+----=
=2D-
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1 | | =
2
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2 | 1 | =
3
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3 | 2 | =
=20
(3 rows)
Time: 2915.756 ms
=3D> explain analyze select * from test_history_lag_lead where piece =3D=
42;
QUERY P=
LAN =20
=2D----------------------------------------------------------------------=
=2D-------------------------------------------------------------------
Subquery Scan on test_history_lag_lead (cost=3D653058.16..775558.16 r=
ows=3D4=20
width=3D28) (actual time=3D1411.423..3375.794 rows=3D3 loops=3D1)
Filter: (test_history_lag_lead.piece =3D 42)
Rows Removed by Filter: 3499997
-> WindowAgg (cost=3D653058.16..731808.16 rows=3D3500000 width=3D2=
0) (actual=20
time=3D1411.343..3206.959 rows=3D3500000 loops=3D1)
-> Sort (cost=3D653058.16..661808.16 rows=3D3500000 width=3D=
20) (actual=20
time=3D1411.337..1867.895 rows=3D3500000 loops=3D1)
Sort Key: test_history.piece, test_history.date
Sort Method: external merge Disk: 116328kB
-> Seq Scan on test_history (cost=3D0.00..57293.00 row=
s=3D3500000=20
width=3D20) (actual time=3D0.004..310.558 rows=3D3500000 loops=3D1)
Total runtime: 3386.455 ms
(9 rows)
=3D> explain analyze select *, lag(location, 1) over w, lead(location, =
1) over w=20
from=20test_history where piece =3D 42 window w as (partition by piece or=
der by=20
date);
QUE=
RY PLAN =
=20
=2D----------------------------------------------------------------------=
=2D----------------------------------------------------------------------=
=2D----
WindowAgg (cost=3D19.70..19.78 rows=3D4 width=3D20) (actual time=3D0.=
018..0.019 rows=3D3=20
loops=3D1)
-> Sort (cost=3D19.70..19.71 rows=3D4 width=3D20) (actual time=3D0=
.015..0.015=20
rows=3D3 loops=3D1)
Sort Key: date
Sort Method: quicksort Memory: 25kB
-> Index Scan using test_history_piece_idx on test_history =20=
(cost=3D0.43..19.66 rows=3D4 width=3D20) (actual time=3D0.009..0.010 ro=
ws=3D3 loops=3D1)
Index Cond: (piece =3D 42)
Total runtime: 0.037 ms
(7 rows)
As you can see, the optimizer decided to do the WindowAgg first on the =
3,5M=20
lines before filtering, instead of filtering using an Index scan and th=
en doing=20
the aggregation.
IMHO, that optimization would be a special case possible only with wind=
ows on a=20
partition that is contained in the filtering fields, but I aint no expe=
rt here=20
:)
Thanks
Pierre Ducroquet
В списке pgsql-bugs по дате отправления: