BUG #16533: Planner optimisation : range predicate not propagating to joined tables
От | PG Bug reporting form |
---|---|
Тема | BUG #16533: Planner optimisation : range predicate not propagating to joined tables |
Дата | |
Msg-id | 16533-a10051f0f1d8d72f@postgresql.org обсуждение исходный текст |
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 16533 Logged by: Andrew Barnham Email address: barney@c8software.com.au PostgreSQL version: 12.3 Operating system: Linux Description: Hi Included repeatable SQL of issue below. I am trying to paginate data using limit and value range offset. Ordinarily a query like this: select * from tmp_parent left outer join tmp_child on (tmp_parent.id=tmp_child.id) where tmp_parent.id=8000000; Would copy the predicate (id=8000000) to the child. So resulting plan uses a merge and in the merge like so: -> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..8.44 rows=1 width=10) (actual time=0.023..0.023 rows=0 loops=1) Index Cond: (id = 8000000) -> Index Scan using tmp_b_idx on tmp_child (cost=0.29..8.31 rows=1 width=10) (never executed) Index Cond: (id = 8000000) **** PREDICATE COPIED TO CHILD TABLE ALL OK But if I have query like this: select * from tmp_parent left outer join tmp_child on (tmp_parent.id=tmp_child.id) where tmp_parent.id>=8000000 limit 100; Resulting plan does not apply a index cond to the child table and it merge scans, from start of childs index needlessly reading 80k records -> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..7008.03 rows=206606 width=10) (actual time=0.049..0.077 rows=100 loops=1) Index Cond: (id >= 8000000) -> Index Scan using tmp_b_idx on tmp_child (cost=0.29..3159.89 rows=100373 width=10) (actual time=0.013..28.105 rows=80346 loops=1) If I constrain ID range it is better, it reverts to a nested join although. -> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..10.49 rows=103 width=10) (actual time=0.022..0.065 rows=100 loops=1) Index Cond: ((id >= 8000000) AND (id <= 8001000)) -> Index Scan using tmp_b_idx on tmp_child (cost=0.29..7.38 rows=1 width=10) (actual time=0.002..0.002 rows=0 loops=100) Index Cond: (tmp_parent.id = id) **** PRETTY GOOD BUT NEEDS TO WALK INTO THE INDEX 100 TIMES I can get a great result by moving the predicate into the join. But this isn't really desirable because in my real world query I am trying to abstract a very complex query behind a view, but has performance issue that boil down to this repeatable example, of propagation of the predicate select * from tmp_parent left outer join tmp_child on (tmp_parent.id=tmp_child.id and tmp_child.id>=8000000) where tmp_parent.id>=8000000 order by tmp_parent.id limit 100; Merge Cond: (tmp_parent.id = tmp_child.id) -> Index Scan using tmp_a_idx on tmp_parent (cost=0.42..7008.03 rows=206606 width=10) (actual time=0.019..0.079 rows=100 loops=1) Index Cond: (id >= 8000000) -> Index Scan using tmp_b_idx on tmp_child (cost=0.29..686.24 rows=19997 width=10) (actual time=0.018..0.022 rows=14 loops=1) Index Cond: (id >= 8000000) **** IDEAL QUERY PLAN With thanks Andrew ======= create temporary table tmp_parent as select generate_series(1,1000000)*10+floor(random()*10)::int as "id",'Text'||((random()*100)::integer) as "description"; create temporary table tmp_child as select * from tmp_parent where random()<0.1; create unique index tmp_a_idx on tmp_parent (id); create unique index tmp_b_idx on tmp_child (id); analyze tmp_parent; analyze tmp_child; explain analyze select * from tmp_parent left outer join tmp_child on (tmp_parent.id=tmp_child.id) where tmp_parent.id>=8000000 order by tmp_parent.id limit 100; explain analyze select * from tmp_parent left outer join tmp_child on (tmp_parent.id=tmp_child.id and tmp_child.id>=8000000) where tmp_parent.id>=8000000 order by tmp_parent.id limit 100;
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Michael PaquierДата:
Сообщение: Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows