Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)
| От | Richard Guo |
|---|---|
| Тема | Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized) |
| Дата | |
| Msg-id | CAMbWs48=zSwZEbqX4-A1jtJqWuPxu8xbkQbu+CdtHg5anM9RVg@mail.gmail.com обсуждение исходный текст |
| Ответ на | GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized) ("Haowu Ge" <gehaowu@bitmoe.com>) |
| Ответы |
Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)
|
| Список | pgsql-bugs |
On Mon, Nov 17, 2025 at 5:02 PM Haowu Ge <gehaowu@bitmoe.com> wrote: > # 8. Query the view table to compute the average unit price for material 'PI',using a simple GROUP BY (rollup) > EXPLAIN ANALYZE > SELECT material_id, AVG(unit_price) as avg_price > FROM materials_view > WHERE material_id = 'PI' > GROUP BY ROLLUP(material_id); > GroupAggregate (cost=0.00..1097.39 rows=251 width=64) (actual time=3.983..3.985 rows=2.00 loops=1) > Group Key: materials.material_id > Group Key: () > Buffers: shared hit=468 > -> Seq Scan on materials (cost=0.00..1093.00 rows=250 width=48) (actual time=0.018..3.970 rows=1.00 loops=1) > Filter: ((material_id)::text = 'PI'::text) > Rows Removed by Filter: 49999 > Buffers: shared hit=468 > Planning Time: 0.117 ms > Execution Time: 4.029 ms Thanks for the report. The reason why this query cannot use an index scan is that, when the parent query uses grouping sets, all subquery outputs are wrapped in PlaceHolderVars. This is to ensure that these expressions retain their separate identity so that they can match to the expected grouping set columns (cf. cc5d98525). However, it also prevents the WHERE clause from matching the index column, which is why the index cannot be used. Prior to v18, we wrapped only the subquery outputs that were non-Var expressions. That is why this query was able to use an index scan. However, that behavior could produce incorrect results when the subquery's targetlist contained two or more identical Var expressions. In addition, if the targetlist included any non-Var expressions, we could also fail to match the WHERE clause to the corresponding index expressions. After looking at this query more closely, it seems that there is no need to wrap the expressions in the WHERE clause in the first place. It should be sufficient to wrap only the expressions that are used logically above the aggregation step, ie, those in the targetlist and in the havingQual. However, pullup_replace_vars currently lacks the infrastructure to do that. Another possible fix is to detect whether the subquery's targetlist contains two or more identical Var expressions and, if not, fall back to wrapping only the non-Var expressions. This would restore the behavior that existed prior to v18 for this query. Alternatively, we could teach match_index_to_operand to unwrap PHV expressions when matching them to indexes. This would also allow the WHERE clause to match indexes for non-Var expressions, which is an improvement over the behavior prior to v18. But I haven't fully thought through whether this approach is safe in all cases, particularly when outer joins are involved. Any thoughts? - Richard
В списке pgsql-bugs по дате отправления: