Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)

Поиск
Список
Период
Сортировка
От Richard Guo
Тема Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)
Дата
Msg-id CAMbWs48=zSwZEbqX4-A1jtJqWuPxu8xbkQbu+CdtHg5anM9RVg@mail.gmail.com
обсуждение исходный текст
Ответ на GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)  ("Haowu Ge" <gehaowu@bitmoe.com>)
Ответы Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)
Список pgsql-bugs
On Mon, Nov 17, 2025 at 5:02 PM Haowu Ge <gehaowu@bitmoe.com> wrote:
> # 8. Query the view table to compute the average unit price for material 'PI',using a simple GROUP BY (rollup)
> EXPLAIN ANALYZE
> SELECT material_id, AVG(unit_price) as avg_price
> FROM materials_view
> WHERE material_id = 'PI'
> GROUP BY ROLLUP(material_id);

> GroupAggregate  (cost=0.00..1097.39 rows=251 width=64) (actual time=3.983..3.985 rows=2.00 loops=1)
>   Group Key: materials.material_id
>   Group Key: ()
>   Buffers: shared hit=468
>   ->  Seq Scan on materials  (cost=0.00..1093.00 rows=250 width=48) (actual time=0.018..3.970 rows=1.00 loops=1)
>         Filter: ((material_id)::text = 'PI'::text)
>         Rows Removed by Filter: 49999
>         Buffers: shared hit=468
> Planning Time: 0.117 ms
> Execution Time: 4.029 ms

Thanks for the report.  The reason why this query cannot use an index
scan is that, when the parent query uses grouping sets, all subquery
outputs are wrapped in PlaceHolderVars.  This is to ensure that these
expressions retain their separate identity so that they can match to
the expected grouping set columns (cf. cc5d98525).  However, it also
prevents the WHERE clause from matching the index column, which is why
the index cannot be used.

Prior to v18, we wrapped only the subquery outputs that were non-Var
expressions.  That is why this query was able to use an index scan.
However, that behavior could produce incorrect results when the
subquery's targetlist contained two or more identical Var expressions.
In addition, if the targetlist included any non-Var expressions, we
could also fail to match the WHERE clause to the corresponding index
expressions.

After looking at this query more closely, it seems that there is no
need to wrap the expressions in the WHERE clause in the first place.
It should be sufficient to wrap only the expressions that are used
logically above the aggregation step, ie, those in the targetlist and
in the havingQual.  However, pullup_replace_vars currently lacks the
infrastructure to do that.

Another possible fix is to detect whether the subquery's targetlist
contains two or more identical Var expressions and, if not, fall back
to wrapping only the non-Var expressions.  This would restore the
behavior that existed prior to v18 for this query.

Alternatively, we could teach match_index_to_operand to unwrap PHV
expressions when matching them to indexes.  This would also allow the
WHERE clause to match indexes for non-Var expressions, which is an
improvement over the behavior prior to v18.  But I haven't fully
thought through whether this approach is safe in all cases,
particularly when outer joins are involved.

Any thoughts?

- Richard



В списке pgsql-bugs по дате отправления: