Обсуждение: JumbleQuery ma treat different GROUP BY expr as the same

Поиск
Список
Период
Сортировка

JumbleQuery ma treat different GROUP BY expr as the same

От
jian he
Дата:
hi.

drop table if exists t;
create table t(a text, b text, c int);
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
explain(costs off, verbose) select count(*) from t group by a;
explain(costs off, verbose) select count(*) from t group by b;
explain(costs off, verbose) select count(*) from t group by c;

SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 calls | rows |                                    query
-------+------+------------------------------------------------------------------------------
     0 |    0 | SELECT calls, rows, query FROM pg_stat_statements
ORDER BY query COLLATE "C"
     1 |    1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
     2 |    0 | explain(costs off, verbose) select count(*) from t group by a
     2 |    0 | explain(costs off, verbose) select count(*) from t group by a;
     1 |    0 | explain(costs off, verbose) select count(*) from t group by c
     1 |    0 | explain(costs off, verbose) select count(*) from t group by c;
(6 rows)

transformSelectStmt->transformGroupClause->transformGroupClauseExpr->addTargetToGroupList
will produce the same SortGroupClause node for "group by a" and "group by b".

JumbleQuery will jumble Query->groupClause, but RangeTblEntry->groupexprs in
Query->rtable is marked with query_jumble_ignore and therefore excluded from
jumbling.

So  "group by a" and "group by" merged into the same entry in
pg_stat_statements,
Is this what we expected?



--
jian
https://www.enterprisedb.com/



Re: JumbleQuery ma treat different GROUP BY expr as the same

От
Tom Lane
Дата:
jian he <jian.universality@gmail.com> writes:
> explain(costs off, verbose) select count(*) from t group by a;
> explain(costs off, verbose) select count(*) from t group by b;
> explain(costs off, verbose) select count(*) from t group by c;

> JumbleQuery will jumble Query->groupClause, but RangeTblEntry->groupexprs in
> Query->rtable is marked with query_jumble_ignore and therefore excluded from
> jumbling.

> So  "group by a" and "group by" merged into the same entry in
> pg_stat_statements,
> Is this what we expected?

It is not what happened before we invented RTE_GROUP.  I tried your
experiment in v14 and got:

regression=# SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 calls | rows |                             query
-------+------+---------------------------------------------------------------
     1 |    1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
     1 |    0 | explain(costs off, verbose) select count(*) from t group by a
     1 |    0 | explain(costs off, verbose) select count(*) from t group by b
     1 |    0 | explain(costs off, verbose) select count(*) from t group by c
(4 rows)

So I'm inclined to think this was an unintentional change of behavior.

            regards, tom lane



Re: JumbleQuery ma treat different GROUP BY expr as the same

От
Michael Paquier
Дата:
On Sat, Jan 10, 2026 at 11:46:27AM -0500, Tom Lane wrote:
> It is not what happened before we invented RTE_GROUP.  I tried your
> experiment in v14 and got:
>
> regression=# SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
>  calls | rows |                             query
> -------+------+---------------------------------------------------------------
>      1 |    1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
>      1 |    0 | explain(costs off, verbose) select count(*) from t group by a
>      1 |    0 | explain(costs off, verbose) select count(*) from t group by b
>      1 |    0 | explain(costs off, verbose) select count(*) from t group by c
> (4 rows)
>
> So I'm inclined to think this was an unintentional change of behavior.

The difference of behavior is between v17 and v18, as an effect of
247dea89f761.  I know that we should not break query ID computations
in stable branches, but v18 is very recent and we still have many
years to support it..  So I would like to suggest that we make an
exception and backpatch a fix to v18.  This behavior is not cool for
users.

This issue also points to a gap in the regression of
pg_stat_statements, where we have never bothered testing patterns of
GROUP BY with the same table and different attributes.  Jian, would
you like to write a patch?
--
Michael

Вложения

Re: JumbleQuery ma treat different GROUP BY expr as the same

От
jian he
Дата:
On Sun, Jan 11, 2026 at 6:13 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> This issue also points to a gap in the regression of
> pg_stat_statements, where we have never bothered testing patterns of
> GROUP BY with the same table and different attributes.  Jian, would
> you like to write a patch?
> --
> Michael

Sure, I will give it a try over the next few days.



Re: JumbleQuery ma treat different GROUP BY expr as the same

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Jan 10, 2026 at 11:46:27AM -0500, Tom Lane wrote:
>> So I'm inclined to think this was an unintentional change of behavior.

> The difference of behavior is between v17 and v18, as an effect of
> 247dea89f761.  I know that we should not break query ID computations
> in stable branches, but v18 is very recent and we still have many
> years to support it..  So I would like to suggest that we make an
> exception and backpatch a fix to v18.  This behavior is not cool for
> users.

I agree we should fix it in v18, but we really need to push to have
the fix in 18.2 rather than delay longer.  (IME, the .2 release is
about when our more risk-averse users start to think about upgrading.)
So there's a time limit on getting this done.

            regards, tom lane