Обсуждение: Re: support fix query_id for temp table
Re: Michael Paquier > On Thu, Feb 01, 2024 at 07:37:32AM +0000, ma lz wrote: > > session 1: > > create temp table ttt ( a int ); > > insert into ttt values(3); -- query_id is XXX from pg_stat_activity > > > > session 2: > > create temp table ttt ( a int ); > > insert into ttt values(3); -- query_id is YYY from pg_stat_activity > > > > I know temp table has different oid, so query_id is different, is > > there a way to use table name for temp table instead of oid? > > The CREATE TABLE statements have indeed the same query ID (in 16~), > and the inserts have a different one as they touch different schemas > and relations. That's quite an old problem, that depends on the > RangeVar attached to an InsertStmt. I don't quite see a way to > directly handle that except by using a custom implementation in query > jumbling for this node and its RangeVar, so there is no "easy" way to > tackle that :/ A customer reported that pg_stat_statements is not useful for them because they are seeing 160k different query ids in 6-8 hours. They also proposed to use the temp table name for query jumbling and wrote a patch for it, which I would also see as the obvious solution to the problem. Here's that patch with regression tests added. I would think changing this would be a big usability improvement for anyone using temp tables a lot. There does not seem to be a performance impact - all test were run with pg_stat_statements active: Standard pgbench -S (-s 10): without patch: tps = 154155.407337 (without initial connection time) with patch: tps = 154223.966534 (without initial connection time) pgbench -S on temp tables where each table has just one record: without patch: tps = 184430.801954 (without initial connection time) with patch: tps = 185692.602764 (without initial connection time) Christoph
Вложения
On Mon, Mar 17, 2025 at 10:38:36PM +0100, Christoph Berg wrote:
> Here's that patch with regression tests added. I would think changing
> this would be a big usability improvement for anyone using temp tables
> a lot.
Not the first time I am seeing this argument this month.  It is the
second time.
+    /*
+     * If this is a temp table, jumble the name instead of the table oid.
+     */
+    if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+    {
+        rel_name = get_rel_name(expr->relid);
+        AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
+    }
+    else
+        JUMBLE_FIELD(relid);
This is OK on its own, still feels a bit incomplete, as the relid also
includes an assumption about the namespace.  I would suggested to add
a hardcoded "pg_temp" here, to keep track of this assumption, at
least.
 typedef struct RangeTblEntry
 {
-    pg_node_attr(custom_read_write)
+    pg_node_attr(custom_read_write, custom_query_jumble)
This structure still includes some query_jumble_ignore, which are not
required once custom_query_jumble is added.
We had better document at the top of RangeTblEntry why we are using a
custom function.
--
Michael
			
		Вложения
Re: Michael Paquier
> This is OK on its own, still feels a bit incomplete, as the relid also
> includes an assumption about the namespace.  I would suggested to add
> a hardcoded "pg_temp" here, to keep track of this assumption, at
> least.
I had thought about it, but figured that integers and strings are
already separate namespaces, so hashing them shouldn't have any
conflicts. But it's more clear to do that, so added in the new
version:
       AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
       AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
>  typedef struct RangeTblEntry
>  {
> -    pg_node_attr(custom_read_write)
> +    pg_node_attr(custom_read_write, custom_query_jumble)
> 
> This structure still includes some query_jumble_ignore, which are not
> required once custom_query_jumble is added.
I would tend to keep them for documentation purposes. (The other
custom_query_jumble functions have a much more explicit structure so
there it is clear which fields are supposed to be jumbled.)
> We had better document at the top of RangeTblEntry why we are using a
> custom function.
I added a short comment just above custom_query_jumble.
Christoph
			
		Вложения
On Tue, Mar 18, 2025 at 05:51:54PM +0100, Christoph Berg wrote:
> I had thought about it, but figured that integers and strings are
> already separate namespaces, so hashing them shouldn't have any
> conflicts. But it's more clear to do that, so added in the new
> version:
>
>        AppendJumble(jstate, (const unsigned char *)"pg_temp", sizeof("pg_temp"));
>        AppendJumble(jstate, (const unsigned char *)rel_name, strlen(rel_name));
Yes, I know that it's not a big deal, but it could be confusing if
somebody makes an argument about jumbling more object names for
relations.  The problem is not only limited to relations, though, as
there are other object types that can use a temp namespace like
functions, but the case of table entries should cover most of the
complaints I can imagine.
> >  typedef struct RangeTblEntry
> >  {
> > -    pg_node_attr(custom_read_write)
> > +    pg_node_attr(custom_read_write, custom_query_jumble)
> >
> > This structure still includes some query_jumble_ignore, which are not
> > required once custom_query_jumble is added.
>
> I would tend to keep them for documentation purposes. (The other
> custom_query_jumble functions have a much more explicit structure so
> there it is clear which fields are supposed to be jumbled.)
Fine by me as well to keep a dependency based on the fact that the
structure is rather complicated, but I'd rather document that as well
in parsenodes.h with a slightly fatter comment.  What do you think?
--
Michael
			
		Вложения
Re: Michael Paquier > Fine by me as well to keep a dependency based on the fact that the > structure is rather complicated, but I'd rather document that as well > in parsenodes.h with a slightly fatter comment. What do you think? You are of course right, that one-line comment was just snakeoil :D. Now there are proper ones, thanks. Christoph
Вложения
On Wed, Mar 19, 2025 at 01:02:54PM +0100, Christoph Berg wrote:
> You are of course right, that one-line comment was just snakeoil :D.
> Now there are proper ones, thanks.
I have been thinking about this patch for a couple of days.  What
makes me unhappy in this proposal is that RangeTblEntry is a large and
complicated Node, and we only want to force a custom computation for
the "relid" portion of the node, while being able to also take some
decisions for what the parent node has.  Leaving the existing
per-field query_jumble_ignore with the custom function is prone to
errors, as well, because we need to maintain some level of consistency
between parsenodes.h and src/backend/nodes/.
Hence here is a counter-proposal, that can bring the best of both
worlds: let's add support for custom_query_jumble at field level.
I've spent some time on that, and some concatenation in a macro used
by gen_node_support.pl to force a policy for the custom function name
and its arguments is proving to be rather straight-forward.  This
approach addresses the problem of this thread, while also tackling my
concerns about complex node structures.
The custom functions are named _jumble${node}_${field}, with the field
and the parent node given as arguments.  I agree that giving the field
is kind of pointless if you have the parent node, but I think that
this is better so as this forces developers to think about how to use
the field value with the node.
Please see the attached.  What do you think?
--
Michael
			
		Вложения
Re: Michael Paquier
> I have been thinking about this patch for a couple of days.  What
> makes me unhappy in this proposal is that RangeTblEntry is a large and
> complicated Node, and we only want to force a custom computation for
> the "relid" portion of the node, while being able to also take some
> decisions for what the parent node has.  Leaving the existing
> per-field query_jumble_ignore with the custom function is prone to
> errors, as well, because we need to maintain some level of consistency
> between parsenodes.h and src/backend/nodes/.
Ack, that was also bothering me, but I didn't think it was so easy to
do it on a per-field level. Thanks!
> The custom functions are named _jumble${node}_${field}, with the field
> and the parent node given as arguments.  I agree that giving the field
> is kind of pointless if you have the parent node, but I think that
> this is better so as this forces developers to think about how to use
> the field value with the node.
Makes sense.
> Please see the attached.  What do you think?
Just one minor thing, I don't understand what you are trying to say in
this comment:
> +/*
> + * Note that the argument types are enforced for the per-field custom
> + * functions.
> + */
> +#define JUMBLE_CUSTOM(nodetype, item) \
> +    _jumble##nodetype##_##item(jstate, expr, expr->item)
Thanks,
Christoph
			
		On Fri, Mar 21, 2025 at 05:26:20PM +0100, Christoph Berg wrote:
> Just one minor thing, I don't understand what you are trying to say in
> this comment:
>
>> +/*
>> + * Note that the argument types are enforced for the per-field custom
>> + * functions.
>> + */
>> +#define JUMBLE_CUSTOM(nodetype, item) \
>> +    _jumble##nodetype##_##item(jstate, expr, expr->item)
In this one, I want to mean that we require a custom per-field
function to look like that:
_jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field);
Rather than having more generic shape like that:
_jumbleNodefoo_field(JumbleState *jstate, Node *exp,
                     const unsigned char *item, Size size);
So a custom function is defined so as the node type and field type are
arguments.  Perhaps this comment would be better if reworded like
that:
"The arguments of this function use the node type and the field type,
rather than a generic argument like AppendJumble() and the other
_jumble() functions."
If you have a better idea, please feel free..
--
Michael
			
		Вложения
Re: Michael Paquier > >> + * Note that the argument types are enforced for the per-field custom > >> + * functions. > >> + */ > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > >> + _jumble##nodetype##_##item(jstate, expr, expr->item) > > In this one, I want to mean that we require a custom per-field > function to look like that: > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); > > Rather than having more generic shape like that: > _jumbleNodefoo_field(JumbleState *jstate, Node *exp, > const unsigned char *item, Size size); > > So a custom function is defined so as the node type and field type are > arguments. Perhaps this comment would be better if reworded like > that: > "The arguments of this function use the node type and the field type, > rather than a generic argument like AppendJumble() and the other > _jumble() functions." Perhaps this: /* * The per-field custom jumble functions get jstate, the node, and the * field as arguments. */ They are not actually different from _jumbleList and _jumbleA_Const which also get the node (and just not the field). AppendJumble is a different layer, the output, so it's not surprising its signature is different. Are we at the point where the patch is already Ready for Committer? Thanks, Christoph
Re: To Michael Paquier > > >> +#define JUMBLE_CUSTOM(nodetype, item) \ > > >> + _jumble##nodetype##_##item(jstate, expr, expr->item) > > > > In this one, I want to mean that we require a custom per-field > > function to look like that: > > _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); > > Perhaps this: Or actually more explicit: /* * Per-field custom jumble functions have this signature: * _jumbleNodefoo_field(JumbleState *jstate, NodeFoo *expr, FieldType field); */ Christoph
On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote: > Are we at the point where the patch is already Ready for Committer? I'll think a bit more about how to document all that. Anyway, yes, I'm OK with the per-field custom_query_jumble, so let's move on with that, so I will do something about that. -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes:
> On Sat, Mar 22, 2025 at 10:43:00AM +0100, Christoph Berg wrote:
>> Are we at the point where the patch is already Ready for Committer?
> I'll think a bit more about how to document all that.  Anyway, yes,
> I'm OK with the per-field custom_query_jumble, so let's move on with
> that, so I will do something about that.
I'm not terribly happy with the entire proposal.
(1) I know it was asserted upthread that there was no performance
impact, but I find that hard to believe.
(2) This patch inserts catalog lookups into query ID computation,
which AFAIK there never were before.  This means you can't compute a
query ID outside a transaction or in an already-failed transaction.
Surely that's going to bite us eventually.
(3) I think having query jumbling work differently for temp tables
than other tables is a fundamentally bad idea.
So my feeling is: if we think this is the behavior we want, let's do
it across the board.  I suggest that we simply drop the relid from the
jumble and use the table alias (that is, eref->aliasname) instead.
ISTM this fits well with the general trend in pg_stat_statements
to merge statements together more aggressively than the original
concept envisioned.
            regards, tom lane
			
		I wrote:
> So my feeling is: if we think this is the behavior we want, let's do
> it across the board.  I suggest that we simply drop the relid from the
> jumble and use the table alias (that is, eref->aliasname) instead.
I experimented with this trivial fix (shown in-line to keep the cfbot
from thinking this is the patch-of-record):
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..a54bbdc18b7 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1051,7 +1051,7 @@ typedef struct RangeTblEntry
     /* user-written alias clause, if any */
     Alias       *alias pg_node_attr(query_jumble_ignore);
     /* expanded reference names */
-    Alias       *eref pg_node_attr(query_jumble_ignore);
+    Alias       *eref;
     RTEKind        rtekind;        /* see above */
@@ -1094,7 +1094,7 @@ typedef struct RangeTblEntry
      * tables to be invalidated if the underlying table is altered.
      */
     /* OID of the relation */
-    Oid            relid;
+    Oid            relid pg_node_attr(query_jumble_ignore);
     /* inheritance requested? */
     bool        inh;
     /* relation kind (see pg_class.relkind) */
This caused just one diff in existing regression test cases:
diff --git a/contrib/pg_stat_statements/expected/planning.out b/contrib/pg_stat_statements/expected/planning.out
index 3ee1928cbe9..c25b8b946fd 100644
--- a/contrib/pg_stat_statements/expected/planning.out
+++ b/contrib/pg_stat_statements/expected/planning.out
@@ -75,8 +75,9 @@ SELECT plans >= 2 AND plans <= calls AS plans_ok, calls, rows, query FROM pg_sta
   WHERE query LIKE 'SELECT COUNT%' ORDER BY query COLLATE "C";
  plans_ok | calls | rows |                query
 ----------+-------+------+--------------------------------------
- t        |     4 |    4 | SELECT COUNT(*) FROM stats_plan_test
-(1 row)
+ f        |     1 |    1 | SELECT COUNT(*) FROM stats_plan_test
+ f        |     3 |    3 | SELECT COUNT(*) FROM stats_plan_test
+(2 rows)
 -- Cleanup
 DROP TABLE stats_plan_test;
What's happening there is that there's an ALTER TABLE ADD COLUMN in
the test, so the executions after the first one see more entries
in eref->colnames and come up with a different jumble.  I think
we probably don't want that behavior; we only want to jumble the
table name.  So we'd still need the v3-0001 patch in some form to
allow annotating RangeTblEntry.eref with a custom jumble method
that'd only jumble the aliasname.
            regards, tom lane
			
		On Sat, Mar 22, 2025 at 12:24:43PM -0400, Tom Lane wrote:
> I experimented with this trivial fix (shown in-line to keep the cfbot
> from thinking this is the patch-of-record):
>
> What's happening there is that there's an ALTER TABLE ADD COLUMN in
> the test, so the executions after the first one see more entries
> in eref->colnames and come up with a different jumble.  I think
> we probably don't want that behavior; we only want to jumble the
> table name.  So we'd still need the v3-0001 patch in some form to
> allow annotating RangeTblEntry.eref with a custom jumble method
> that'd only jumble the aliasname.
Alias.aliasname is not qualified, so it means that we'd begin to
assign the same query ID even if using two relations from two schemas
depending on what search_path assigns, no?  Say:
create schema popo1;
create schema popo2;
create table popo1.aa (a int, b int);
create table popo2.aa (a int, b int);
set search_path = 'popo1';
select count(*) from aa;
set search_path = 'popo2';
select count(*) from aa;
=# select query, calls from pg_stat_statements where
     query ~ 'select count';
          query          | calls
-------------------------+-------
 select count(*) from aa |     2
(1 row)
Perhaps that's OK because such queries use the same query string, but
just silencing the relid means that we'd lose the namespace reference
entirely, making the stats potentially fuzzier depending on the
workload.  On HEAD, one can guess the query ID with an EXPLAIN and a
search_path, as well, so currently it's possible to cross-check the
contents of pgss.  But we'd lose this possibility here..
--
Michael
			
		Вложения
Michael Paquier <michael@paquier.xyz> writes:
> Alias.aliasname is not qualified, so it means that we'd begin to
> assign the same query ID even if using two relations from two schemas
> depending on what search_path assigns, no?
Right.  I'm arguing that that's good.  The proposed patch already
obscures the difference between similar table names in different
(temp) schemas, and I'm suggesting that taking that a bit further
would be fine.
Note that if the tables we're considering don't have identical
rowtypes, the queries would likely jumble differently anyway
due to differences in Vars' varattno and vartype.
            regards, tom lane
			
		On Sat, Mar 22, 2025 at 09:04:19PM -0400, Tom Lane wrote: > Right. I'm arguing that that's good. The proposed patch already > obscures the difference between similar table names in different > (temp) schemas, and I'm suggesting that taking that a bit further > would be fine. > > Note that if the tables we're considering don't have identical > rowtypes, the queries would likely jumble differently anyway > due to differences in Vars' varattno and vartype. Not for the types AFAIK, the varattnos count in, but perhaps for the same argument as previously it's just kind of OK? Please see the tests in the attached about that. I've spent a few hours looking at the diffs of a pgss dump before and after the fact. The reduction in the number of entries seem to come mainly from tests where we do a successive creates and drops of the same table name. There are quite a few of them for updatable views, but it's not the only one. A good chunk comes from tables with simpler and rather generic names. So your idea to use the relation name in eref while skipping the column list looks kind of promising. Per se the attached. Thoughts? -- Michael
Вложения
Re: Michael Paquier > So your idea to use the relation name in eref while skipping the > column list looks kind of promising. Per se the attached. Thoughts? Makes sense to me, thanks for digging into it. > +++ b/src/backend/nodes/queryjumblefuncs.c > @@ -33,6 +33,7 @@ > #include "postgres.h" > > #include "access/transam.h" > +#include "catalog/namespace.h" No longer needed. > +++ b/contrib/pg_stat_statements/sql/select.sql > +SET search_path = 'pgss_schema_1'; > +SELECT count(*) FROM tab_search_same; > +SELECT a, b FROM tab_search_same; > +SELECT count(*) FROM tab_search_diff_1; > +SELECT count(*) FROM tab_search_diff_2; > +SELECT a FROM tab_search_diff_2; > +SET search_path = 'pgss_schema_1'; Should this be pgss_schema_2 ? Christoph
> So your idea to use the relation name in eref while skipping the
> column list looks kind of promising.  Per se the attached.  Thoughts?
I feel really uneasy about this behavior becoming the default.
I can bet there are some users which run common queries across
different schemas ( e.g. multi-tenancy ) will consider this behavior a
regression
in pg_stat_statements as now all their common queries have been merged
into a single entry.
For example, I have seen users add comments to SQLs to differentiate
similar SQLs coming from different tenants. This patch makes this no longer a
somewhat decent workaround  to overcome the fact that pg_stat_statements
does not track schemas or search path.
```
select pg_stat_statements_reset();
set search_path = s1;
select /*+ user s1 */ * from foo;
set search_path = s2;
select /*+ user s2 */  * from foo;
reset search_path;
select userid, queryid, query, calls from public.pg_stat_statements;
test=# select userid, queryid, query, calls from public.pg_stat_statements;
 userid |       queryid        |               query               | calls
--------+----------------------+-----------------------------------+-------
     10 |  1788423388555345932 | select /*+ user s1 */ * from foo  |     2
     10 | -8935568138104064674 | select pg_stat_statements_reset() |     1
     10 | -8663970364987885379 | set search_path = $1              |     2
     10 | -6563543739552933350 | reset search_path                 |     1
(4 rows)
```
--
Sami Imseih
Amazon Web Services (AWS)
			
		Sami Imseih <samimseih@gmail.com> writes:
> For example, I have seen users add comments to SQLs to differentiate
> similar SQLs coming from different tenants. This patch makes this no longer a
> somewhat decent workaround  to overcome the fact that pg_stat_statements
> does not track schemas or search path.
Well, the workaround is different, but that doesn't mean there is no
workaround.  You just need to alter a table alias in the query:
select * from foo s1;
select * from foo s2;
As against this, there is probably also a set of people who would
*like* identical queries on identically-rowtyped tables in different
schemas to be merged.  Right now they have no way to make that happen.
So yeah, it's a nontrivial behavioral change.  But I think on the
whole it's likely to be more useful.  We could always revert the
change during beta if we get too much pushback.
            regards, tom lane
			
		> select * from foo s1; > select * from foo s2; ah, thanks for pointing this out. Not as good of a workaround as a comment since one must change aliases, but at least there is a workaround... > As against this, there is probably also a set of people who would > *like* identical queries on identically-rowtyped tables in different > schemas to be merged. Right now they have no way to make that happen. I agree that some may want stats to merge for the same tables, and others may not. I will suggest this with some hesitation, but why not make this behavior configurable via a GUC? We recently introduced query_id_squash_values for controlling the merge of an IN list, maybe this is another queryId behavior we should provide a configuration for? -- Sami Imseih Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
> I agree that some may want stats to merge for the same tables,
> and others may not. I will suggest this with some hesitation, but why not
> make this behavior configurable via a GUC?
> We recently introduced query_id_squash_values for controlling
> the merge of an IN list, maybe this is another queryId behavior we should
> provide a configuration for?
I don't like that GUC and I would not like this one either.  We
learned years ago that GUCs that change query semantics are a bad
idea, but apparently now we have hackers who need to relearn that
lesson the hard way.  (Admittedly, this isn't quite *query* semantics,
which perhaps lessens the blast radius.  But I think we're still going
to regret query_id_squash_values.)
            regards, tom lane
			
		> Sami Imseih <samimseih@gmail.com> writes: > > I agree that some may want stats to merge for the same tables, > > and others may not. I will suggest this with some hesitation, but why not > > make this behavior configurable via a GUC? > > We recently introduced query_id_squash_values for controlling > > the merge of an IN list, maybe this is another queryId behavior we should > > provide a configuration for? > > I don't like that GUC and I would not like this one either. We > learned years ago that GUCs that change query semantics are a bad > idea, but apparently now we have hackers who need to relearn that > lesson the hard way. (Admittedly, this isn't quite *query* semantics, > which perhaps lessens the blast radius. But I think we're still going > to regret query_id_squash_values.) query_id_squash_values has a much weaker argument to exist than a guc to control the use of alias vs OID. Why would anyone not want to squash the IN list? maybe we should revisit this decision in that thread. With that said, if everyone is OK with the behavior change of pg_s_s with this patch, I will concede. FWIW, I do think this for most cases, the proposed behavior is desirable. I just worry about the users who rely on the OID being jumbled behavior, and have no way to revery. -- Sami Imseih Amazon Web Services (AWS)
On Mon, Mar 24, 2025 at 04:41:35PM +0100, Christoph Berg wrote: > Re: Michael Paquier >> +++ b/src/backend/nodes/queryjumblefuncs.c >> @@ -33,6 +33,7 @@ >> #include "postgres.h" >> >> #include "access/transam.h" >> +#include "catalog/namespace.h" > > No longer needed. Indeed. >> +++ b/contrib/pg_stat_statements/sql/select.sql >> +SET search_path = 'pgss_schema_1'; >> +SELECT count(*) FROM tab_search_same; >> +SELECT a, b FROM tab_search_same; >> +SELECT count(*) FROM tab_search_diff_1; >> +SELECT count(*) FROM tab_search_diff_2; >> +SELECT a FROM tab_search_diff_2; >> +SET search_path = 'pgss_schema_1'; > > Should this be pgss_schema_2 ? Oops. Nice copy-paste. -- Michael
Вложения
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote: > > select * from foo s1; > > select * from foo s2; > > ah, thanks for pointing this out. Not as good of a workaround as > a comment since one must change aliases, but at least there is > a workaround... Exactly. Like Tom I'm not really worried about the proposal, but of course I could prove to be wrong. I am ready to assume that bloat in pgss entries caused by temp tables is a more common case. So let's move on with this proposal, before the gong. -- Michael
Вложения
On Mon, Mar 24, 2025 at 10:30:59PM -0500, Sami Imseih wrote: >> Sami Imseih <samimseih@gmail.com> writes: >>> I agree that some may want stats to merge for the same tables, >>> and others may not. I will suggest this with some hesitation, but why not >>> make this behavior configurable via a GUC? >>> We recently introduced query_id_squash_values for controlling >>> the merge of an IN list, maybe this is another queryId behavior we should >>> provide a configuration for? >> >> I don't like that GUC and I would not like this one either. We >> learned years ago that GUCs that change query semantics are a bad >> idea, but apparently now we have hackers who need to relearn that >> lesson the hard way. (Admittedly, this isn't quite *query* semantics, >> which perhaps lessens the blast radius. But I think we're still going >> to regret query_id_squash_values.) > > query_id_squash_values has a much weaker argument to exist than a > guc to control the use of alias vs OID. Why would anyone not want > to squash the IN list? maybe we should revisit this decision in that thread. This part of the thread is digressing, but I'd on the side of removing entirely the GUC and make the grouping of IN values the default. We still have time to discuss that during the beta cycle, so let's do so on its related thread. -- Michael
Вложения
> Exactly. Like Tom I'm not really worried about the proposal, but of > course I could prove to be wrong. I am ready to assume that bloat in > pgss entries caused by temp tables is a more common case. I ran installcheck-parallel with the patch and without the patch 3x, and the reduction in pg_s_s bloat is visible. The same entries are reused even when the regression database is recreated, as expected. ## without patch run installcheck-parallel 3 times postgres=# select count(distinct queryid), count(*) from pg_stat_statements; count | count -------+------- 41630 | 74776 (1 row) ## with patch run installcheck-parallel 3 times postgres=# select count(distinct queryid), count(*) from pg_stat_statements; count | count -------+------- 26301 | 73030 (1 row) >This part of the thread is digressing, but I'd on the side of removing > entirely the GUC and make the grouping of IN values the default. We > still have time to discuss that during the beta cycle, so let's do so > on its related thread. I will do that tomorrow in that thread. It would not make sense to introduce a GUC for that behavior and not for this. So, I am glad we discussed this here. Looking at the patches, I could not find anything else besides what was pointed out by Christoph earlier. -- Sami Imseih Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
>> I don't like that GUC and I would not like this one either.  We
>> learned years ago that GUCs that change query semantics are a bad
>> idea, but apparently now we have hackers who need to relearn that
>> lesson the hard way.
> query_id_squash_values has a much weaker argument to exist than a
> guc to control the use of alias vs OID. Why would anyone not want
> to squash the IN list? maybe we should revisit this decision in that thread.
I'm not opining one way or the other on whether squashing an IN list
is desirable.  My point is that a GUC is a poor way to make --- or
really, avoid making --- such decisions.  The reasons we took away
from previous experiments with semantics-determing GUCs were:
1. The scope of effect of a GUC is seldom what you want for such
things.  There are going to be some queries in which you want behavior
A, and some in which you want behavior B, and in the worst case you
want different behaviors in different parts of the same query.  It's
really painful to make that happen.
2. Tools that are not entitled to set the value of the GUC are forced
to be prepared to cope with any setting.  That can be anywhere from
painful to impossible.
For the specific context of controlling how query jumbling happens,
there's still another problem: the actual statement-merging behavior of
pg_stat_statements will depend on the totality of settings of the GUC
across the entire system.  It's not very clear to me what will happen
if different sessions use different settings, much less if people
change the setting intra-session; but I bet a lot of people will find
it surprising.  62d712ecf did no one any favors by marking that GUC
USERSET rather than something that would prevail system-wide.
All of these remarks apply with equal force to anything that changes
the behavior of query-jumbling, no matter the specific details.
            regards, tom lane
			
		Hi, On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote: > > I'm not opining one way or the other on whether squashing an IN list > is desirable. My point is that a GUC is a poor way to make --- or > really, avoid making --- such decisions. The reasons we took away > from previous experiments with semantics-determing GUCs were: > > 1. The scope of effect of a GUC is seldom what you want for such > things. There are going to be some queries in which you want behavior > A, and some in which you want behavior B, and in the worst case you > want different behaviors in different parts of the same query. It's > really painful to make that happen. > > 2. Tools that are not entitled to set the value of the GUC are forced > to be prepared to cope with any setting. That can be anywhere from > painful to impossible. Didn't that ship already sailed in pg14 when we allowed generating custom query_id? Now: > For the specific context of controlling how query jumbling happens, > there's still another problem: the actual statement-merging behavior of > pg_stat_statements will depend on the totality of settings of the GUC > across the entire system. It's not very clear to me what will happen > if different sessions use different settings, much less if people > change the setting intra-session; but I bet a lot of people will find > it surprising. 62d712ecf did no one any favors by marking that GUC > USERSET rather than something that would prevail system-wide. One of the requirement for the custom query_id was that you shouldn't be allowed to change the algorithm dynamically, at least not unless a superuser agrees to maybe break everything, which is why compute_query_id is marked as PGC_SUSET. I think that the same reasoning should apply here and if the GUC is kept it should be at least at that level.
Julien Rouhaud <rjuju123@gmail.com> writes:
> On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote:
>> 2. Tools that are not entitled to set the value of the GUC are forced
>> to be prepared to cope with any setting.  That can be anywhere from
>> painful to impossible.
> Didn't that ship already sailed in pg14 when we allowed generating custom
> query_id?
Up to a point, perhaps.  If I'm writing some kind of tool that digests
pg_stat_statements results, I think I'm entitled to disregard the
possibility that somebody is using a custom query_id that behaves in
ways I'm not expecting --- or at least, fixing my code for that is
their problem not mine.  But it's much harder to take that attitude
for things that are built into core PG.
>> For the specific context of controlling how query jumbling happens,
>> there's still another problem: the actual statement-merging behavior of
>> pg_stat_statements will depend on the totality of settings of the GUC
>> across the entire system.
> One of the requirement for the custom query_id was that you shouldn't be
> allowed to change the algorithm dynamically, at least not unless a superuser
> agrees to maybe break everything, which is why compute_query_id is marked as
> PGC_SUSET.  I think that the same reasoning should apply here and if the GUC is
> kept it should be at least at that level.
Fully agreed.  (Even SUSET is debatable in this situation, but I'm okay
with it on the principle that superusers are expected to know what
they're doing and accept the consequences.)
            regards, tom lane
			
		On Tue, Mar 25, 2025 at 01:17:22AM -0400, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: > > On Tue, Mar 25, 2025 at 12:32:05AM -0400, Tom Lane wrote: > >> 2. Tools that are not entitled to set the value of the GUC are forced > >> to be prepared to cope with any setting. That can be anywhere from > >> painful to impossible. > > > Didn't that ship already sailed in pg14 when we allowed generating custom > > query_id? > > Up to a point, perhaps. If I'm writing some kind of tool that digests > pg_stat_statements results, I think I'm entitled to disregard the > possibility that somebody is using a custom query_id that behaves in > ways I'm not expecting --- or at least, fixing my code for that is > their problem not mine. But it's much harder to take that attitude > for things that are built into core PG. I see, that's fair.
On Mon, Mar 24, 2025 at 8:51 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Mar 24, 2025 at 09:38:35PM -0500, Sami Imseih wrote:
> > select * from foo s1;
> > select * from foo s2;
>
> ah, thanks for pointing this out. Not as good of a workaround as
> a comment since one must change aliases, but at least there is
> a workaround...
Exactly. Like Tom I'm not really worried about the proposal, but of
course I could prove to be wrong. I am ready to assume that bloat in
pgss entries caused by temp tables is a more common case.
For what its worth, +1 on the current proposal in this thread (and doing it without a GUC), i.e. merging a query that references the same table alias, ignoring different schemas.
In the context of the earlier referenced one-schema-per-customer workloads:
In my experience these often not work well with pg_stat_statements today because of their own bloat problem, just like with temp tables. You quickly have way too many unique entries, and your query text file accumulates a lot of duplicative entries (since the same query text gets repeated in the text file, since its queryid is different), to the point that you can't monitor your workload at all anymore.
Thanks,
Lukas
Lukas Fittl
On Mon, Mar 24, 2025 at 11:09:06PM -0700, Lukas Fittl wrote: > For what its worth, +1 on the current proposal in this thread (and doing it > without a GUC), i.e. merging a query that references the same table alias, > ignoring different schemas. Thanks for the feedback. I have looked again at the first patch to add custom_query_jumble as a node field attribute, adjusted some comments, and applied it as 5ac462e2b7ac. Attached is the second one, with more tests coverage with attribute aliases (these being ignored exists in stable branches, but why not while on it) and table aliases, and the fixes for the issues pointed out by Christoph. I'll double-check all that again tomorrow. Please find an updated version attached for now. -- Michael
Вложения
> In my experience these often not work well with pg_stat_statements today because > of their own bloat problem, just like with temp tables. You quickly have way too many > unique entries, and your query text file accumulates a lot of duplicative entries > (since the same query text gets repeated in the text file, since its queryid is different), > to the point that you can't monitor your workload at all anymore. That is true in terms of pg_stat_statements, and I have seen users have to increase pg_stat_statements.max to something much higher to avoid the bloat and constant deallocs/GC. But, besides pg_stat_statements, queryId is used to group queries for database load monitoring ( pg_stat_activity sampling). As of now, different schemas are tracked separately, but with this change they will be merged. This may come as a surprise to use-cases that rely on the existing behavior. But I do agree that pg_s_s bloat is a big pain point, so this change should be positive overall. Let's see if there are enough complaints to force us to reconsider. -- Sami Imseih Amazon Web Services (AwS)
Re: Michael Paquier > Attached is the second one, with more tests coverage with attribute > aliases (these being ignored exists in stable branches, but why not > while on it) and table aliases, and the fixes for the issues pointed > out by Christoph. I'll double-check all that again tomorrow. Please > find an updated version attached for now. Looks good to me. Christoph
> > Attached is the second one, with more tests coverage with attribute
> > aliases (these being ignored exists in stable branches, but why not
> > while on it) and table aliases, and the fixes for the issues pointed
> > out by Christoph.  I'll double-check all that again tomorrow. Please
> > find an updated version attached for now.
There are several parts of the doc that may no longer hold true.
1/
"Since the queryid hash value is computed on the post-parse-analysis
representation of the queries, the opposite is also possible: queries with
identical texts might appear as separate entries, if they have different
meanings as a result of factors such as different search_path settings."
I think this text could remain as is, because search_path still
matters for things like functions, etc.
"""
postgres=# SET SEARCH_PATH=a;
SET
postgres=# explain verbose select * from test();
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 1
 Query Identifier: -1813735303617154554
(3 rows)
postgres=# SET SEARCH_PATH=b;
SET
postgres=# explain verbose select * from test();
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 1
 Query Identifier: -3896107319863686763
(3 rows)
"""
2/
"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a table that was dropped and
recreated between the executions of the two queries."
This is no longer true for relations, but is still true for functions. I think
we should mention the caveats in a bit more detail as this change
will have impact on the most common case. What about something
like this?
"For example, pg_stat_statements will consider two apparently-identical
queries to be distinct, if they reference a function that was dropped and
recreated between the executions of the two queries. Conversely, if a table is
dropped and recreated between the executions of queries, two
apparently-identical
queries will be considered the same. However, if the alias for a table is
different for semantically similar queries, these queries will be
considered distinct"
--
Sami Imseih
Amazon Web Services (AWS)
			
		On Tue, Mar 25, 2025 at 11:58:21AM -0500, Sami Imseih wrote: > "Since the queryid hash value is computed on the post-parse-analysis > representation of the queries, the opposite is also possible: queries with > identical texts might appear as separate entries, if they have different > meanings as a result of factors such as different search_path settings." > > I think this text could remain as is, because search_path still > matters for things like functions, etc. Yeah, I think that's OK as-is. I'm open to more improvements, including more tests for these function patterns. It's one of these areas where we should be able to tweak RangeTblFunction and apply a custom function to its funcexpr, and please note that I have no idea how complex it could become as this is a Node expression. :D Functions in a temporary schema is not something as common as temp tables, I guess, so these matter less, but they would still be a cause of bloat for monitoring in very specific workloads. > 2/ > "For example, pg_stat_statements will consider two apparently-identical > queries to be distinct, if they reference a table that was dropped and > recreated between the executions of the two queries." > > This is no longer true for relations, but is still true for functions. I think > we should mention the caveats in a bit more detail as this change > will have impact on the most common case. What about something > like this? > > "For example, pg_stat_statements will consider two apparently-identical > queries to be distinct, if they reference a function that was dropped and > recreated between the executions of the two queries. That's a bit larger than functions, but we could remain a bit more evasive, with "if they reference *for example* a function that was dropped and recreated between the executions of the two queries". Note that for DDLs, like CREATE TABLE, we also group entries with identical relation names, so we are kind of in line with the patch, not with the current docs. > Conversely, if a table is dropped and recreated between the > executions of queries, two apparently-identical queries may be > considered the same. However, if the alias for a table is different > for semantically similar queries, these queries will be considered distinct" This addition sounds like an improvement here. As this thread has proved, we had little coverage these cases in pgss, so I've applied the tests as an independent change. It is also useful to track how things change in the commit history depending on how the computation is tweaked. I've also included your doc suggestions. I feel that we could do better here, but that's a common statement anyway when it comes to the documentation. -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes:
> [ v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patch ]
Couple of minor review comments ...
* In 5ac462e2b, this bit:
        # node type
-       if (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
+       if ($query_jumble_custom)
+       {
+           # Custom function that applies to one field of a node.
+           print $jff "\tJUMBLE_CUSTOM($n, $f);\n";
+       }
+       elsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/)
fails to honor $query_jumble_ignore as the other if-branches do.
Perhaps it's unlikely that a node would have both query_jumble_custom
and query_jumble_ignore annotations, but still, the script would emit
completely incorrect code if it did.  Also, the "# node type" comment
should probably be moved down to within the first "elsif" block.
* In the v6 patch, this doesn't read very smoothly:
+     * Expanded reference names.  This uses a custom query jumble function so
+     * as the table name is included in the computation, not its list of
+     * columns.
Perhaps better
+     * Expanded reference names.  This uses a custom query jumble function so
+     * that the table name is included in the computation, but not its list of
+     * columns.
* Also, here:
+   considered the same. However, if the alias for a table is different
+   for semantically similar queries, these queries will be considered
+   distinct.
I'd change "semantically similar queries" to "otherwise-similar
queries"; I think writing "semantically" will just confuse people.
Otherwise LGTM.
            regards, tom lane
			
		On Tue, Mar 25, 2025 at 07:24:20PM -0400, Tom Lane wrote: > fails to honor $query_jumble_ignore as the other if-branches do. > Perhaps it's unlikely that a node would have both query_jumble_custom > and query_jumble_ignore annotations, but still, the script would emit > completely incorrect code if it did. Also, the "# node type" comment > should probably be moved down to within the first "elsif" block. Oops, sorry about that. Fixed both of these in 27ee6ede6bc9. > I'd change "semantically similar queries" to "otherwise-similar > queries"; I think writing "semantically" will just confuse people. If I get the difference right, semantically would apply to concepts related to linguistics, but that's not what we have here, so you are using a more general term. Thanks for the suggestions. -- Michael
Вложения
> If I get the difference right, semantically would apply to concepts > related to linguistics, but that's not what we have here, so you are > using a more general term. FWIW, the pg_stat_statements docs in a few places refer to queries that may look different but have the same meaning as “semantically equivalent”, this is why I used the same terminology here. But, I have no issue with the simplified rewrite either. The patch LGTM as well. Regards, Sami Imseih Amazon Web Services (AWS)
On Tue, Mar 25, 2025 at 07:56:29PM -0500, Sami Imseih wrote: > FWIW, the pg_stat_statements docs in a few places refer to > queries that may look different but have the same meaning > as “semantically equivalent”, this is why I used the same > terminology here. But, I have no issue with the simplified > rewrite either. > > The patch LGTM as well. If any adjustments are required, it would be always possible to do these later. Anyway, workloads with a lot of temporary tables are going to benefit from this change, so let's see how it goes. Applied, after eyeing much more the PGSS dumps from installcheck before and after the patch, to make sure that I'm not missing something.. -- Michael
Вложения
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
От
 
		    	Alexander Kukushkin
		    Дата:
		        Hi,
I totally understand the wish to make pg_stat_statements useful for workloads that create/drop a ton of temporary tables.
However, when pursuing this goal we impacted other types of totally valid workloads when tables with the same name exist in different schemas.
Example:
create schema s1;
create table s1.t as select id from generate_series(1, 10) as id;
create schema s2;
create table s1.t as select id from generate_series(1, 1000000) as id;
create table s1.t as select id from generate_series(1, 1000000) as id;
select count(id) from s1.t;
select count(id) from s2.t;
select * from pg_stat_statements;
userid | 10
dbid | 5
toplevel | t
queryid | -8317141500049987426
query | select count(id) from s1.t
plans | 0
total_plan_time | 0
min_plan_time | 0
max_plan_time | 0
mean_plan_time | 0
stddev_plan_time | 0
calls | 2
total_exec_time | 22.577107
min_exec_time | 0.325021
max_exec_time | 22.252086000000002
mean_exec_time | 11.2885535
stddev_exec_time | 10.963532500000001
rows | 2
shared_blks_hit | 4425
userid | 10
dbid | 5
toplevel | t
queryid | -8317141500049987426
query | select count(id) from s1.t
plans | 0
total_plan_time | 0
min_plan_time | 0
max_plan_time | 0
mean_plan_time | 0
stddev_plan_time | 0
calls | 2
total_exec_time | 22.577107
min_exec_time | 0.325021
max_exec_time | 22.252086000000002
mean_exec_time | 11.2885535
stddev_exec_time | 10.963532500000001
rows | 2
shared_blks_hit | 4425
That is, two different queries, accessing two absolutely different tables (one of them has 100000 times more rows!) were merged together.
Regards,
--
Alexander Kukushkin
On Tue, Jul 15, 2025 at 04:48:05PM +0200, Alexander Kukushkin wrote: > I totally understand the wish to make pg_stat_statements useful for > workloads that create/drop a ton of temporary tables. > However, when pursuing this goal we impacted other types of totally valid > workloads when tables with the same name exist in different schemas. > Example: > create schema s1; > create table s1.t as select id from generate_series(1, 10) as id; > create schema s2; > create table s1.t as select id from generate_series(1, 1000000) as id; I suspect that you mean s2.t and not s1.t here. > select count(id) from s1.t; > select count(id) from s2.t; > > That is, two different queries, accessing two absolutely different tables > (one of them has 100000 times more rows!) were merged together. Yes, we had this argument upthread, and it is still possible to differentiate both cases by using a different alias in the FROM clause, as of: select count(id) from s1.t as t1; select count(id) from s2.t as t2; The new behavior where we do not need to worry about temporary tables, which is not that uncommon because some workloads like using these for JOIN patterns as a "temporary" anchor in a session, has more benefits IMO, particularly more if the connections have a rather higher turnover. -- Michael
Вложения
Re: query_id: jumble names of temp tables for better pg_stat_statement UX
От
 
		    	Alexander Kukushkin
		    Дата:
		        On Wed, 16 Jul 2025 at 01:39, Michael Paquier <michael@paquier.xyz> wrote:
> create schema s1;
> create table s1.t as select id from generate_series(1, 10) as id;
> create schema s2;
> create table s1.t as select id from generate_series(1, 1000000) as id;
I suspect that you mean s2.t and not s1.t here.
Yes. 
Yes, we had this argument upthread, and it is still possible to
differentiate both cases by using a different alias in the FROM
clause, as of:
select count(id) from s1.t as t1;
select count(id) from s2.t as t2;
The new behavior where we do not need to worry about temporary tables,
which is not that uncommon because some workloads like using these for
JOIN patterns as a "temporary" anchor in a session, has more benefits
IMO, particularly more if the connections have a rather higher
turnover.
Yes, I've seen this argument and know that aliases will make these queries look different.
However, we regularly hear from many different customers that they *don't control queries* sent by application or *can't modify these queries*.
Such kinds of workloads are also not that uncommon and this change makes it impossible to monitor them.
I would somewhat understand when a table in the query is used without specifying schema and such queries are merged together:
s1: SET search_path s1; select count(*) from t;
s2: SET search_path s2; select count(*) from t;
But, even this case doesn't feel right, because these tables are still different and therefore queries.
Regards,
--
Alexander Kukushkin
On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:
However, we regularly hear from many different customers that they *don't control queries* sent by application or *can't modify these queries*.Such kinds of workloads are also not that uncommon and this change makes it impossible to monitor them.
For the workloads you are thinking of, are these "one customer per schema" multi-tenant workloads, or something else?
I mentioned this earlier in the discussion (when supporting the change that was done), but the main challenge I've seen is that for "one customer per schema" workloads, pg_stat_statements just doesn't work today, unless you have only a handful of customers on a server.
Once you have anything close to 100 or more customer schemas on a server, the churn on the entries makes pg_stat_statements unusable (even with a high max), especially with the current way the query text file works, since you can't reliably read from pg_stat_statements anymore without incurring a read from a 100MB+ query text file.
So I agree this change reduces the visibility into which of the schemas had a slow query, but it at least allows reliably using pg_stat_statements to narrow down which query / part of an application is problematic. To get specifics on the schema, one could then use other means (e.g. log_min_duration_statement, auto_explain, etc) to get exact details, grepping the logfile for the query ID retrieved from pg_stat_statements.
Thanks,
Lukas
Lukas Fittl