Обсуждение: pg_stat_get_backend_subxact() and backend IDs?
Hi
I was playing around with "pg_stat_get_backend_subxact()" (commit 10ea0f924)
and see it emits NULL values for some backends, e.g.:
    postgres=# \pset null NULL
    Null display is "NULL".
    postgres=#  SELECT id, pg_stat_get_backend_pid(id), s.*,
                       pg_stat_get_backend_activity (id)
                  FROM pg_stat_get_backend_idset() id
               JOIN LATERAL pg_stat_get_backend_subxact(id) AS s ON TRUE;
     id  | pg_stat_get_backend_pid | subxact_count |
subxact_overflowed |                pg_stat_get_backend_activity
-----+-------------------------+---------------+--------------------+------------------------------------------------------------
       1 |                 3175972 |             0 | f
 | <command string not enabled>
       2 |                 3175973 |             0 | f
 | <command string not enabled>
       3 |                 3177889 |             0 | f
 | SELECT id, pg_stat_get_backend_pid(id), s.*,              +
         |                         |               |
 |         pg_stat_get_backend_activity (id)                 +
         |                         |               |
 |    FROM pg_stat_get_backend_idset() id                    +
         |                         |               |
 | JOIN LATERAL pg_stat_get_backend_subxact(id) AS s ON TRUE;
       4 |                 3176027 |             5 | f
 | savepoint s4;
     256 |                 3175969 |          NULL | NULL
 | <command string not enabled>
     258 |                 3175968 |          NULL | NULL
 | <command string not enabled>
     259 |                 3175971 |          NULL | NULL
 | <command string not enabled>
    (7 rows)
Reading through the thread [1], it looks like 0/false are intended to be
returned for non-backend processes too [2], so it seems odd that NULL/NULL is
getting returned in some cases, especially as that's what's returned if a
non-existent backend ID is provided.
[1]
https://www.postgresql.org/message-id/flat/CAFiTN-uvYAofNRaGF4R%2Bu6_OrABdkqNRoX7V6%2BPP3H_0HuYMwg%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/flat/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN%3DQ%40mail.gmail.com#821f6f40e91314066390efd06d71d5ac
Looking at the code, this is happening because
"pgstat_fetch_stat_local_beentry()"
expects to be passed the backend ID as an integer representing a 1-based index
referring to "localBackendStatusTable", but "pg_stat_get_backend_subxact()"
is presumably intended to take the actual BackendId , as per other
"pg_stat_get_XXX()"
functions.
Also, the comment for "pgstat_fetch_stat_local_beentry()" says:
   Returns NULL if the argument is out of range (no current caller does that).
so the last part is currently incorrect.
Assuming I am not misunderstanding something here (always a
possibility, apologies
in advance if this is merely noise), what is actually needed is a function which
accepts a BackendId (as per "pgstat_fetch_stat_beentry()"), but returns a
LocalPgBackendStatus (as per "pgstat_fetch_stat_local_beentry()") like the
attached, clumsily named "pgstat_fetch_stat_backend_local_beentry()".
Regards
Ian Barwick
			
		Вложения
On Thu, Aug 24, 2023 at 10:22:49AM +0900, Ian Lawrence Barwick wrote:
> Looking at the code, this is happening because
> "pgstat_fetch_stat_local_beentry()"
> expects to be passed the backend ID as an integer representing a 1-based index
> referring to "localBackendStatusTable", but "pg_stat_get_backend_subxact()"
> is presumably intended to take the actual BackendId , as per other
> "pg_stat_get_XXX()"
> functions.
Yes, this was changed in d7e39d7, but 10ea0f9 seems to have missed the
memo.
> Assuming I am not misunderstanding something here (always a
> possibility, apologies
> in advance if this is merely noise), what is actually needed is a function which
> accepts a BackendId (as per "pgstat_fetch_stat_beentry()"), but returns a
> LocalPgBackendStatus (as per "pgstat_fetch_stat_local_beentry()") like the
> attached, clumsily named "pgstat_fetch_stat_backend_local_beentry()".
I think you are right.  The relevant information is only available in
LocalPgBackendStatus, but there's presently no helper function for
obtaining the "local" status with the BackendId.
> +LocalPgBackendStatus *
> +pgstat_fetch_stat_backend_local_beentry(BackendId beid)
> +{
> +    LocalPgBackendStatus key;
> +
> +    pgstat_read_current_status();
> +
> +    /*
> +     * Since the localBackendStatusTable is in order by backend_id, we can use
> +     * bsearch() to search it efficiently.
> +     */
> +    key.backend_id = beid;
> +
> +    return (LocalPgBackendStatus *) bsearch(&key, localBackendStatusTable,
> +                                            localNumBackends,
> +                                            sizeof(LocalPgBackendStatus),
> +                                            cmp_lbestatus);
> +}
We could probably modify pgstat_fetch_stat_beentry() to use this new
function.  I suspect we'll want to work on the naming, too.  Maybe we could
name them pg_stat_fetch_local_beentry_by_index() and
pg_stat_fetch_local_beentry_by_backendid().
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Wed, Aug 23, 2023 at 07:32:06PM -0700, Nathan Bossart wrote: > On Thu, Aug 24, 2023 at 10:22:49AM +0900, Ian Lawrence Barwick wrote: >> Looking at the code, this is happening because >> "pgstat_fetch_stat_local_beentry()" >> expects to be passed the backend ID as an integer representing a 1-based index >> referring to "localBackendStatusTable", but "pg_stat_get_backend_subxact()" >> is presumably intended to take the actual BackendId , as per other >> "pg_stat_get_XXX()" >> functions. > > Yes, this was changed in d7e39d7, but 10ea0f9 seems to have missed the > memo. BTW I'd argue that this is a bug in v16 that we should try to fix before GA, so I've added an open item [0]. I assigned it to Robert (CC'd) since he was the committer, but I'm happy to pick it up. [0] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Open_Issues -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Aug 23, 2023 at 07:51:40PM -0700, Nathan Bossart wrote: > On Wed, Aug 23, 2023 at 07:32:06PM -0700, Nathan Bossart wrote: >> On Thu, Aug 24, 2023 at 10:22:49AM +0900, Ian Lawrence Barwick wrote: >>> Looking at the code, this is happening because >>> "pgstat_fetch_stat_local_beentry()" >>> expects to be passed the backend ID as an integer representing a 1-based index >>> referring to "localBackendStatusTable", but "pg_stat_get_backend_subxact()" >>> is presumably intended to take the actual BackendId , as per other >>> "pg_stat_get_XXX()" >>> functions. >> >> Yes, this was changed in d7e39d7, but 10ea0f9 seems to have missed the >> memo. > > BTW I'd argue that this is a bug in v16 that we should try to fix before > GA, so I've added an open item [0]. I assigned it to Robert (CC'd) since > he was the committer, but I'm happy to pick it up. Since RC1 is fast approaching, I put together a revised patch set. 0001 renames the existing pgstat_fetch_stat* functions, and 0002 adds pgstat_get_local_beentry_by_backend_id() and uses it for pg_stat_get_backend_subxact(). Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Вложения
2023年8月25日(金) 1:19 Nathan Bossart <nathandbossart@gmail.com>: > > On Wed, Aug 23, 2023 at 07:51:40PM -0700, Nathan Bossart wrote: > > On Wed, Aug 23, 2023 at 07:32:06PM -0700, Nathan Bossart wrote: > >> On Thu, Aug 24, 2023 at 10:22:49AM +0900, Ian Lawrence Barwick wrote: > >>> Looking at the code, this is happening because > >>> "pgstat_fetch_stat_local_beentry()" > >>> expects to be passed the backend ID as an integer representing a 1-based index > >>> referring to "localBackendStatusTable", but "pg_stat_get_backend_subxact()" > >>> is presumably intended to take the actual BackendId , as per other > >>> "pg_stat_get_XXX()" > >>> functions. > >> > >> Yes, this was changed in d7e39d7, but 10ea0f9 seems to have missed the > >> memo. > > > > BTW I'd argue that this is a bug in v16 that we should try to fix before > > GA, so I've added an open item [0]. I assigned it to Robert (CC'd) since > > he was the committer, but I'm happy to pick it up. > > Since RC1 is fast approaching, I put together a revised patch set. 0001 > renames the existing pgstat_fetch_stat* functions, and 0002 adds > pgstat_get_local_beentry_by_backend_id() and uses it for > pg_stat_get_backend_subxact(). Thoughts? Thanks for looking at this. In summary we now have these functions: extern PgBackendStatus *pgstat_get_beentry_by_backend_id(BackendId beid); extern LocalPgBackendStatus *pgstat_get_local_beentry_by_backend_id(BackendId beid); extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); which LGTM; patches work as expected and resolve the reported issue. Regards Ian Barwick
I tested the patch and it does the correct thing. I have a few comments: 1/ cast the return of bsearch. This was done previously and is the common convention in the code. So + return bsearch(&key, localBackendStatusTable, localNumBackends, + sizeof(LocalPgBackendStatus), cmp_lbestatus); Should be + return (LocalPgBackendStatus *) bsearch(&key, localBackendStatusTable, localNumBackends, + sizeof(LocalPgBackendStatus), cmp_lbestatus); 2/ This will probably be a good time to update the docs for pg_stat_get_backend_subxact [1] to call out that "subxact_count" will "only increase if a transaction is performing writes". Also to link the reader to the subtransactions doc [2]. 1. https://www.postgresql.org/docs/16/monitoring-stats.html#WAIT-EVENT-TIMEOUT-TABLE 2. https://www.postgresql.org/docs/16/subxacts.html Regards, Sami Imseih Amazon Web Services (AWS)
On Fri, Aug 25, 2023 at 09:36:18AM +0900, Ian Lawrence Barwick wrote: > Thanks for looking at this. In summary we now have these functions: > > extern PgBackendStatus *pgstat_get_beentry_by_backend_id(BackendId beid); > extern LocalPgBackendStatus > *pgstat_get_local_beentry_by_backend_id(BackendId beid); > extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); > > which LGTM; patches work as expected and resolve the reported issue. On second thought, renaming these exported functions so close to release is probably not a great idea. I should probably skip back-patching that one. Or I could have the existing functions call the new ones in v16 for backward compatibility... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Aug 25, 2023 at 03:01:40PM +0000, Imseih (AWS), Sami wrote: > 1/ cast the return of bsearch. This was done previously and is the common > convention in the code. Will do. > 2/ This will probably be a good time to update the docs for pg_stat_get_backend_subxact [1] > to call out that "subxact_count" will "only increase if a transaction is performing writes". Also to link > the reader to the subtransactions doc [2]. I'd rather keep this patch focused on fixing the bug, given we are so close to the v16 release. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Aug 25, 2023 at 08:32:51AM -0700, Nathan Bossart wrote: > On second thought, renaming these exported functions so close to release is > probably not a great idea. I should probably skip back-patching that one. > Or I could have the existing functions call the new ones in v16 for > backward compatibility... Here is a new version of the patch that avoids changing the names of the existing functions. I'm not thrilled about the name (pgstat_fetch_stat_local_beentry_by_backend_id), so I am open to suggestions. In any case, I'd like to rename all three of the pgstat_fetch_stat_* functions in v17. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Вложения
> Here is a new version of the patch that avoids changing the names of the > existing functions. I'm not thrilled about the name > (pgstat_fetch_stat_local_beentry_by_backend_id), so I am open to > suggestions. In any case, I'd like to rename all three of the> > pgstat_fetch_stat_* functions in v17. Thanks for the updated patch. I reviewed/tested the latest version and I don't have any more comments. Regards, Sami
On Fri, Aug 25, 2023 at 10:56:14PM +0000, Imseih (AWS), Sami wrote:
> > Here is a new version of the patch that avoids changing the names of the
> > existing functions. I'm not thrilled about the name
> > (pgstat_fetch_stat_local_beentry_by_backend_id), so I am open to
> > suggestions. In any case, I'd like to rename all three of the>
> > pgstat_fetch_stat_* functions in v17.
>
> Thanks for the updated patch.
>
> I reviewed/tested the latest version and I don't have any more comments.
FWIW, I find the new routine introduced by this patch rather
confusing.  pgstat_fetch_stat_local_beentry() and
pgstat_fetch_stat_local_beentry_by_backend_id() use the same
argument name for a BackendId or an int.  This is not entirely the
fault of this patch as pg_stat_get_backend_subxact() itself is
confused about "beid" being a uint32 or a BackendId.  However, I think
that this makes much harder to figure out that
pgstat_fetch_stat_local_beentry() is only here because it is cheaper
to do sequential scan of all the local beentries rather than a
bsearch() for all its callers, while
pgstat_fetch_stat_local_beentry_by_backend_id() is here because we
want to retrieve the local beentry matching with the *backend ID* with
the binary search().
I understand that this is not a fantastic naming, but renaming
pgstat_fetch_stat_local_beentry() to something like
pgstat_fetch_stat_local_beentry_by_{index|position}_id() would make
the difference much easier to grasp, and we should avoid the use of
"beid" when we refer to the *position/index ID* in
localBackendStatusTable, because it is not a BackendId at all, just a
position in the local array.
--
Michael
			
		Вложения
On Mon, Aug 28, 2023 at 10:53:52AM +0900, Michael Paquier wrote:
> I understand that this is not a fantastic naming, but renaming
> pgstat_fetch_stat_local_beentry() to something like
> pgstat_fetch_stat_local_beentry_by_{index|position}_id() would make
> the difference much easier to grasp, and we should avoid the use of
> "beid" when we refer to the *position/index ID* in
> localBackendStatusTable, because it is not a BackendId at all, just a
> position in the local array.
This was my first reaction [0].  I was concerned about renaming the
exported functions so close to release, so I was suggesting that we hold
off on that part until v17.  If there isn't a concern with renaming these
functions in v16, I can proceed with something more like v2.
[0] https://postgr.es/m/20230824161913.GA1394441%40nathanxps13.lan
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
			
		On Tue, Aug 29, 2023 at 09:46:55AM -0700, Nathan Bossart wrote: > This was my first reaction [0]. I was concerned about renaming the > exported functions so close to release, so I was suggesting that we hold > off on that part until v17. If there isn't a concern with renaming these > functions in v16, I can proceed with something more like v2. Thanks for the pointer. This version is much better IMO, because it removes entirely the source of the confusion between the difference in backend ID and index ID treatments when fetching the local entries in the array. So I'm okay to rename these functions now, before .0 is released to get things in a better shape while addressing the issue reported. +extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); Still I would to a bit more of s/beid/id/ for cases where the code refers to an index ID, and not a backend ID, especially for the internal routines. -- Michael
Вложения
On Wed, Aug 30, 2023 at 08:22:27AM +0900, Michael Paquier wrote: > On Tue, Aug 29, 2023 at 09:46:55AM -0700, Nathan Bossart wrote: >> This was my first reaction [0]. I was concerned about renaming the >> exported functions so close to release, so I was suggesting that we hold >> off on that part until v17. If there isn't a concern with renaming these >> functions in v16, I can proceed with something more like v2. > > Thanks for the pointer. This version is much better IMO, because it > removes entirely the source of the confusion between the difference in > backend ID and index ID treatments when fetching the local entries in > the array. So I'm okay to rename these functions now, before .0 is > released to get things in a better shape while addressing the issue > reported. Okay. > +extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); > > Still I would to a bit more of s/beid/id/ for cases where the code > refers to an index ID, and not a backend ID, especially for the > internal routines. Makes sense. I did this in v4. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Вложения
On Tue, Aug 29, 2023 at 07:01:51PM -0700, Nathan Bossart wrote: > On Wed, Aug 30, 2023 at 08:22:27AM +0900, Michael Paquier wrote: >> +extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); >> >> Still I would to a bit more of s/beid/id/ for cases where the code >> refers to an index ID, and not a backend ID, especially for the >> internal routines. > > Makes sense. I did this in v4. Yep, that looks more consistent, at quick glance. -- Michael
Вложения
On Wed, Aug 30, 2023 at 12:13 AM Michael Paquier <michael@paquier.xyz> wrote: > Yep, that looks more consistent, at quick glance. Sorry, I'm only just noticing this thread. Thanks, Nathan, Ian, and others, for your work on this. Apart from hoping that the 0002 patch will get a more detailed commit message spelling out the problem very explicitly, I don't have any comments on the proposed patches. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Aug 30, 2023 at 09:50:41AM -0400, Robert Haas wrote: > Sorry, I'm only just noticing this thread. Thanks, Nathan, Ian, and > others, for your work on this. Apart from hoping that the 0002 patch > will get a more detailed commit message spelling out the problem very > explicitly, I don't have any comments on the proposed patches. I'm about to spend way too much time writing the commit message for 0002, but I plan to commit both patches sometime today. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Aug 30, 2023 at 10:27 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > On Wed, Aug 30, 2023 at 09:50:41AM -0400, Robert Haas wrote: > > Sorry, I'm only just noticing this thread. Thanks, Nathan, Ian, and > > others, for your work on this. Apart from hoping that the 0002 patch > > will get a more detailed commit message spelling out the problem very > > explicitly, I don't have any comments on the proposed patches. > > I'm about to spend way too much time writing the commit message for 0002, > but I plan to commit both patches sometime today. Thanks! I'm glad your committing the patches, and I approve of you spending way too much time on the commit message. :-) -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Aug 30, 2023 at 10:56:22AM -0400, Robert Haas wrote: > On Wed, Aug 30, 2023 at 10:27 AM Nathan Bossart > <nathandbossart@gmail.com> wrote: >> I'm about to spend way too much time writing the commit message for 0002, >> but I plan to commit both patches sometime today. > > Thanks! I'm glad your committing the patches, and I approve of you > spending way too much time on the commit message. :-) Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Aug 30, 2023 at 02:56:48PM -0700, Nathan Bossart wrote: > Committed. Cool, thanks! -- Michael
Вложения
On Thu, Aug 31, 2023 at 4:38 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Wed, Aug 30, 2023 at 10:56:22AM -0400, Robert Haas wrote: > > On Wed, Aug 30, 2023 at 10:27 AM Nathan Bossart > > <nathandbossart@gmail.com> wrote: > >> I'm about to spend way too much time writing the commit message for 0002, > >> but I plan to commit both patches sometime today. > > > > Thanks! I'm glad your committing the patches, and I approve of you > > spending way too much time on the commit message. :-) > > Committed. Sorry, I didn't notice this thread earlier. The new behavior looks better to me, thanks for working on it. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
2023年8月31日(木) 6:56 Nathan Bossart <nathandbossart@gmail.com>: > > On Wed, Aug 30, 2023 at 10:56:22AM -0400, Robert Haas wrote: > > On Wed, Aug 30, 2023 at 10:27 AM Nathan Bossart > > <nathandbossart@gmail.com> wrote: > >> I'm about to spend way too much time writing the commit message for 0002, > >> but I plan to commit both patches sometime today. > > > > Thanks! I'm glad your committing the patches, and I approve of you > > spending way too much time on the commit message. :-) > > Committed. Thanks for taking care of this (saw the commits, then got distracted by something else and forgot to follow up). Regards Ian Barwick