Thread: Fix pg_stat_get_backend_wait_event() for aux processes

Fix pg_stat_get_backend_wait_event() for aux processes

From
Heikki Linnakangas
Date:
pg_stat_get_backend_wait_event() and 
pg_stat_get_backend_wait_event_type() functions don't work for aux 
processes:

> postgres=# select pid, backend_type, wait_event, wait_event_type from pg_stat_activity ;
>    pid   |         backend_type         |     wait_event      | wait_event_type 
> ---------+------------------------------+---------------------+-----------------
>  3665058 | client backend               |                     | 
>  3665051 | autovacuum launcher          | AutovacuumMain      | Activity
>  3665052 | logical replication launcher | LogicalLauncherMain | Activity
>  3665044 | io worker                    | IoWorkerMain        | Activity
>  3665045 | io worker                    | IoWorkerMain        | Activity
>  3665046 | io worker                    | IoWorkerMain        | Activity
>  3665047 | checkpointer                 | CheckpointerMain    | Activity
>  3665048 | background writer            | BgwriterMain        | Activity
>  3665050 | walwriter                    | WalWriterMain       | Activity
> (9 rows)
> 
> postgres=# SELECT pg_stat_get_backend_pid(backendid) AS pid,
>        pg_stat_get_backend_wait_event_type(backendid) as wait_event_type,
>        pg_stat_get_backend_wait_event(backendid) as wait_event
> FROM pg_stat_get_backend_idset() AS backendid;
>    pid   | wait_event_type |     wait_event      
> ---------+-----------------+---------------------
>  3665058 |                 | 
>  3665051 | Activity        | AutovacuumMain
>  3665052 | Activity        | LogicalLauncherMain
>  3665044 |                 | 
>  3665045 |                 | 
>  3665046 |                 | 
>  3665047 |                 | 
>  3665048 |                 | 
>  3665050 |                 | 
> (9 rows)

We added aux processes to pg_stat_activity in commit fc70a4b0df, but 
apparently forgot to do the same for those functions.

With the attached fix:

> postgres=# SELECT pg_stat_get_backend_pid(backendid) AS pid,
>        pg_stat_get_backend_wait_event_type(backendid) as wait_event_type,
>        pg_stat_get_backend_wait_event(backendid) as wait_event
> FROM pg_stat_get_backend_idset() AS backendid;
>    pid   | wait_event_type |     wait_event      
> ---------+-----------------+---------------------
>  3667552 |                 | 
>  3667545 | Activity        | AutovacuumMain
>  3667546 | Activity        | LogicalLauncherMain
>  3667538 | Activity        | IoWorkerMain
>  3667539 | Activity        | IoWorkerMain
>  3667540 | Activity        | IoWorkerMain
>  3667541 | Activity        | CheckpointerMain
>  3667542 | Activity        | BgwriterMain
>  3667544 | Activity        | WalWriterMain
> (9 rows)

While looking at this, I noticed that pg_stat_activity has a 
"backend_type" field, but there's no corresponding 
"pg_stat_get_backend_type(backend_id)" function, similar to 
"pg_stat_get_backend_wait_event(backend_id)" et al. I wonder if that was 
on purpose, or we just forgot to add it when we added it to 
pg_stat_activity?

Another thing I didn't do in this patch yet: I feel we should replace 
BackendPidGetProc() with a function like "PGPROC *PidGetPGProc(pid_t)", 
that would work for backends and aux processes alike. It's a common 
pattern to call BackendPidGetProc() followed by AuxiliaryPidGetProc() 
currently. Even for the callers that specifically want to only check 
backend processes, I think it would be more natural to call 
PidGetPGProc(), and then check the process type.
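
A minimal sketch of what such a unified lookup could look like. Everything here is a simplified stand-in: the PGPROC struct, the slot array, the is_aux flag, and register_proc() are hypothetical and exist only for illustration, and PidGetPGProc() is the proposed name from this message, not an existing function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Simplified stand-in for PostgreSQL's PGPROC; NOT the real definition. */
typedef struct PGPROC
{
    pid_t       pid;            /* 0 means the slot is unused */
    bool        is_aux;         /* hypothetical flag: true for aux processes */
} PGPROC;

#define NUM_BACKEND_SLOTS 4
#define NUM_AUX_SLOTS     2
#define TOTAL_SLOTS       (NUM_BACKEND_SLOTS + NUM_AUX_SLOTS)

/* Assumption: one array, backend slots first, aux slots after them. */
PGPROC      all_procs[TOTAL_SLOTS];

/* Hypothetical setup helper so the sketch can be exercised. */
void
register_proc(size_t slot, pid_t pid)
{
    assert(slot < TOTAL_SLOTS);
    all_procs[slot].pid = pid;
    all_procs[slot].is_aux = (slot >= NUM_BACKEND_SLOTS);
}

/*
 * Proposed helper: return the slot owned by 'pid', whether it belongs to a
 * regular backend or an aux process, or NULL if no slot matches.
 */
PGPROC *
PidGetPGProc(pid_t pid)
{
    if (pid == 0)
        return NULL;            /* never match unused slots */

    for (size_t i = 0; i < TOTAL_SLOTS; i++)
    {
        if (all_procs[i].pid == pid)
            return &all_procs[i];
    }
    return NULL;
}

/* A caller that only wants regular backends filters on the process type. */
PGPROC *
lookup_backend_only(pid_t pid)
{
    PGPROC     *proc = PidGetPGProc(pid);

    return (proc != NULL && !proc->is_aux) ? proc : NULL;
}
```

With this shape, callers that care about the process type make that check explicit instead of encoding it in the choice of lookup function.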

- Heikki

Attachments

Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Chao Li
Date:

> On Feb 2, 2026, at 22:38, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> pg_stat_get_backend_wait_event() and pg_stat_get_backend_wait_event_type() functions don't work for aux processes:
>
>> postgres=# select pid, backend_type, wait_event, wait_event_type from pg_stat_activity ;
>>    pid   |         backend_type         |     wait_event      | wait_event_type
>> ---------+------------------------------+---------------------+-----------------
>>  3665058 | client backend               |                     |
>>  3665051 | autovacuum launcher          | AutovacuumMain      | Activity
>>  3665052 | logical replication launcher | LogicalLauncherMain | Activity
>>  3665044 | io worker                    | IoWorkerMain        | Activity
>>  3665045 | io worker                    | IoWorkerMain        | Activity
>>  3665046 | io worker                    | IoWorkerMain        | Activity
>>  3665047 | checkpointer                 | CheckpointerMain    | Activity
>>  3665048 | background writer            | BgwriterMain        | Activity
>>  3665050 | walwriter                    | WalWriterMain       | Activity
>> (9 rows)
>>
>> postgres=# SELECT pg_stat_get_backend_pid(backendid) AS pid,
>>        pg_stat_get_backend_wait_event_type(backendid) as wait_event_type,
>>        pg_stat_get_backend_wait_event(backendid) as wait_event
>> FROM pg_stat_get_backend_idset() AS backendid;
>>    pid   | wait_event_type |     wait_event
>> ---------+-----------------+---------------------
>>  3665058 |                 |
>>  3665051 | Activity        | AutovacuumMain
>>  3665052 | Activity        | LogicalLauncherMain
>>  3665044 |                 |
>>  3665045 |                 |
>>  3665046 |                 |
>>  3665047 |                 |
>>  3665048 |                 |
>>  3665050 |                 |
>> (9 rows)
>
> We added aux processes to pg_stat_activity in commit fc70a4b0df, but apparently forgot to do the same for those functions.
>
> With the attached fix:
>
>> postgres=# SELECT pg_stat_get_backend_pid(backendid) AS pid,
>>       pg_stat_get_backend_wait_event_type(backendid) as wait_event_type,
>>       pg_stat_get_backend_wait_event(backendid) as wait_event
>> FROM pg_stat_get_backend_idset() AS backendid;
>>    pid   | wait_event_type |     wait_event
>> ---------+-----------------+---------------------
>>  3667552 |                 |
>>  3667545 | Activity        | AutovacuumMain
>>  3667546 | Activity        | LogicalLauncherMain
>>  3667538 | Activity        | IoWorkerMain
>>  3667539 | Activity        | IoWorkerMain
>>  3667540 | Activity        | IoWorkerMain
>>  3667541 | Activity        | CheckpointerMain
>>  3667542 | Activity        | BgwriterMain
>>  3667544 | Activity        | WalWriterMain
>> (9 rows)
>
> While looking at this, I noticed that pg_stat_activity has a "backend_type" field, but there's no corresponding "pg_stat_get_backend_type(backend_id)" function, similar to "pg_stat_get_backend_wait_event(backend_id)" et al. I wonder if that was on purpose, or we just forgot to add it when we added it to pg_stat_activity?
>
> Another thing I didn't do in this patch yet: I feel we should replace BackendPidGetProc() with a function like "PGPROC *PidGetPGProc(pid_t)", that would work for backends and aux processes alike. It's a common pattern to call BackendPidGetProc() followed by AuxiliaryPidGetProc() currently. Even for the callers that specifically want to only check backend processes, I think it would be more natural to call PidGetPGProc(), and then check the process type.
>
> - Heikki
> <0001-Fix-pg_stat_get_backend_wait_event-for-aux-processes.patch>

Hi Heikki,

I reviewed and tested the patch, it works well, and the code change looks solid to me.

I only have one small comment. In the following case:
```
     if ((beentry = pgstat_get_beentry_by_proc_number(procNumber)) == NULL)
         wait_event_type = "<backend information not available>";
```

With this patch, aux processes are now supported as well. Do we want to update this message?

For example, in my test system max_connections = 100, so procNumber >= 100 corresponds to aux processes. If I run:
```
evantest=# select pg_stat_get_backend_wait_event(188);
   pg_stat_get_backend_wait_event
-------------------------------------
 <backend information not available>
(1 row)
```

Here 188 refers to an aux process, but the message still says “backend information”, which feels a bit misleading.
Would it make sense to change this to something like “process information not available”?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Kyotaro Horiguchi
Date:
At Tue, 3 Feb 2026 12:47:34 +0800, Chao Li <li.evan.chao@gmail.com> wrote in 
> I reviewed and tested the patch, it works well, and the code change
> looks solid to me.

It seems to have sufficient reliability.

> I only have one small comment. In the following case:
> ```
>      if ((beentry = pgstat_get_beentry_by_proc_number(procNumber)) == NULL)
>          wait_event_type = "<backend information not available>";
> ```
> 
> With this patch, aux processes are now supported as well. Do we want to
> update this message?
> 
> For example, in my test system max_connections = 100, so procNumber >=
> 100 corresponds to aux processes. If I run:
> ```
> evantest=# select pg_stat_get_backend_wait_event(188);
>    pg_stat_get_backend_wait_event
> -------------------------------------
>  <backend information not available>
> (1 row)
> ```
> 
> Here 188 refers to an aux process, but the message still says
> "backend information", which feels a bit misleading. Would it make
> sense to change this to something like "process information not
> available"?

pg_stat_get_backend_idset() is documented as returning backend IDs, and
even its name suggests that it deals specifically with backends, but
the returned set also includes aux processes. The glossary, however,
defines a backend as:

> Backend (process)
> Process of an instance which acts on behalf of a client session and
> handles its requests.

This makes the scope of the term "backend" a bit unclear in this
context.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Sami Imseih
Date:
Hi,

> pg_stat_get_backend_wait_event() and
> pg_stat_get_backend_wait_event_type() functions don't work for aux
> processes:

> We added aux processes to pg_stat_activity in commit fc70a4b0df, but
> apparently forgot to do the same for those functions.

Yes, this looks like an oversight. It has probably gone unreported all
this time because pg_stat_activity is the more popular choice
for retrieving this information.

> With the attached fix:
>
> > postgres=# SELECT pg_stat_get_backend_pid(backendid) AS pid,
> >        pg_stat_get_backend_wait_event_type(backendid) as wait_event_type,
> >        pg_stat_get_backend_wait_event(backendid) as wait_event
> > FROM pg_stat_get_backend_idset() AS backendid;
> >    pid   | wait_event_type |     wait_event
> > ---------+-----------------+---------------------
> >  3667552 |                 |
> >  3667545 | Activity        | AutovacuumMain
> >  3667546 | Activity        | LogicalLauncherMain
> >  3667538 | Activity        | IoWorkerMain
> >  3667539 | Activity        | IoWorkerMain
> >  3667540 | Activity        | IoWorkerMain
> >  3667541 | Activity        | CheckpointerMain
> >  3667542 | Activity        | BgwriterMain
> >  3667544 | Activity        | WalWriterMain
> > (9 rows)
>
> While looking at this, I noticed that pg_stat_activity has a
> "backend_type" field, but there's no corresponding
> "pg_stat_get_backend_type(backend_id)" function, similar to
> "pg_stat_get_backend_wait_event(backend_id)" et al. I wonder if that was
> on purpose, or we just forgot to add it when we added it to
> pg_stat_activity?

Looks like other fields from pg_stat_activity are missing corresponding
pg_stat_get_backend_ functions as well. i.e., query_id, client_hostname,
application_name, state_change, backend_xmin, backend_xmax. Not sure
why these were left out either.

It also should be noted that the information from pg_stat_get_backend_subxact
cannot be retrieved from pg_stat_activity.

> Another thing I didn't do in this patch yet: I feel we should replace
> BackendPidGetProc() with a function like "PGPROC *PidGetPGProc(pid_t)",
> that would work for backends and aux processes alike. It's a common
> pattern to call BackendPidGetProc() followed by AuxiliaryPidGetProc()
> currently. Even for the callers that specifically want to only check
> backend processes, I think it would be more natural to call
> PidGetPGProc(), and then check the process type.

+1 for such a function, and it could replace 6 different places ( if I counted
correctly ) in code where this pattern is used. At minimum, shouldn't
the fix for pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() follow the same pattern?

"
proc = BackendPidGetProc(pid);
if (proc == NULL)
        proc = AuxiliaryPidGetProc(pid);
"

--
Sami Imseih
Amazon Web Services (AWS)



Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Heikki Linnakangas
Date:
On 03/02/2026 14:46, Sami Imseih wrote:
>> Another thing I didn't do in this patch yet: I feel we should replace
>> BackendPidGetProc() with a function like "PGPROC *PidGetPGProc(pid_t)",
>> that would work for backends and aux processes alike. It's a common
>> pattern to call BackendPidGetProc() followed by AuxiliaryPidGetProc()
>> currently. Even for the callers that specifically want to only check
>> backend processes, I think it would be more natural to call
>> PidGetPGProc(), and then check the process type.
> 
> +1 for such a function, and it could replace 6 different places ( if I counted
> correctly ) in code where this pattern is used. At minimum, shouldn't
> the fix for pg_stat_get_backend_wait_event() and
> pg_stat_get_backend_wait_event_type() follow the same pattern?
> 
> "
> proc = BackendPidGetProc(pid);
> if (proc == NULL)
>          proc = AuxiliaryPidGetProc(pid);
> "

Yeah, that would be the most straightforward fix. But it feels silly to 
call BackendPidGetProc(pid), when we already have the ProcNumber at hand.

Come to think of it, why is wait_event_info stored in PGPROC in the 
first place, rather than in PgBackendStatus? All the other 
pg_stat_get_backend_*() functions just read the local PgBackendStatus copy.

That point was debated when the wait events were introduced [1] [2]. 
AFAICS the main motivation was that aux processes didn't have 
PgBackendStatus entries, and we wanted to expose wait events for aux 
processes too. That has changed since then, aux processes do have 
PgBackendStatus entries now, so that argument is moot.

Because wait_event_info is fetched from PGPROC, it's not part of the 
"activity snapshot". So when you run "select * from pg_stat_activity" 
repeatedly in the same transaction, the wait_events will change, even 
though the other fields are fetched once and frozen for the duration of 
the transaction. Tom pointed this out back then [1], but looks like that 
point was then forgotten, as we haven't documented that exception either.

There might be a performance argument too, although I haven't done any 
benchmarking and it's probably not really significant: PgBackendStatus 
is accessed less frequently by other backends than PGPROC, so you might 
get less cache line bouncing if wait_event_info is in PgBackendStatus 
instead of PGPROC.

So how about moving wait_event_info to PgBackendStatus, per attached?

[1] https://www.postgresql.org/message-id/4067.1439561494%40sss.pgh.pa.us

[2] 
https://www.postgresql.org/message-id/CA%2BTgmoZ-8ZpoUM9BGtBUP1u4dUQhC-9EpEDLzyK0dG4pKMDUwQ%40mail.gmail.com

- Heikki

Attachments

Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Bertrand Drouvot
Date:
Hi,

On Tue, Feb 03, 2026 at 10:29:27PM +0200, Heikki Linnakangas wrote:
> There might be a performance argument too,

yeah, not sure but with the patch in place the size of PGPROC goes from
832 bytes to 824 bytes. Is it worth adding extra padding so that it still remains
a multiple of 64?
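
For reference, the arithmetic behind the question can be sketched like this. The helper name is hypothetical, and the 64-byte line size is simply the figure used in this message, not a value taken from the PostgreSQL sources.

```c
#include <assert.h>
#include <stddef.h>

/* Cache line size assumed in this message. */
#define ASSUMED_CACHE_LINE 64

/*
 * Bytes of trailing padding needed to round 'size' up to the next multiple
 * of 'align'. Returns 0 when 'size' is already a multiple.
 */
size_t
padding_to_multiple(size_t size, size_t align)
{
    assert(align > 0);
    return (align - size % align) % align;
}
```

Here padding_to_multiple(832, ASSUMED_CACHE_LINE) is 0 and padding_to_multiple(824, ASSUMED_CACHE_LINE) is 8, so restoring the old size would take 8 bytes of padding. Whether that matters depends on how the array of structs is aligned and which fields are hot.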

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Heikki Linnakangas
Date:
On 04/02/2026 10:02, Bertrand Drouvot wrote:
> On Tue, Feb 03, 2026 at 10:29:27PM +0200, Heikki Linnakangas wrote:
>> There might be a performance argument too,
> 
> yeah, not sure but with the patch in place the size of PGPROC goes from
> 832 bytes to 824 bytes. Is it worth adding extra padding so that it still remains
> a multiple of 64?

Hmm, I don't think so. We've never given cacheline alignment any thought 
when we've changed the PGPROC fields in the past (or at least I 
haven't). Perhaps we should, but it would warrant a separate investigation.

Now that I look at that, the most frequently accessed fields are not at 
the beginning or end of the struct, so I don't think there's much harm 
in sharing cache lines. And the really hot GetSnapshotData() function 
uses the "mirrored" arrays anyway.

- Heikki




Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Sami Imseih
Date:
> Come to think of it, why is wait_event_info stored in PGPROC in the
> first place, rather than in PgBackendStatus? All the other
> pg_stat_get_backend_*() functions just read the local PgBackendStatus copy.
>
> That point was debated when the wait events were introduced [1] [2].
> AFAICS the main motivation was that aux processes didn't have
> PgBackendStatus entries, and we wanted to expose wait events for aux
> processes too. That has changed since then, aux processes do have
> PgBackendStatus entries now, so that argument is moot.
>
> Because wait_event_info is fetched from PGPROC, it's not part of the
> "activity snapshot". So when you run "select * from pg_stat_activity"
> repeatedly in the same transaction, the wait_events will change, even
> though the other fields are fetched once and frozen for the duration of
> the transaction. Tom pointed this out back then [1], but looks like that
> point was then forgotten, as we haven't documented that exception either.

Yeah, I agree with moving wait events to backend status.

There is also a discussion [0] about wait event/activity field inconsistency
with pg_stat_activity with a repro in [1]. This led to commit f056f75dafd00,
which added a note section in the docs to highlight this behavior. I don't think
your proposed patch makes the note added in that commit invalid, but
I can see it fixing the behavior identified in [1]. So I am +1 for this change.

Here are some comments on 0001.

1/ This comment should change from

     * Similarly, stop reporting wait events to MyProc->wait_event_info.

to

     * Similarly, stop reporting wait events to PgBackendStatus->st_wait_event_info.

2/

+
+       /*
+        * Proc's wait information.  This *not* protected by the changecount
+        * mechanism, because reading and writing an uint32 is assumed to atomic.
+        * This is updated very frequently, so we want to keep the overhead as
+        * small as possible.
+        */
+       uint32          st_wait_event_info;
+

Whether the changecount mechanism is used or bypassed is decided at the point
of access, not in the field declaration. Maybe we should say
"Proc's wait information. Since this is a uint32 and is assumed to be atomic, a
caller should not need to use the changecount mechanism to read/write."

What do you think?
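
To illustrate the distinction, here is a rough sketch of a changecount-style protocol next to a plain single-word access. It is only loosely modeled on the idea discussed here: StatusEntry and all four functions are hypothetical, C11 atomics stand in for whatever barriers the real code uses, and the unsynchronized copy in the reader is the usual seqlock-style compromise, not a strictly conforming C11 pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a status entry; not the real PgBackendStatus. */
typedef struct StatusEntry
{
    atomic_uint changecount;            /* odd while a write is in progress */
    char        activity[32];           /* multi-byte field: needs the protocol */
    _Atomic uint32_t wait_event_info;   /* single word: plain load/store */
} StatusEntry;

/* Writer: bump the counter before and after touching protected fields. */
void
write_activity(StatusEntry *e, const char *text)
{
    atomic_fetch_add(&e->changecount, 1);   /* becomes odd */
    strncpy(e->activity, text, sizeof(e->activity) - 1);
    e->activity[sizeof(e->activity) - 1] = '\0';
    atomic_fetch_add(&e->changecount, 1);   /* becomes even again */
}

/* Reader: copy, then retry if a writer was active or intervened. */
void
read_activity(StatusEntry *e, char *out, size_t outlen)
{
    size_t      n = outlen < sizeof(e->activity) ? outlen : sizeof(e->activity);

    assert(out != NULL && outlen > 0);
    for (;;)
    {
        unsigned    before = atomic_load(&e->changecount);

        memcpy(out, e->activity, n);
        out[n - 1] = '\0';

        if (before == atomic_load(&e->changecount) && (before & 1) == 0)
            return;             /* consistent copy obtained */
    }
}

/* The wait event is one 32-bit word, so no counter handshake is used. */
void
set_wait_event(StatusEntry *e, uint32_t info)
{
    atomic_store_explicit(&e->wait_event_info, info, memory_order_relaxed);
}

uint32_t
get_wait_event(StatusEntry *e)
{
    return atomic_load_explicit(&e->wait_event_info, memory_order_relaxed);
}
```

The point of the sketch is that set_wait_event() never touches changecount, which keeps the very frequent wait-event updates cheap, while multi-word fields still go through the retry loop.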

3/

+ * pgstat_get_backend_type_by_proc_number() -
+ *
+ *     Return the type of the backend with the specified ProcNumber.  This looks
+ *     directly at the BackendStatusArray, so the return value may be out of date.
+ *     The only current use of this function is in pg_signal_backend(), which is
+ *     inherently racy, so we don't worry too much about this.
+ *
+ *     It is the caller's responsibility to use this wisely; at minimum, callers
+ *     should ensure that procNumber is valid and perform the required permissions
+ *     checks.
+ * ----------
+ */
+BackendType
+pgstat_get_backend_type_by_proc_number(ProcNumber procNumber)

+extern BackendType pgstat_get_backend_type_by_proc_number(ProcNumber procNumber);


Maybe I am missing something, but I don't see
pgstat_get_backend_type_by_proc_number being used.


--
Sami Imseih
Amazon Web Services

[0] https://www.postgresql.org/message-id/20220708.113925.694736747577500484.horikyota.ntt%40gmail.com
[1] https://www.postgresql.org/message-id/flat/20220708.113925.694736747577500484.horikyota.ntt%40gmail.com#b85744cec6fb75c5b038f39d7802e602



Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Sami Imseih
Date:
> There is also a discussion [0] about wait event/activity field inconsistency
> with pg_stat_activity with a repro in [1].

The repro I was referring to in [1] is actually

I linked a different message earlier.

--
Sami



Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Sami Imseih
Date:

> 3/
>
> + * pgstat_get_backend_type_by_proc_number() -
> + *
> + *     Return the type of the backend with the specified ProcNumber.  This looks
> + *     directly at the BackendStatusArray, so the return value may be out of date.
> + *     The only current use of this function is in pg_signal_backend(), which is
> + *     inherently racy, so we don't worry too much about this.
> + *
> + *     It is the caller's responsibility to use this wisely; at minimum, callers
> + *     should ensure that procNumber is valid and perform the required permissions
> + *     checks.
> + * ----------
> + */
> +BackendType
> +pgstat_get_backend_type_by_proc_number(ProcNumber procNumber)
>
> +extern BackendType pgstat_get_backend_type_by_proc_number(ProcNumber procNumber);
>
> Maybe I am missing something, but I don't see
> pgstat_get_backend_type_by_proc_number being used.

Disregard this comment please. It looks like this was due to 084e42b after rebasing
0001 to test.

--
Sami Imseih


Re: Fix pg_stat_get_backend_wait_event() for aux processes

From
Rahila Syed
Date:
Hi,
 

> Another thing I didn't do in this patch yet: I feel we should replace
> BackendPidGetProc() with a function like "PGPROC *PidGetPGProc(pid_t)",
> that would work for backends and aux processes alike. It's a common
> pattern to call BackendPidGetProc() followed by AuxiliaryPidGetProc()
> currently. Even for the callers that specifically want to only check
> backend processes, I think it would be more natural to call
> PidGetPGProc(), and then check the process type.


+1 for the idea. Do you also intend to remove AuxiliaryPidGetProc() as
part of this change, given that all occurrences of it are coupled with
BackendPidGetProc()?

Thank you,
Rahila Syed