Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Дата
Msg-id CAAKRu_bwAmyQ=3SfxG3Jb8rRUBjoMwJtT99D6gG5fqqLmVpCQA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
Ответы Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
v18 attached.

On Thu, Dec 9, 2021 at 2:17 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-12-03 15:02:24 -0500, Melanie Plageman wrote:
> > From e0f7f3dfd60a68fa01f3c023bcdb69305ade3738 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Mon, 11 Oct 2021 16:15:06 -0400
> > Subject: [PATCH v17 1/7] Read-only atomic backend write function
> >
> > For counters in shared memory which can be read by any backend but only
> > written to by one backend, an atomic is still needed to protect against
> > torn values, however, pg_atomic_fetch_add_u64() is overkill for
> > incrementing the counter. pg_atomic_inc_counter() is a helper function
> > which can be used to increment these values safely but without
> > unnecessary overhead.
> >
> > Author: Thomas Munro
> > ---
> >  src/include/port/atomics.h | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
> > index 856338f161..39ffff24dd 100644
> > --- a/src/include/port/atomics.h
> > +++ b/src/include/port/atomics.h
> > @@ -519,6 +519,17 @@ pg_atomic_sub_fetch_u64(volatile pg_atomic_uint64 *ptr, int64 sub_)
> >       return pg_atomic_sub_fetch_u64_impl(ptr, sub_);
> >  }
> >
> > +/*
> > + * On modern systems this is really just *counter++.  On some older systems
> > + * there might be more to it, due to inability to read and write 64 bit values
> > + * atomically.
> > + */
> > +static inline void
> > +pg_atomic_inc_counter(pg_atomic_uint64 *counter)
> > +{
> > +     pg_atomic_write_u64(counter, pg_atomic_read_u64(counter) + 1);
> > +}
>
> I wonder if it's worth putting something in the name indicating that this is
> not actual atomic RMW operation. Perhaps adding _unlocked?
>

Done.

>
> > From b0e193cfa08f0b8cf1be929f26fe38f06a39aeae Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Wed, 24 Nov 2021 10:32:56 -0500
> > Subject: [PATCH v17 2/7] Add IO operation counters to PgBackendStatus
> >
> > Add an array of counters in PgBackendStatus which count the buffers
> > allocated, extended, fsynced, and written by a given backend. Each "IO
> > Op" (alloc, fsync, extend, write) is counted per "IO Path" (direct,
> > local, shared, or strategy). "local" and "shared" IO Path counters count
> > operations on local and shared buffers. The "strategy" IO Path counts
> > buffers alloc'd/written/read/fsync'd as part of a BufferAccessStrategy.
> > The "direct" IO Path counts blocks of IO which are read, written, or
> > fsync'd using smgrwrite/extend/immedsync directly (as opposed to through
> > [Local]BufferAlloc()).
> >
> > With this commit, all backends increment a counter in their
> > PgBackendStatus when performing an IO operation. This is in preparation
> > for future commits which will persist these stats upon backend exit and
> > use the counters to provide observability of database IO operations.
> >
> > Note that this commit does not add code to increment the "direct" path.
> > A separate proposed patch [1] which would add wrappers for smgrwrite(),
> > smgrextend(), and smgrimmedsync() would provide a good location to call
> > pgstat_inc_ioop() for unbuffered IO and avoid regressions for future
> > users of these functions.
> >
> > [1] https://www.postgresql.org/message-id/CAAKRu_aw72w70X1P%3Dba20K8iGUvSkyz7Yk03wPPh3f9WgmcJ3g%40mail.gmail.com
>
> On longer thread it's nice for committers to already have Reviewed-By: in the
> commit message.

Done.

> > diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
> > index 7229598822..413cc605f8 100644
> > --- a/src/backend/utils/activity/backend_status.c
> > +++ b/src/backend/utils/activity/backend_status.c
> > @@ -399,6 +399,15 @@ pgstat_bestart(void)
> >       lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID;
> >       lbeentry.st_progress_command_target = InvalidOid;
> >       lbeentry.st_query_id = UINT64CONST(0);
> > +     for (int io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)
> > +     {
> > +             IOOps      *io_ops = &lbeentry.io_path_stats[io_path];
> > +
> > +             pg_atomic_init_u64(&io_ops->allocs, 0);
> > +             pg_atomic_init_u64(&io_ops->extends, 0);
> > +             pg_atomic_init_u64(&io_ops->fsyncs, 0);
> > +             pg_atomic_init_u64(&io_ops->writes, 0);
> > +     }
> >
> >       /*
> >        * we don't zero st_progress_param here to save cycles; nobody should
>
> nit: I think we nearly always have a blank line before loops

Done.

> > diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
> > index 646126edee..93f1b4bcfc 100644
> > --- a/src/backend/utils/init/postinit.c
> > +++ b/src/backend/utils/init/postinit.c
> > @@ -623,6 +623,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
> >               RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
> >       }
> >
> > +     pgstat_beinit();
> >       /*
> >        * Initialize local process's access to XLOG.
> >        */
>
> nit: same with multi-line comments.

Done.

> > @@ -649,6 +650,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
> >                */
> >               CreateAuxProcessResourceOwner();
> >
> > +             pgstat_bestart();
> >               StartupXLOG();
> >               /* Release (and warn about) any buffer pins leaked in StartupXLOG */
> >               ReleaseAuxProcessResources(true);
> > @@ -676,7 +678,6 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
> >       EnablePortalManager();
> >
> >       /* Initialize status reporting */
> > -     pgstat_beinit();
>
> I'd like to see changes like moving this kind of thing around broken around
> and committed separately. It's much easier to pinpoint breakage if the CF
> breaks after moving just pgstat_beinit() around, rather than when committing
> this considerably larger patch. And reordering subsystem initialization has
> the habit of causing problems...

Done

> > +/* ----------
> > + * IO Stats reporting utility types
> > + * ----------
> > + */
> > +
> > +typedef enum IOOp
> > +{
> > +     IOOP_ALLOC,
> > +     IOOP_EXTEND,
> > +     IOOP_FSYNC,
> > +     IOOP_WRITE,
> > +} IOOp;
> > [...]
> > +/*
> > + * Structure for counting all types of IOOps for a live backend.
> > + */
> > +typedef struct IOOps
> > +{
> > +     pg_atomic_uint64 allocs;
> > +     pg_atomic_uint64 extends;
> > +     pg_atomic_uint64 fsyncs;
> > +     pg_atomic_uint64 writes;
> > +} IOOps;
>
> To me IOop and IOOps sound to much alike - even though they're really kind of
> separate things. s/IOOps/IOOpCounters/ maybe?

Done.

> > @@ -3152,6 +3156,14 @@ pgstat_shutdown_hook(int code, Datum arg)
> >  {
> >       Assert(!pgstat_is_shutdown);
> >
> > +     /*
> > +      * Only need to send stats on IO Ops for IO Paths when a process exits.
> > +      * Users requiring IO Ops for both live and exited backends can read from
> > +      * live backends' PgBackendStatus and sum this with totals from exited
> > +      * backends persisted by the stats collector.
> > +      */
> > +     pgstat_send_buffers();
>
> Perhaps something like this comment belongs somewhere at the top of the file,
> or in the header, or ...? It's a fairly central design piece, and it's not
> obvious one would need to look in the shutdown hook for it?
>

now in pgstat.h above the declaration of pgstat_send_buffers()

> > +/*
> > + * Before exiting, a backend sends its IO op statistics to the collector so
> > + * that they may be persisted.
> > + */
> > +void
> > +pgstat_send_buffers(void)
> > +{
> > +     PgStat_MsgIOPathOps msg;
> > +
> > +     PgBackendStatus *beentry = MyBEEntry;
> > +
> > +     /*
> > +      * Though some backends with type B_INVALID (such as the single-user mode
> > +      * process) do initialize and increment IO operations stats, there is no
> > +      * spot in the array of IO operations for backends of type B_INVALID. As
> > +      * such, do not send these to the stats collector.
> > +      */
> > +     if (!beentry || beentry->st_backendType == B_INVALID)
> > +             return;
>
> Why does single user mode use B_INVALID? That doesn't seem quite right.

I think PgBackendStatus->st_backendType is set from MyBackendType which
isn't set for the single user mode process. What BackendType would you
expect to see?

> > +     memset(&msg, 0, sizeof(msg));
> > +     msg.backend_type = beentry->st_backendType;
> > +
> > +     pgstat_sum_io_path_ops(msg.iop.io_path_ops,
> > +                                                (IOOps *) &beentry->io_path_stats);
> > +
> > +     pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_IO_PATH_OPS);
> > +     pgstat_send(&msg, sizeof(msg));
> > +}
>
> It seems worth having a path skipping sending the message if there was no IO?

Makes sense. I've updated pgstat_send_buffers() to do a loop after calling
pgstat_sum_io_path_ops() and check if it should skip sending.

I also thought about having pgstat_sum_io_path_ops() return a value to
indicate if everything was 0 -- which could be useful to future callers
potentially?

I didn't do this because I am not sure what the return value would be.
It could be a bool and be true if any IO was done and false if none was
done -- but that doesn't really make sense given the function's name it
would be called like
if (!pgstat_sum_io_path_ops())
  return
which I'm not sure is very clear

> > +/*
> > + * Helper function to sum all live IO Op stats for all IO Paths (e.g. shared,
> > + * local) to those in the equivalent stats structure for exited backends. Note
> > + * that this adds and doesn't set, so the destination stats structure should be
> > + * zeroed out by the caller initially. This would commonly be used to transfer
> > + * all IO Op stats for all IO Paths for a particular backend type to the
> > + * pgstats structure.
> > + */
> > +void
> > +pgstat_sum_io_path_ops(PgStatIOOps *dest, IOOps *src)
> > +{
> > +     for (int io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)
> > +     {
>
> Sacriligeous, but I find io_path a harder to understand variable name for the
> counter than i (or io_path_off or ...) ;)

I've updated almost all my non-standard loop index variable names.

> > +static void
> > +pgstat_recv_io_path_ops(PgStat_MsgIOPathOps *msg, int len)
> > +{
> > +     PgStatIOOps *src_io_path_ops;
> > +     PgStatIOOps *dest_io_path_ops;
> > +
> > +     /*
> > +      * Subtract 1 from message's BackendType to get a valid index into the
> > +      * array of IO Ops which does not include an entry for B_INVALID
> > +      * BackendType.
> > +      */
> > +     Assert(msg->backend_type > B_INVALID);
>
> Probably worth also asserting the upper boundary?

Done.

> > From f972ea87270feaed464a74fb6541ac04b4fc7d98 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Wed, 24 Nov 2021 11:39:48 -0500
> > Subject: [PATCH v17 4/7] Add "buffers" to pgstat_reset_shared_counters
> >
> > Backends count IO operations for various IO paths in their PgBackendStatus.
> > Upon exit, they send these counts to the stats collector. Prior to this commit,
> > these IO Ops stats would have been reset when the target was "bgwriter".
> >
> > With this commit, target "bgwriter" no longer will cause the IO operations
> > stats to be reset, and the IO operations stats can be reset with new target,
> > "buffers".
> > ---
> >  doc/src/sgml/monitoring.sgml                |  2 +-
> >  src/backend/postmaster/pgstat.c             | 83 +++++++++++++++++++--
> >  src/backend/utils/activity/backend_status.c | 29 +++++++
> >  src/include/pgstat.h                        |  8 +-
> >  src/include/utils/backend_status.h          |  2 +
> >  5 files changed, 117 insertions(+), 7 deletions(-)
> >
> > diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> > index 62f2a3332b..bda3eef309 100644
> > --- a/doc/src/sgml/monitoring.sgml
> > +++ b/doc/src/sgml/monitoring.sgml
> > @@ -3604,7 +3604,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
> >         <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
> >        </para>
> >        <para>
> > -       Time at which these statistics were last reset
> > +       Time at which these statistics were last reset.
> >        </para></entry>
> >       </row>
> >      </tbody>
>
> Hm?
>
> Shouldn't this new reset target be documented?

It is in the commit adding the view. I didn't include it in this commit
because the pg_stat_buffers view doesn't exist yet, as of this commit,
and I thought it would be odd to mention it in the docs (in this
commit).
As an aside, I shouldn't have left this correction in this commit. I
moved it now to the other one.

> > +/*
> > + * Helper function to collect and send live backends' current IO operations
> > + * stats counters when a stats reset is initiated so that they may be deducted
> > + * from future totals.
> > + */
> > +static void
> > +pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter *msg)
> > +{
> > +     PgStatIOPathOps ops[BACKEND_NUM_TYPES];
> > +
> > +     memset(ops, 0, sizeof(ops));
> > +     pgstat_report_live_backend_io_path_ops(ops);
> > +
> > +     /*
> > +      * Iterate through the array of IO Ops for all IO Paths for each
> > +      * BackendType. Because the array does not include a spot for BackendType
> > +      * B_INVALID, add 1 to the index when setting backend_type so that there is
> > +      * no confusion as to the BackendType with which this reset message
> > +      * corresponds.
> > +      */
> > +     for (int backend_type_idx = 0; backend_type_idx < BACKEND_NUM_TYPES; backend_type_idx++)
> > +     {
> > +             msg->m_backend_resets.backend_type = backend_type_idx + 1;
> > +             memcpy(&msg->m_backend_resets.iop, &ops[backend_type_idx],
> > +                             sizeof(msg->m_backend_resets.iop));
> > +             pgstat_send(msg, sizeof(PgStat_MsgResetsharedcounter));
> > +     }
> > +}
>
> Probably worth explaining why multiple messages are sent?

Done.

> > @@ -5583,10 +5621,45 @@ pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
> >  {
> >       if (msg->m_resettarget == RESET_BGWRITER)
> >       {
> > -             /* Reset the global, bgwriter and checkpointer statistics for the cluster. */
> > -             memset(&globalStats, 0, sizeof(globalStats));
> > +             /*
> > +              * Reset the global bgwriter and checkpointer statistics for the
> > +              * cluster.
> > +              */
> > +             memset(&globalStats.checkpointer, 0, sizeof(globalStats.checkpointer));
> > +             memset(&globalStats.bgwriter, 0, sizeof(globalStats.bgwriter));
> >               globalStats.bgwriter.stat_reset_timestamp = GetCurrentTimestamp();
> >       }
>
> Oh, is this a live bug?

I don't think it is a bug. globalStats only contained bgwriter and
checkpointer stats and those were all only displayed in
pg_stat_bgwriter(), so memsetting the whole thing seems fine.

> > +             /*
> > +              * Subtract 1 from the BackendType to arrive at a valid index in the
> > +              * array, as it does not contain a spot for B_INVALID BackendType.
> > +              */
>
> Instead of repeating a comment about +- 1 in a bunch of places, would it look
> better to have two helper inline functions for this purpose?

Done.

> > +/*
> > +* When adding a new column to the pg_stat_buffers view, add a new enum
> > +* value here above COLUMN_LENGTH.
> > +*/
> > +enum
> > +{
> > +     COLUMN_BACKEND_TYPE,
> > +     COLUMN_IO_PATH,
> > +     COLUMN_ALLOCS,
> > +     COLUMN_EXTENDS,
> > +     COLUMN_FSYNCS,
> > +     COLUMN_WRITES,
> > +     COLUMN_RESET_TIME,
> > +     COLUMN_LENGTH,
> > +};
>
> COLUMN_LENGTH seems like a fairly generic name...

Changed.

> > From 9f22da9041e1e1fbc0ef003f5f78f4e72274d438 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <melanieplageman@gmail.com>
> > Date: Wed, 24 Nov 2021 12:20:10 -0500
> > Subject: [PATCH v17 6/7] Remove superfluous bgwriter stats
> >
> > Remove stats from pg_stat_bgwriter which are now more clearly expressed
> > in pg_stat_buffers.
> >
> > TODO:
> > - make pg_stat_checkpointer view and move relevant stats into it
> > - add additional stats to pg_stat_bgwriter
>
> When do you think it makes sense to tackle these wrt committing some of the
> patches?

Well, the new stats are a superset of the old stats (no stats have been
removed that are not represented in the new or old views). So, I don't
see that as a blocker for committing these patches.

Since it is weird that pg_stat_bgwriter had mostly checkpointer stats,
I've edited this commit to rename that view to pg_stat_checkpointer.

I have not made a separate view just for maxwritten_clean (presumably
called pg_stat_bgwriter), but I would not be opposed to doing this if
you thought having a view with a single column isn't a problem (in the
event that we don't get around to adding more bgwriter stats right
away).

I noticed after changing the docs on the "bgwriter" target for
pg_stat_reset_shared to say "checkpointer", that it still said "bgwriter" in
  src/backend/po/ko.po
  src/backend/po/it.po
  ...
I presume these are automatically updated with some incantation, but I wasn't
sure what it was nor could I find documentation on this.

> > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> > index 6926fc5742..67447f997a 100644
> > --- a/src/backend/storage/buffer/bufmgr.c
> > +++ b/src/backend/storage/buffer/bufmgr.c
> > @@ -2164,7 +2164,6 @@ BufferSync(int flags)
> >                       if (SyncOneBuffer(buf_id, false, &wb_context) & BUF_WRITTEN)
> >                       {
> >                               TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
> > -                             PendingCheckpointerStats.m_buf_written_checkpoints++;
> >                               num_written++;
> >                       }
> >               }
> > @@ -2273,9 +2272,6 @@ BgBufferSync(WritebackContext *wb_context)
> >        */
> >       strategy_buf_id = StrategySyncStart(&strategy_passes, &recent_alloc);
> >
> > -     /* Report buffer alloc counts to pgstat */
> > -     PendingBgWriterStats.m_buf_alloc += recent_alloc;
> > -
> >       /*
> >        * If we're not running the LRU scan, just stop after doing the stats
> >        * stuff.  We mark the saved state invalid so that we can recover sanely
> > @@ -2472,8 +2468,6 @@ BgBufferSync(WritebackContext *wb_context)
> >                       reusable_buffers++;
> >       }
> >
> > -     PendingBgWriterStats.m_buf_written_clean += num_written;
> > -
>
> Isn't num_written unused now, unless tracepoints are enabled? I'd expect some
> compilers to warn... Perhaps we should just remove information from the
> tracepoint?

The local variable num_written is used in BgBufferSync() to determine
whether or not to increment maxwritten_clean which is still represented
in the view pg_stat_checkpointer (formerly pg_stat_bgwriter).

A local variable num_written is used in BufferSync() to increment
CheckpointStats.ckpt_bufs_written which is logged in LogCheckpointEnd(),
so I'm not sure that can be removed.

- Melanie

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: SQL/JSON: functions
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: SQL/JSON: functions