Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
От | Melanie Plageman |
---|---|
Тема | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
Дата | |
Msg-id | CAAKRu_YJMHJ66-ZB86zFCH=Cq4+5w2gnJ14bu8oyZec0C39Dcg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
(Andres Freund <andres@anarazel.de>)
|
Список | pgsql-hackers |
On Fri, Oct 8, 2021 at 1:56 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2021-10-01 16:05:31 -0400, Melanie Plageman wrote: > > From 40c809ad1127322f3462e85be080c10534485f0d Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Fri, 24 Sep 2021 17:39:12 -0400 > > Subject: [PATCH v13 1/4] Allow bootstrap process to beinit > > > > --- > > src/backend/utils/init/postinit.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c > > index 78bc64671e..fba5864172 100644 > > --- a/src/backend/utils/init/postinit.c > > +++ b/src/backend/utils/init/postinit.c > > @@ -670,8 +670,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, > > EnablePortalManager(); > > > > /* Initialize status reporting */ > > - if (!bootstrap) > > - pgstat_beinit(); > > + pgstat_beinit(); > > > > /* > > * Load relcache entries for the shared system catalogs. This must create > > -- > > 2.27.0 > > > > I think it's good to remove more and more of these !bootstrap cases - they > really make it harder to understand the state of the system at various > points. Optimizing for the rarely executed bootstrap mode at the cost of > checks in very common codepaths... What scope do you suggest for this patch set? A single patch which does this in more locations (remove !bootstrap) or should I remove this patch from the patchset? > > > > > From a709ddb30b2b747beb214f0b13cd1e1816094e6b Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Thu, 30 Sep 2021 16:16:22 -0400 > > Subject: [PATCH v13 2/4] Add utility to make tuplestores for pg stat views > > > > Most of the steps to make a tuplestore for those pg_stat views requiring > > one are the same. Consolidate them into a single helper function for > > clarity and to avoid bugs. > > --- > > src/backend/utils/adt/pgstatfuncs.c | 129 ++++++++++------------------ > > 1 file changed, 44 insertions(+), 85 deletions(-) > > > > diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c > > index ff5aedc99c..513f5aecf6 100644 > > --- a/src/backend/utils/adt/pgstatfuncs.c > > +++ b/src/backend/utils/adt/pgstatfuncs.c > > @@ -36,6 +36,42 @@ > > > > #define HAS_PGSTAT_PERMISSIONS(role) (is_member_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(),role)) > > > > +/* > > + * Helper function for views with multiple rows constructed from a tuplestore > > + */ > > +static Tuplestorestate * > > +pg_stat_make_tuplestore(FunctionCallInfo fcinfo, TupleDesc *tupdesc) > > +{ > > + Tuplestorestate *tupstore; > > + ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; > > + MemoryContext per_query_ctx; > > + MemoryContext oldcontext; > > + > > + /* check to see if caller supports us returning a tuplestore */ > > + if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo)) > > + ereport(ERROR, > > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > + errmsg("set-valued function called in context that cannot accept a set"))); > > + if (!(rsinfo->allowedModes & SFRM_Materialize)) > > + ereport(ERROR, > > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > + errmsg("materialize mode required, but it is not allowed in this context"))); > > + > > + /* Build a tuple descriptor for our result type */ > > + if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE) > > + elog(ERROR, "return type must be a row type"); > > + > > + per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; > > + oldcontext = MemoryContextSwitchTo(per_query_ctx); > > + > > + tupstore = tuplestore_begin_heap(true, false, work_mem); > > + rsinfo->returnMode = SFRM_Materialize; > > + rsinfo->setResult = tupstore; > > + rsinfo->setDesc = *tupdesc; > > + MemoryContextSwitchTo(oldcontext); > > + return tupstore; > > +} > > Is pgstattuple the best place for this helper? It's not really pgstatfuncs > specific... > > It also looks vaguely familiar - I wonder if we have a helper roughly like > this somewhere else already... > I don't see a function which is specifically a utility to make a tuplestore. Looking at the callers of tuplestore_begin_heap(), I notice very similar code to the function I added in pg_tablespace_databases() in utils/adt/misc.c, pg_stop_backup_v2() in xlogfuncs.c, pg_event_trigger_dropped_objects() and pg_event_trigger_ddl_commands in event_tigger.c, pg_available_extensions in extension.c, etc. Do you think it makes sense to refactor this code out of all of these places? If so, where would such a utility function belong? > > > > From e9a5d2a021d429fdbb2daa58ab9d75a069f334d4 Mon Sep 17 00:00:00 2001 > > From: Melanie Plageman <melanieplageman@gmail.com> > > Date: Wed, 29 Sep 2021 15:39:45 -0400 > > Subject: [PATCH v13 3/4] Add system view tracking IO ops per backend type > > > > > diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c > > index be7366379d..0d18e7f71a 100644 > > --- a/src/backend/postmaster/checkpointer.c > > +++ b/src/backend/postmaster/checkpointer.c > > @@ -1104,6 +1104,7 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type) > > */ > > if (!AmBackgroundWriterProcess()) > > CheckpointerShmem->num_backend_fsync++; > > + pgstat_inc_ioop(IOOP_FSYNC, IOPATH_SHARED); > > LWLockRelease(CheckpointerCommLock); > > return false; > > } > > ISTM this doens't need to happen while holding CheckpointerCommLock? > Fixed in attached updates. I only attached the diff from my previous version. > > > > @@ -1461,7 +1467,25 @@ pgstat_reset_shared_counters(const char *target) > > errhint("Target must be \"archiver\", \"bgwriter\", or \"wal\"."))); > > > > pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSHAREDCOUNTER); > > - pgstat_send(&msg, sizeof(msg)); > > + > > + if (msg.m_resettarget == RESET_BUFFERS) > > + { > > + int backend_type; > > + PgStatIOPathOps ops[BACKEND_NUM_TYPES]; > > + > > + memset(ops, 0, sizeof(ops)); > > + pgstat_report_live_backend_io_path_ops(ops); > > + > > + for (backend_type = 1; backend_type < BACKEND_NUM_TYPES; backend_type++) > > + { > > + msg.m_backend_resets.backend_type = backend_type; > > + memcpy(&msg.m_backend_resets.iop, &ops[backend_type], sizeof(msg.m_backend_resets.iop)); > > + pgstat_send(&msg, sizeof(msg)); > > + } > > + } > > + else > > + pgstat_send(&msg, sizeof(msg)); > > + > > } > > I'd perhaps put this in a small helper function. > Done. > > > /* ---------- > > * pgstat_fetch_stat_dbentry() - > > @@ -2999,6 +3036,14 @@ pgstat_shutdown_hook(int code, Datum arg) > > { > > Assert(!pgstat_is_shutdown); > > > > + /* > > + * Only need to send stats on IO Ops for IO Paths when a process exits, as > > + * pg_stat_get_buffers() will read from live backends' PgBackendStatus and > > + * then sum this with totals from exited backends persisted by the stats > > + * collector. > > + */ > > + pgstat_send_buffers(); > > + > > /* > > * If we got as far as discovering our own database ID, we can report what > > * we did to the collector. Otherwise, we'd be sending an invalid > > @@ -3092,6 +3137,30 @@ pgstat_send(void *msg, int len) > > #endif > > } > > I think it might be nicer to move pgstat_beshutdown_hook() to be a > before_shmem_exit(), and do this in there. > Not really sure the correct way to do this. A cursory attempt to do so failed because ShutdownXLOG() is also registered as a before_shmem_exit() and ends up being called after pgstat_beshutdown_hook(). pgstat_beshutdown_hook() zeroes out PgBackendStatus, ShutdownXLOG() initiates a checkpoint, and during a checkpoint, the checkpointer increments IO op counter for writes in its PgBackendStatus. > > > +/* > > + * Add live IO Op stats for all IO Paths (e.g. shared, local) to those in the > > + * equivalent stats structure for exited backends. Note that this adds and > > + * doesn't set, so the destination stats structure should be zeroed out by the > > + * caller initially. This would commonly be used to transfer all IO Op stats > > + * for all IO Paths for a particular backend type to the pgstats structure. > > + */ > > This seems a bit odd. Why not zero it in here? Perhaps it also should be > called something like _sum_ instead of _add_? > I wanted to be able to use the function both when it was setting the values and when it needed to add to the values (which are the two current callers). I have changed the name from add -> sum. > > > +void > > +pgstat_add_io_path_ops(PgStatIOOps *dest, IOOps *src, int io_path_num_types) > > +{ > > Why is io_path_num_types a parameter? > I imagined that maybe another caller would want to only add some IO path types and still use the function, but I think it is more confusing than anything else so I've changed it. > > > +static void > > +pgstat_recv_io_path_ops(PgStat_MsgIOPathOps *msg, int len) > > +{ > > + int io_path; > > + PgStatIOOps *src_io_path_ops = msg->iop.io_path_ops; > > + PgStatIOOps *dest_io_path_ops = > > + globalStats.buffers.ops[msg->backend_type].io_path_ops; > > + > > + for (io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++) > > + { > > + PgStatIOOps *src = &src_io_path_ops[io_path]; > > + PgStatIOOps *dest = &dest_io_path_ops[io_path]; > > + > > + dest->allocs += src->allocs; > > + dest->extends += src->extends; > > + dest->fsyncs += src->fsyncs; > > + dest->writes += src->writes; > > + } > > +} > > Could this, with a bit of finessing, use pgstat_add_io_path_ops()? > I didn't really see a good way to do this -- given that pgstat_add_io_path_ops() adds IOOps members to PgStatIOOps members -- which requires a pg_atomic_read_u64() and pgstat_recv_io_path_ops adds PgStatIOOps to PgStatIOOps which doesn't require pg_atomic_read_u64(). Maybe I could pass a flag which, based on the type, either does or doesn't use pg_atomic_read_u64 to access the value? But that seems worse to me. > > > --- a/src/backend/storage/buffer/bufmgr.c > > +++ b/src/backend/storage/buffer/bufmgr.c > > What about writes originating in like FlushRelationBuffers()? > Yes, I have made IOPath a parameter to FlushBuffer() so that it can distinguish between strategy buffer writes and shared buffer writes and then pushed pgstat_inc_ioop() into FlushBuffer(). > > > bool > > -StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf) > > +StrategyRejectBuffer(BufferAccessStrategy strategy, BufferDesc *buf, bool *from_ring) > > { > > + /* > > + * If we decide to use the dirty buffer selected by StrategyGetBuffer(), > > + * then ensure that we count it as such in pg_stat_buffers view. > > + */ > > + *from_ring = true; > > + > > Absolutely minor nitpick: Somehow it feelsoff to talk about the view here. Fixed. > > > > +PgBackendStatus * > > +pgstat_fetch_backend_statuses(void) > > +{ > > + return BackendStatusArray; > > +} > > Hm, not sure this adds much? Is there a better way to access the whole BackendStatusArray from within pgstatfuncs.c? > > > > + /* > > + * Subtract 1 from backend_type to avoid having rows for B_INVALID > > + * BackendType > > + */ > > + int rownum = (beentry->st_backendType - 1) * IOPATH_NUM_TYPES + io_path; > > > Perhaps worth wrapping this in a macro or inline function? It's repeated and nontrivial. > Done. > > > + /* Add stats from all exited backends */ > > + backend_io_path_ops = pgstat_fetch_exited_backend_buffers(); > > It's probably *not* worth it, but I do wonder if we should do the addition on the SQL > level, and actually have two functions, one returning data for exited > backends, and one for currently connected ones. > It would be easy enough to implement. I would defer to others on whether or not this would be useful. My use case for pg_stat_buffers() is to see what backends' IO during a benchmark or test workload. For that, I reset the stats before and then query pg_stat_buffers after running the benchmark. I don't know if I would use exited and live stats individually. In a real workload, I could see using pg_stat_buffers live and exited to see if the workload causing lots of backends to do their own writes is ongoing. Though a given workload may be composed of lots of different queries, with backends exiting throughout. > > > +static inline void > > +pgstat_inc_ioop(IOOp io_op, IOPath io_path) > > +{ > > + IOOps *io_ops; > > + PgBackendStatus *beentry = MyBEEntry; > > + > > + Assert(beentry); > > + > > + io_ops = &beentry->io_path_stats[io_path]; > > + switch (io_op) > > + { > > + case IOOP_ALLOC: > > + pg_atomic_write_u64(&io_ops->allocs, > > + pg_atomic_read_u64(&io_ops->allocs) + 1); > > + break; > > + case IOOP_EXTEND: > > + pg_atomic_write_u64(&io_ops->extends, > > + pg_atomic_read_u64(&io_ops->extends) + 1); > > + break; > > + case IOOP_FSYNC: > > + pg_atomic_write_u64(&io_ops->fsyncs, > > + pg_atomic_read_u64(&io_ops->fsyncs) + 1); > > + break; > > + case IOOP_WRITE: > > + pg_atomic_write_u64(&io_ops->writes, > > + pg_atomic_read_u64(&io_ops->writes) + 1); > > + break; > > + } > > +} > > IIRC Thomas Munro had a patch adding a nonatomic_add or such > somewhere. Perhaps in the recovery readahead thread? Might be worth using > instead? > I've added Thomas' function in a separate commit. I looked for a better place to add it (I was thinking somewhere in src/backend/utils/misc) but couldn't find anywhere that made sense. I also added a call to pgstat_inc_ioop() in ProcessSyncRequests() to capture when the checkpointer does fsyncs. I also added pgstat_inc_ioop() calls to callers of smgrwrite() flushing local buffers. I don't know if that is desirable or not in this patch. They could be removed if wrappers for smgrwrite() go in and pgstat_inc_ioop() can be called from within those wrappers. - Melanie
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Mark DilgerДата:
Сообщение: Re: BUG #17212: pg_amcheck fails on checking temporary relations
Следующее
От: Tom LaneДата:
Сообщение: Re: pgsql: Adjust configure to insist on Perl version >= 5.8.3.