Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Дата
Msg-id CAAKRu_ZFUkRKP08-L0413+g3XtYCjqJNPT2A7q0eT_otz_ei=g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-hackers
Thanks for the review!

On Tue, Jul 12, 2022 at 4:06 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Mon, 11 Jul 2022 22:22:28 -0400, Melanie Plageman <melanieplageman@gmail.com> wrote in
> Hi,
>
> In the attached patch set, I've added in missing IO operations for
> certain IO Paths as well as enumerating in the commit message which IO
> Paths and IO Operations are not currently counted and or not possible.
>
> There is a TODO in HandleWalWriterInterrupts() about removing
> pgstat_report_wal() since it is immediately before a proc_exit()

Right. walwriter does that without needing the explicit call.

I have deleted it.
 

> I was wondering if LocalBufferAlloc() should increment the counter or if
> I should wait until GetLocalBufferStorage() to increment the counter.

Depends on what "allocate" means. Different from shared buffers, local
buffers are taken from OS then allocated to page.  OS-allcoated pages
are restricted by num_temp_buffers so I think what we're interested in
is the count incremented by LocalBuferAlloc(). (And it is the parallel
of alloc for shared-buffers)

I've left it in LocalBufferAlloc().
 

> I also realized that I am not differentiating between IOPATH_SHARED and
> IOPATH_STRATEGY for IOOP_FSYNC. But, given that we don't know what type
> of buffer we are fsync'ing by the time we call register_dirty_segment(),
> I'm not sure how we would fix this.

I think there scarcely happens flush for strategy-loaded buffers.  If
that is sensible, IOOP_FSYNC would not make much sense for
IOPATH_STRATEGY.

Why would it be less likely for a backend to do its own fsync when
flushing a dirty strategy buffer than a regular dirty shared buffer?
 

> > IMO it'd be a good idea to add pgstat_report_io_ops() to
> > pgstat_report_vacuum()/analyze(), so that the stats for a longrunning
> > autovac worker get updated more regularly.
> >
>
> noted and fixed.

> > How about moving the pgstat_report_io_ops() into
> > pgstat_report_bgwriter(), pgstat_report_autovacuum() etc? Seems
> > unnecessary to have multiple pgstat_* calls in these places.
> >
> >
> >
> noted and fixed.

+        * Also report IO Operations statistics

I think that the function comment also should mention this.

I've added comments at the top of all these functions.
 

> > Wonder if it's worth making the lock specific to the backend type?
> >
>
> I've added another Lock into PgStat_IOPathOps so that each BackendType
> can be locked separately. But, I've also kept the lock in
> PgStatShared_BackendIOPathOps so that reset_all and snapshot could be
> done easily.

Looks fine about the lock separation.

Actually, I think it is not safe to use both of these locks. So for
picking one method, it is probably better to go with the locks in
PgStat_IOPathOps, it will be more efficient for flush (and not for
fetching and resetting), so that is probably the way to go, right?
 
By the way, in the following line:

+               &pgStatLocal.shmem->io_ops.stats[backend_type_get_idx(MyBackendType)];

backend_type_get_idx(x) is actually (x - 1) plus assertion on the
value range. And the only use-case is here. There's an reverse
function and also used only at one place.

+               Datum           backend_type_desc =
+                       CStringGetTextDatum(GetBackendTypeDesc(idx_get_backend_type(i)));

In this usage GetBackendTypeDesc() gracefully treats out-of-domain
values but idx_get_backend_type keenly kills the process for the
same. This is inconsistent.

My humbel opinion on this is we don't define the two functions and
replace the calls to them with (x +/- 1).  Addition to that, I think
we should not abort() by invalid backend types.  In that sense, I
wonder if we could use B_INVALIDth element for this purpose.

I think that GetBackendTypeDesc() should probably also error out for an
unknown value.

I would be open to not using the helper functions. I thought it would be
less error-prone, but since it is limited to the code in
pgstat_io_ops.c, it is probably okay. Let me think a bit more.

Could you explain more about what you mean about using B_INVALID
BackendType?
 

> > > +     LWLockAcquire(&stats_shmem->lock, LW_SHARED);
> > > +     memcpy(&reset, reset_offset, sizeof(stats_shmem->stats));
> > > +     LWLockRelease(&stats_shmem->lock);
> >
> > Which then also means that you don't need the reset offset stuff. It's
> > only there because with the changecount approach we can't take a lock to
> > reset the stats (since there is no lock). With a lock you can just reset
> > the shared state.
> >
>
> Yes, I believe I have cleaned up all of this embarrassing mess. I use the
> lock in PgStatShared_BackendIOPathOps for reset all and snapshot and the
> locks in PgStat_IOPathOps for flush.

Looks fine, but I think pgstat_flush_io_ops() need more comments like
other pgstat_flush_* functions.

+       for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+               stats_shmem->stats[i].stat_reset_timestamp = ts;

I'm not sure we need a separate reset timestamp for each backend type
but SLRU counter does the same thing..

Yes, I think for SLRU stats it is because you can reset individual SLRU
stats. Also there is no wrapper data structure to put it in. I could
keep it in PgStatShared_BackendIOPathOps since you have to reset all IO
operation stats at once, but I am thinking of getting rid of
PgStatShared_BackendIOPathOps since it is not needed if I only keep the
locks in PgStat_IOPathOps and make the global shared value an array of
PgStat_IOPathOps.
 

> > > +pgstat_report_io_ops(void)
> > > +{
> >
> > This shouldn't be needed - the flush function above can be used.
> >
>
> Fixed.

The commit message of 0002 contains that name:p

Thanks! Fixed.
 

> > > +const char *
> > > +pgstat_io_path_desc(IOPath io_path)
> > > +{
> > > +     const char *io_path_desc = "Unknown IO Path";
> > > +
> >
> > This should be unreachable, right?
> >
>
> Changed it to an error.

+       elog(ERROR, "Attempt to describe an unknown IOPath");

I think we usually spell it as ("unrecognized IOPath value: %d", io_path).

I have changed to this.
 

> > > From f2b5b75f5063702cbc3c64efdc1e7ef3cf1acdb4 Mon Sep 17 00:00:00 2001
> > > From: Melanie Plageman <melanieplageman@gmail.com>
> > > Date: Mon, 4 Jul 2022 15:44:17 -0400
> > > Subject: [PATCH v22 3/3] Add system view tracking IO ops per backend type
> >
> > > Add pg_stat_buffers, a system view which tracks the number of IO
> > > operations (allocs, writes, fsyncs, and extends) done through each IO
> > > path (e.g. shared buffers, local buffers, unbuffered IO) by each type of
> > > backend.
> >
> > I think I like pg_stat_io a bit better? Nearly everything in here seems
> > to fit better in that.
> >
> > I guess we could split out buffers allocated, but that's actually
> > interesting in the context of the kind of IO too.
> >
>
> changed it to pg_stat_io

A bit different thing, but I felt a little uneasy about some uses of
"pgstat_io_ops". IOOp looks like a neighbouring word of IOPath. On the
other hand, actually iopath is used as an attribute of io_ops in many
places.  Couldn't we be more consistent about the relationship between
the names?

IOOp   -> PgStat_IOOpType
IOPath -> PgStat_IOPath
PgStat_IOOpCOonters -> PgStat_IOCounters
PgStat_IOPathOps    -> PgStat_IO
pgstat_count_io_op  -> pgstat_count_io
...

(Better wordings are welcome.)

Let me think about naming and make changes in the next version.
 
> > Would be nice to have something testing that the ringbuffer stats stuff
> > does something sensible - that feels not entirely trivial.
> >
> >
> I've added a test to test that reused strategy buffers are counted as
> allocs. I would like to add a test which checks that if a buffer in the
> ring is pinned and thus not reused, that it is not counted as a strategy
> alloc, but I found it challenging without a way to pause vacuuming, pin
> a buffer, then resume vacuuming.

===

If I'm not missing something, in BufferAlloc, when strategy is not
used and the victim is dirty, iopath is determined based on the
uninitialized from_ring.  It seems to me from_ring is equivalent to
strategy_current_was_in_ring.  And if StrategyGetBuffer has set
from_ring to false, StratetgyRejectBuffer may set it to true, which is
is wrong. The logic around there seems to need a rethink.

What can we read from the values separated to Shared and Strategy?

I have changed this local variable to only be used for communicating if
the buffer which was not rejected by StrategyRejectBuffer() was from the
ring or not for the purposes of counting strategy writes. I could add an
accessor for this member (strategy->current_was_in_ring) if that makes
more sense? For strategy allocs, I just use
strategy->current_was_in_ring inside of StrategyGetBuffer() since this
has access to that member of the struct.

Currently, strategy allocs count only reuses of a strategy buffer (not
initial shared buffers which are added to the ring).
strategy writes count only the writing out of dirty buffers which are
already in the ring and are being reused.

Alternatively, we could also count as strategy allocs all those buffers
which are added to the ring and count as strategy writes  all those
shared buffers which are dirty when initially added to the ring.

- Melanie
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: making relfilenodes 56 bits
Следующее
От: Andres Freund
Дата:
Сообщение: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)