Re: pg_stat_io for the startup process

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: pg_stat_io for the startup process
Дата
Msg-id CAAKRu_YLBELwc7T6F_BFV_oKOzZNx2X+wbRiuR7U4FT-0-caCg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_stat_io for the startup process  (Melih Mutlu <m.melihmutlu@gmail.com>)
Ответы Re: pg_stat_io for the startup process  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-hackers
On Wed, Apr 26, 2023 at 2:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>
> Hi all,
>
> Robert Haas <robertmhaas@gmail.com>, 26 Nis 2023 Çar, 15:34 tarihinde şunu yazdı:
>>
>> On Wed, Apr 26, 2023 at 5:47 AM Kyotaro Horiguchi
>> <horikyota.ntt@gmail.com> wrote:
>> > 3. When should we call pgstat_report_stats on the startup process?
>> >
>> > During recovery, I think we can call pgstat_report_stats() (or a
>> > subset of it) right before invoking WaitLatch and at segment
>> > boundaries.
>>
>> I think this kind of idea is worth exploring. Andres mentioned timers,
>> but it seems to me that something where we just do it at certain
>> "convenient" points might be good enough and simpler. I'd much rather
>> have statistics that were up to date as of the last time we finished a
>> segment than have nothing at all.
>
>
> I created a rough prototype of a timer-based approach for comparison. Please see attached.
>
> Registered a new timeout named "STARTUP_STAT_FLUSH_TIMEOUT", The timeout's
> handler sets a flag indicating that io stats need to be flushed.
> HandleStartupProcInterrupts checks the flag to flush stats. It's enabled if
> any WAL record is read (in the main loop of PerformWalRecovery). And it's
> disabled before WaitLatch to avoid unnecessary wakeups if the startup process
> decides to sleep.

I was reviewing this and found that if I remove the calls to
disable_startup_stat_flush_timeout() the number of calls to
pgstat_report_wal() on a briefly active and then idle standby are about
8 in 100 seconds whereas with timer disablement, the calls over the same
period are about 40. I would have thought that disabling the timer would
have caused us to try and flush the stats less often.

With the calls to disable_startup_stat_flush_timeout(), we do have much
fewer calls to setitimer(), of course.

Given the below suggestion by Andres, I tried doing some traces of the
various approaches.

> > +       disable_startup_stat_flush_timeout();
> > +
> >         (void) WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
> >                         WL_LATCH_SET | WL_TIMEOUT |
> >                         WL_EXIT_ON_PM_DEATH,
>
> I think always disabling the timer here isn't quite right - we want to wake up
> *once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
> stats before waiting - potentially for a long time - for WAL. One way would be
> just explicitly report before the WaitLatch().
>
> Another approach, I think better, would be to not use enable_timeout_every(),
> and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
> called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do so
> at the end of WaitForWALToBecomeAvailable().  That way we also wouldn't
> repeatedly fire between calls to HandleStartupProcInterrupts().

After a quick example implementation of this, I found that it seemed to
try and flush the stats less often on an idle standby (good) than using
enable_timeout_every().

- Melanie



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Should vacuum process config file reload more often
Следующее
От: Nathan Bossart
Дата:
Сообщение: Re: Possible regression setting GUCs on \connect