Re: O(n) tasks cause lengthy startups and checkpoints

Поиск
Список
Период
Сортировка
От Nathan Bossart
Тема Re: O(n) tasks cause lengthy startups and checkpoints
Дата
Msg-id 20220217225838.GA3292179@nathanxps13
обсуждение исходный текст
Ответ на Re: O(n) tasks cause lengthy startups and checkpoints  (Andres Freund <andres@anarazel.de>)
Ответы Re: O(n) tasks cause lengthy startups and checkpoints  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Thu, Feb 17, 2022 at 02:28:29PM -0800, Andres Freund wrote:
> As far as I understand, the primary concern are logical decoding serialized
> snapshots, because a lot of them can accumulate if there e.g. is an old unused
> / far behind slot. It should be easy to reduce the number of those snapshots
> by e.g. eliding some redundant ones. Perhaps we could also make backends in
> logical decoding occasionally do a bit of cleanup themselves.
> 
> I've not seen reports of the number of mapping files to be an real issue?

I routinely see all four of these tasks impacting customers, but I'd say
the most common one is the temporary file cleanup.  Besides eliminating
some redundant files and having backends perform some cleanup, what do you
think about skipping the logical decoding cleanup during
end-of-recovery/shutdown checkpoints?  This was something that Bharath
brought up a while back [0].  As I noted in that thread, startup and
shutdown could still take a while if checkpoints are regularly delayed due
to logical decoding cleanup, but that might still help avoid a bit of
downtime.

> The improvements around deleting temporary files and serialized snapshots
> afaict don't require a dedicated process - they're only relevant during
> startup. We could use the approach of renaming the directory out of the way as
> done in this patchset but perform the cleanup in the startup process after
> we're up.

Perhaps this is a good place to start.  As I mentioned above, IME the
temporary file cleanup is the most common problem, so I think even getting
that one fixed would be a huge improvement.

[0] https://postgr.es/m/CALj2ACXkkSL8EBpR7m%3DMt%3DyRGBhevcCs3x4fsp3Bc-D13yyHOg%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: O(n) tasks cause lengthy startups and checkpoints
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints