Re: O(n) tasks cause lengthy startups and checkpoints

From	Euler Taveira
Subject	Re: O(n) tasks cause lengthy startups and checkpoints
Date	December 2, 2021, 05:05:03
Msg-id	d4c7e393-b8be-4879-ad4b-4e80994e5763@www.fastmail.com
In reply to	Re: O(n) tasks cause lengthy startups and checkpoints ("Bossart, Nathan" <bossartn@amazon.com>)
Responses	Re: O(n) tasks cause lengthy startups and checkpoints ("Bossart, Nathan" <bossartn@amazon.com>)
List	pgsql-hackers

On Wed, Dec 1, 2021, at 9:19 PM, Bossart, Nathan wrote:

> On 12/1/21, 2:56 PM, "Andres Freund" <andres@anarazel.de> wrote:
> > On 2021-12-01 20:24:25 +0000, Bossart, Nathan wrote:
> >> I realize adding a new maintenance worker might be a bit heavy-handed,
> >> but I think it would be nice to have somewhere to offload tasks that
> >> really shouldn't impact startup and checkpointing. I imagine such a
> >> process would come in handy down the road, too. WDYT?
> >
> > -1. I think the overhead of an additional worker is disproportional here. And
> > there's simplicity benefits in having a predictable cleanup interlock as well.
>
> Another idea I had was to put some upper limit on how much time is
> spent on such tasks. For example, a checkpoint would only spend X
> minutes on CheckPointSnapBuild() before giving up until the next one.
> I think the main downside of that approach is that it could lead to
> unbounded growth, so perhaps we would limit (or even skip) such tasks
> only for end-of-recovery and shutdown checkpoints. Perhaps the
> startup tasks could be limited in a similar fashion.

Saying that a certain task is O(n) doesn't mean it needs a separate process to
handle it. Do you have a use case or, even better, numbers (% of checkpoint /
startup time) that make your proposal worthwhile?

I would try to optimize (1) and (2). However, delayed removal can be a
long-term issue if the new routine cannot keep up with the pace of file
creation (especially if the checkpoints are far apart).
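
To make the time-limit idea and its backlog risk concrete, here is a minimal
sketch of a budgeted cleanup pass. It is not PostgreSQL code: the function
name `remove_with_budget` and the injectable `clock` parameter are
hypothetical, chosen only for illustration. The pass stops as soon as the
budget is spent and reports whether it finished, so whatever is left over has
to wait for the next pass.

```python
import os
import time
from typing import Callable, Iterable, Tuple


def remove_with_budget(
    paths: Iterable[str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, bool]:
    """Remove files until all are gone or the time budget is spent.

    Returns (removed, finished). finished is False when the budget ran
    out first; the remaining files are left for a later pass.
    """
    deadline = clock() + budget_seconds
    removed = 0
    for path in paths:
        # Check the budget before each removal, not after the whole scan.
        if clock() >= deadline:
            return removed, False
        try:
            os.unlink(path)
            removed += 1
        except FileNotFoundError:
            pass  # already removed by an earlier pass; nothing to do
    return removed, True
```

If files are created faster than successive passes can remove them, every
pass returns `finished == False` and the leftover set keeps growing, which is
the long-term issue described above.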

For (3), there is already a GUC that would avoid the slowdown during startup.
Use it if you think the startup time is more important than the disk space
occupied by useless files.

For (4), you are forgetting that the on-disk state of replication slots is
stored in pg_replslot/SLOTNAME/state. It seems you cannot just rename the
replication slot directory and copy the state file. What happens if there is a
crash before copying the state file?

While we are talking about items (1), (2) and (4), we could probably have an
option to create some ephemeral logical decoding files on a ramdisk (similar to
the statistics directory). I don't want to hijack this thread, but this
proposal could alleviate the possible issues that you pointed out. If people
are interested in this proposal, I can start a new thread about it.

Euler Taveira
EDB https://www.enterprisedb.com/
