Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?

Поиск
Список
Период
Сортировка
От Julien Rouhaud
Тема Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?
Дата
Msg-id CAOBaU_b2Jhb683OFJMMPM-hwYiEKbhgS5O-YkyUFDr6EpTAcWg@mail.gmail.com
обсуждение исходный текст
Ответ на sync request forward function ForwardSyncRequest() might hang for some time in a corner case?  (Paul Guo <paulguo@gmail.com>)
Ответы Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?  (Paul Guo <paulguo@gmail.com>)
Список pgsql-hackers
On Tue, May 25, 2021 at 4:39 PM Paul Guo <paulguo@gmail.com> wrote:
>
> Hi hackers,
>
> I found this when reading the related code. Here is the scenario:
>
> bool
> RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
>                     bool retryOnError)
>
> For the case retryOnError is true, the function would in loop call
> ForwardSyncRequest() until it succeeds, but in ForwardSyncRequest(),
> we can see if we run into the below branch, RegisterSyncRequest() will
> need to loop until the checkpointer absorbs the existing requests so
> ForwardSyncRequest() might hang for some time until a checkpoint
> request is triggered. This scenario seems to be possible in theory
> though the chance is not high.

It seems like a really unlikely scenario, but maybe possible if you
use a lot of unlogged tables maybe (as you could eventually
dirty/evict more than NBuffers buffers without triggering enough WALs
activity to trigger a checkpoint with any sane checkpoint
configuration).

> ForwardSyncRequest():
>
>     if (CheckpointerShmem->checkpointer_pid == 0 ||
>         (CheckpointerShmem->num_requests >= CheckpointerShmem->max_requests &&
>          !CompactCheckpointerRequestQueue()))
>     {
>         /*
>          * Count the subset of writes where backends have to do their own
>          * fsync
>          */
>         if (!AmBackgroundWriterProcess())
>             CheckpointerShmem->num_backend_fsync++;
>         LWLockRelease(CheckpointerCommLock);
>         return false;
>     }
>
> One fix is to add below similar code in RegisterSyncRequest(), trigger
> a checkpoint for the scenario.
>
> // checkpointer_triggered: variable for one trigger only.
> if (!ret && retryOnError && ProcGlobal->checkpointerLatch &&
> !checkpointer_triggered)
>         SetLatch(ProcGlobal->checkpointerLatch);
>
> Any comments?

It looks like you intended to set the checkpointer_triggered var but
didn't.  Also this will wake up the checkpointer but won't force a
checkpoint (unlike RequestCheckpoint()).  It may be a good thing
though as it would only absorb the requests and go back to sleep if no
other threshold is reachrf.  Apart from the implementation details it
seems like it could help in this unlikely event.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "houzj.fnst@fujitsu.com"
Дата:
Сообщение: RE: Parallel Inserts in CREATE TABLE AS
Следующее
От: Marko Tiikkaja
Дата:
Сообщение: security_definer_search_path GUC