Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?

Поиск
Список
Период
Сортировка
От Paul Guo
Тема Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?
Дата
Msg-id CABQrizf0Ji-VnLGokbfGSxLM2R+j0d4kShAh+tVw1r-1M-MY-g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?  (Julien Rouhaud <rjuju123@gmail.com>)
Ответы Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?  (Paul Guo <paulguo@gmail.com>)
Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?  (Julien Rouhaud <rjuju123@gmail.com>)
Список pgsql-hackers
On Thu, May 27, 2021 at 7:11 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 4:39 PM Paul Guo <paulguo@gmail.com> wrote:
> >
> > Hi hackers,
> >
> > I found this when reading the related code. Here is the scenario:
> >
> > bool
> > RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
> >                     bool retryOnError)
> >
> > For the case retryOnError is true, the function would in loop call
> > ForwardSyncRequest() until it succeeds, but in ForwardSyncRequest(),
> > we can see if we run into the below branch, RegisterSyncRequest() will
> > need to loop until the checkpointer absorbs the existing requests so
> > ForwardSyncRequest() might hang for some time until a checkpoint
> > request is triggered. This scenario seems to be possible in theory
> > though the chance is not high.
>
> It seems like a really unlikely scenario, but maybe possible if you
> use a lot of unlogged tables maybe (as you could eventually
> dirty/evict more than NBuffers buffers without triggering enough WALs
> activity to trigger a checkpoint with any sane checkpoint
> configuration).

RegisterSyncRequest() handles SYNC_UNLINK_REQUEST and
SYNC_FORGET_REQUEST scenarios, besides the usual SYNC_REQUEST type for
buffer sync.

> > ForwardSyncRequest():
> >
> >     if (CheckpointerShmem->checkpointer_pid == 0 ||
> >         (CheckpointerShmem->num_requests >= CheckpointerShmem->max_requests &&
> >          !CompactCheckpointerRequestQueue()))
> >     {
> >         /*
> >          * Count the subset of writes where backends have to do their own
> >          * fsync
> >          */
> >         if (!AmBackgroundWriterProcess())
> >             CheckpointerShmem->num_backend_fsync++;
> >         LWLockRelease(CheckpointerCommLock);
> >         return false;
> >     }
> >
> > One fix is to add below similar code in RegisterSyncRequest(), trigger
> > a checkpoint for the scenario.
> >
> > // checkpointer_triggered: variable for one trigger only.
> > if (!ret && retryOnError && ProcGlobal->checkpointerLatch &&
> > !checkpointer_triggered)
> >         SetLatch(ProcGlobal->checkpointerLatch);
> >
> > Any comments?
>
> It looks like you intended to set the checkpointer_triggered var but

Yes this is just pseduo code.

> didn't.  Also this will wake up the checkpointer but won't force a
> checkpoint (unlike RequestCheckpoint()).  It may be a good thing

I do not expect an immediate checkpoint. AbsorbSyncRequests()
is enough since after that RegisterSyncRequest() could finish.

> though as it would only absorb the requests and go back to sleep if no
> other threshold is reachrf.  Apart from the implementation details it
> seems like it could help in this unlikely event.



-- 
Paul Guo (Vmware)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Paul Guo
Дата:
Сообщение: Re: pg_rewind fails if there is a read only file.
Следующее
От: Paul Guo
Дата:
Сообщение: Re: sync request forward function ForwardSyncRequest() might hang for some time in a corner case?