Re: POC: Cleaning up orphaned files using undo logs

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: POC: Cleaning up orphaned files using undo logs
Дата
Msg-id CAA4eK1+UtutcnUY4LgfS_ndA81tEDr5F67WVFixLYejGObW0Og@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
Ответы Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
Список pgsql-hackers
On Fri, Sep 24, 2021 at 4:44 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Mon, Sep 20, 2021 at 10:24 AM Antonin Houska <ah@cybertec.at> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > On Fri, Sep 17, 2021 at 9:50 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > > > >
> > > > > > On Tue, Sep 14, 2021 at 10:51:42AM +0200, Antonin Houska wrote:
> > > > >
> > > > > * What happened with the idea of abandoning discard worker for the sake
> > > > >   of simplicity? From what I see limiting everything to foreground undo
> > > > >   could reduce the core of the patch series to the first four patches
> > > > >   (forgetting about test and docs, but I guess it would be enough at
> > > > >   least for the design review), which is already less overwhelming.
>
> > > What we can miss, at least for the cleanup of the orphaned files, is the *undo
> > > worker*. In this patch series the cleanup is handled by the startup process.
> > >
> >
> > Okay, I think various people at different point of times has suggested
> > that idea. I think one thing we might need to consider is what to do
> > in case of a FATAL error? In case of FATAL error, it won't be
> > advisable to execute undo immediately, so would we upgrade the error
> > to PANIC in such cases. I remember vaguely that for clean up of
> > orphaned files that can happen rarely and someone has suggested
> > upgrading the error to PANIC in such a case but I don't remember the
> > exact details.
>
> Do you mean FATAL error during normal operation?
>

Yes.

> As far as I understand, even
> zheap does not rely on immediate UNDO execution (otherwise it'd never
> introduce the undo worker), so FATAL only means that the undo needs to be
> applied later so it can be discarded.
>

Yeah, zheap either applies undo later via background worker or next
time before dml operation if there is a need.

> As for the orphaned files cleanup feature with no undo worker, we might need
> PANIC to ensure that the undo is applied during restart and that it can be
> discarded, otherwise the unapplied undo log would stay there until the next
> (regular) restart and it would block discarding. However upgrading FATAL to
> PANIC just because the current transaction created a table seems quite
> rude.
>

True, I guess but we can once see in what all scenarios it can
generate FATAL during that operation.

> So the undo worker might be needed even for this patch?
>

I think we can keep undo worker as a separate patch and for base patch
keep the idea of promoting FATAL to PANIC. This will at the very least
make the review easier.

> Or do you mean FATAL error when executing the UNDO?
>

No.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: row filtering for logical replication
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Skipping logical replication transactions on subscriber side