Re: [HACKERS] Unlogged tables cleanup

From Robert Haas
Subject Re: [HACKERS] Unlogged tables cleanup
Date
Msg-id CA+TgmoarmtbAPFj=tCT4Tm4LqQnwHxJRFCSG2Bm=m6cE-nc=fQ@mail.gmail.com
In response to Re: [HACKERS] Unlogged tables cleanup  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Unlogged tables cleanup  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: [HACKERS] Unlogged tables cleanup  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, May 13, 2019 at 12:50 PM Andres Freund <andres@anarazel.de> wrote:
> > AFAICS ResetUnloggedRelations copies the init fork after replaying WAL,
> > so it would be sufficient to have the init fork be recovered from WAL
> > for that to work.  However, we also do ResetUnloggedRelations *before*
> > replaying WAL in order to remove leftover not-init-fork files, and that
> > process requires that the init fork is present at that time.
>
> What scenario are you precisely wondering about? That
> ResetUnloggedRelations() could overwrite the main fork, while not yet
> having valid contents (due to the lack of smgrimmedsync())? Shouldn't
> that only be possible while still in an inconsistent state? A checkpoint
> would have serialized the correct contents, and we'd not reach HS
> consistency before having replayed that WAL records resetting the table
> and the init fork consistency?

I think I see what Alvaro is talking about, or at least I think I see
*a* possible problem based on his remarks.

Suppose we create an unlogged table and then crash. The main fork
makes it to disk, and the init fork does not.  Before WAL replay, we
remove any main forks that have init forks, but because the init fork
was lost, that does not happen.  Recovery recreates the init fork.
After WAL replay, we try to copy_file() each _init fork to the
corresponding main fork. That fails, because copy_file() expects to be
able to create the target file, and here it can't do that because it
already exists.
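
To make that sequence concrete, here is a toy model of it in Python. It is
only a sketch, not PostgreSQL code: the relation file name 16384, the file
contents, and the helper names copy_file_strict() and
simulate_crash_recovery() are all made up for illustration. The one
assumption it borrows from the scenario is that the copy step refuses to
overwrite an existing target, which is modeled here with O_CREAT | O_EXCL.

```python
import os
import tempfile


def copy_file_strict(src, dst):
    """Copy src to dst, failing if dst already exists (O_EXCL)."""
    fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
        out.write(inp.read())


def simulate_crash_recovery(init_fork_survived):
    """Replay the two reset passes around a simulated WAL replay.

    Returns "ok" if the final init-to-main copy succeeds, otherwise a
    message saying that the main fork was in the way.
    """
    with tempfile.TemporaryDirectory() as datadir:
        # Hypothetical relation file names, for illustration only.
        main_fork = os.path.join(datadir, "16384")
        init_fork = os.path.join(datadir, "16384_init")

        # On-disk state after the crash: the main fork made it to disk,
        # the init fork may or may not have.
        with open(main_fork, "wb") as f:
            f.write(b"stale main fork")
        if init_fork_survived:
            with open(init_fork, "wb") as f:
                f.write(b"init")

        # Pass 1, before WAL replay: remove main forks that have init forks.
        if os.path.exists(init_fork):
            os.remove(main_fork)

        # WAL replay recreates the init fork.
        with open(init_fork, "wb") as f:
            f.write(b"init")

        # Pass 2, after WAL replay: copy each init fork to its main fork.
        try:
            copy_file_strict(init_fork, main_fork)
        except FileExistsError:
            return "copy failed: main fork already exists"
        return "ok"
```

Calling simulate_crash_recovery(True) takes the normal path, while
simulate_crash_recovery(False) reproduces the case where the init fork was
lost, pass 1 skipped the stale main fork, and the strict copy then trips
over it.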

If that's the scenario, I'm not sure the smgrimmedsync() call is
sufficient.  Suppose we log_smgrcreate() but then crash before
smgrimmedsync()... seems like we'd need to do them in the other order,
or else maybe just pass a flag to copy_file() telling it not to be so
picky.
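
As a rough illustration of the "not so picky" option, the Python sketch
below adds a flag that lets the copy replace an existing target instead of
failing. This is a toy model, not the PostgreSQL copy_file() signature: the
parameter name allow_existing and the file names in the usage note are
hypothetical.

```python
import os


def copy_file(src, dst, allow_existing=False):
    """Copy src to dst.

    By default the target must not exist (O_EXCL). With allow_existing=True
    an existing target is truncated and overwritten instead.
    """
    flags = os.O_WRONLY | os.O_CREAT
    flags |= os.O_TRUNC if allow_existing else os.O_EXCL
    fd = os.open(dst, flags, 0o600)
    with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
        out.write(inp.read())
```

With a leftover main fork on disk, copy_file("16384_init", "16384") raises
FileExistsError, while copy_file("16384_init", "16384", allow_existing=True)
replaces the stale contents. Whether silently overwriting is acceptable in
the real recovery path is exactly the open question here; the sketch only
shows what the flag would change mechanically.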

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


