Re: using an end-of-recovery record in all cases

Поиск
Список
Период
Сортировка
От Nathan Bossart
Тема Re: using an end-of-recovery record in all cases
Дата
Msg-id 20220420170224.GA2579385@nathanxps13
обсуждение исходный текст
Ответ на Re: using an end-of-recovery record in all cases  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: using an end-of-recovery record in all cases
Список pgsql-hackers
On Wed, Apr 20, 2022 at 09:26:07AM -0400, Robert Haas wrote:
> I was talking with Thomas Munro yesterday and he thinks there is a
> problem with relfilenode reuse here. In normal running, when a
> relation is dropped, we leave behind a 0-length file until the next
> checkpoint; this keeps that relfilenode from being used even if the
> OID counter wraps around. If we didn't do that, then imagine that
> while running with wal_level=minimal, we drop an existing relation,
> create a new relation with the same OID, load some data into it, and
> crash, all within the same checkpoint cycle, then we will be able to
> replay the drop, but we will not be able to restore the relation
> contents afterward because at wal_level=minimal they are not logged.
> Apparently, we don't create tombstone files during recovery because we
> know that there will be a checkpoint at the end.

In the example you provided, won't the tombstone file already be present
before the crash?  During recovery, the tombstone file will be removed, and
the new relation wouldn't use the same relfilenode anyway.  I'm probably
missing something obvious here.

I do see the problem if we drop an existing relation, crash, reuse the
filenode, and then crash again (all within the same checkpoint cycle).  The
first recovery would remove the tombstone file, and the second recovery
would wipe out the new relation's files.

> With the existing use of the end-of-recovery record, we always know
> that wal_level>minimal, because we're only using it on standbys. But
> with this use that wouldn't be true any more. So I guess we need to
> start creating tombstone files even during recovery, or else do
> something like what Dilip coded up in
> http://postgr.es/m/CAFiTN-u=r8UTCSzu6_pnihYAtwR1=esq5sRegTEZ2tLa92fovA@mail.gmail.com
> which I think would be a better solution at least in the long term.

IMO this would be good just to reduce the branching a bit.  I suppose
removing the files immediately during recovery might be an optimization in
some cases, but I am skeptical that it really makes that much of a
difference in practice.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Fix NULL pointer reference in _outPathTarget()
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: using an end-of-recovery record in all cases