Re: Hot Backup with rsync fails at pg_clog if under load

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: Hot Backup with rsync fails at pg_clog if under load
Дата
Msg-id AAE7BBFB-641C-482C-A8D2-0D61E2DF3B09@phlo.org
обсуждение исходный текст
Ответ на Re: Hot Backup with rsync fails at pg_clog if under load  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Список pgsql-hackers
On Oct27, 2011, at 08:57 , Heikki Linnakangas wrote:
> On 27.10.2011 02:29, Florian Pflug wrote:
>> Per my theory about the cause of the problem in my other mail, I think you
>> might see StartupCLOG failures even during crash recovery, provided that
>> wal_level was set to hot_standby when the primary crashed. Here's how
>>
>> 1) We start a checkpoint, and get as far as LogStandbySnapshot()
>> 2) A backend does AssignTransactionId, and gets as far as GetTransactionoId().
>>   The assigned XID requires CLOG extension.
>> 3) The checkpoint continues, and LogStandbySnapshot () advances the
>>   checkpoint's nextXid to the XID assigned in (2).
>> 4) We crash after writing the checkpoint record, but before the CLOG
>>   extension makes it to the disk, and before any trace of the XID assigned
>>   in (2) makes it to the xlog.
>>
>> Then StartupCLOG() would fail at the end of recovery, because we'd end up
>> with a nextXid whose corresponding CLOG page doesn't exist.
>
> No, clog extension is WAL-logged while holding the XidGenLock. At step 3,
> LogStandbySnapshot() would block until the clog-extension record is written
> to WAL, so crash recovery would see and replay that record before calling
> StartupCLOG().

Hm, true. But it still seems wrong for LogStandbySnapshot() to modify the
checkpoint's nextXid, and even more wrong to do that only if wal_mode =
hot_standby. Plus, I think it's a smart idea to verify that the required
parts of the CLOG are available at the start of recovery. Because if they're
missing, the data on the standby *will* be corrupted. Is there any argument
against doiing LogStandbySnapshot() earlier (i.e., at the time oldestActiveXid
is computed)?

best regards,
Florian Pflug




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Updated version of pg_receivexlog
Следующее
От: Robert Haas
Дата:
Сообщение: Re: isolationtester and invalid permutations