Re: crash-safe visibility map, take three

From: Robert Haas
Subject: Re: crash-safe visibility map, take three
Msg-id: AANLkTinZDakWkLQ9D=b7hrBB1A3bE2pPj1NoyT=KusyJ@mail.gmail.com
In reply to: Re: crash-safe visibility map, take three  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses: Re: crash-safe visibility map, take three  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List: pgsql-hackers
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Some care is needed with checkpoints. Setting visibility map bits in step 2
> is safe because crash recovery will replay the intent XLOG record and clear
> any incorrectly set bits. But if a checkpoint has happened after the intent
> XLOG record was written, that's not true. This can be avoided by checking
> RedoRecPtr in step 2, and writing a new intent XLOG record if it has changed
> since the last intent XLOG record was written.

It seems like you'll need to hold some kind of lock between the time
you examine RedoRecPtr and the time you actually set the bit.
Otherwise a checkpoint could slip in between the check and the update.
WALInsertLock in shared mode, maybe?
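
To make the window concrete, here is a toy Python model (not backend
code) of checking the redo pointer and touching the bit under one lock.
Every name in it (ToyWal, ToyVacuum, set_vm_bit, the "vm_intent" record
kind) is made up for illustration, and a plain mutex stands in for the
shared-mode lock suggested above, since Python's standard library has no
shared/exclusive lock.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy stand-in for WAL state; not PostgreSQL's real data structures."""
    insert_lock: threading.Lock = field(default_factory=threading.Lock)
    redo_rec_ptr: int = 0   # where replay would start after a crash
    next_lsn: int = 1
    records: list = field(default_factory=list)

    def append(self, kind):
        """Append a record; the caller must already hold insert_lock."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.records.append((lsn, kind))
        return lsn

    def checkpoint(self):
        """Move the redo pointer past everything logged so far."""
        with self.insert_lock:
            self.redo_rec_ptr = self.next_lsn


class ToyVacuum:
    def __init__(self, wal):
        self.wal = wal
        self.redo_at_intent = None  # redo pointer seen at our last intent record
        self.vm_bits = set()

    def set_vm_bit(self, block):
        with self.wal.insert_lock:
            # While the lock is held no checkpoint can move redo_rec_ptr,
            # so the check and the bit update cannot be split by one.
            if self.redo_at_intent != self.wal.redo_rec_ptr:
                self.wal.append("vm_intent")
                self.redo_at_intent = self.wal.redo_rec_ptr
            self.vm_bits.add(block)
```

In this model, setting two bits, running checkpoint(), and setting a
third bit leaves two intent records in the log, the second one at or
after the new redo pointer.  If the lock were dropped between the
comparison and the add(), a checkpoint could land in the gap and the bit
would no longer be covered by any record that replay would see.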

> There's a small race condition in the way a visibility map bit is currently
> cleared. When a heap page is updated, it is locked, the update is
> WAL-logged, and the lock is released. The visibility map page is updated
> only after that. If the final vacuum XLOG record is written just after
> updating the heap page, but before the visibility map bit is cleared,
> replaying the final XLOG record will set a bit that should not have been
> set.

Well, if that final XLOG record isn't necessary for correctness
anyway, the obvious thing to do seems to be - don't write it.  Crashes
are not so common that losing even a full hour's worth of visibility
map bits when one happens seems worth killing ourselves over.  And
not everybody sets checkpoint_timeout to an hour, not all checkpoints
are triggered by checkpoint_timeout, and not all crashes happen just
before it expires.  Seems like we might be better off writing that
much less WAL.
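
A toy replay loop in Python shows the difference.  This is only a
sketch: the record kinds ("vm_intent", "heap_update", "vm_final") and
the function name are invented for illustration, and it assumes that
replaying a heap update also clears that page's bit.

```python
def wrongly_set_bits(records, vm_bits_on_disk):
    """Replay toy WAL records; return bits left set on pages that were modified."""
    vm_bits = set(vm_bits_on_disk)
    modified = set()
    for kind, blocks in records:
        if kind == "vm_intent":
            # Intent record: clear any bit vacuum might have set.
            vm_bits -= set(blocks)
        elif kind == "heap_update":
            # Assumption of this sketch: heap replay clears the bit too.
            modified |= set(blocks)
            vm_bits -= set(blocks)
        elif kind == "vm_final":
            # Final record: set again the bits vacuum believed were safe.
            vm_bits |= set(blocks)
    return vm_bits & modified


# The final record was written after the heap update was logged but
# before the in-memory bit for block 7 was cleared.
racy_log = [("vm_intent", [7]), ("heap_update", [7]), ("vm_final", [7])]
log_without_final = racy_log[:-1]
```

In this model, replaying racy_log leaves block 7 marked all-visible even
though it was updated, while replaying log_without_final leaves the bit
clear: we lose the bit, but nothing is wrong.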

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

