RE: Proposed WAL changes

From: Mikheev, Vadim
Subject: RE: Proposed WAL changes
Msg-id: 8F4C99C66D04D4118F580090272A7A234D32FC@sectorbase1.sectorbase.com
In reply to: Proposed WAL changes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Proposed WAL changes  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

> >> * Store two past checkpoint locations, not just one, in pg_control.
> >> On startup, we fall back to the older checkpoint if the newer one
> >> is unreadable.  Also, a physical copy of the newest checkpoint record
> 
> > And what to do if older one is unreadable too?
> > (Isn't it like using 2 x CRC32 instead of CRC64 ? -:))
> 
> Then you lose --- but two checkpoints gives you twice the chance of
> recovery (probably more, actually, since it's much more likely that
> the previous checkpoint will have reached disk safely).

This is not correct. If the log is corrupted somehow (a checkpoint wasn't
flushed as promised), then you have no chance to *recover*, because the
DB will (most likely) be in an inconsistent state (data pages flushed
before the corresponding log records, etc.). So a second checkpoint gives
us twice the chance to *restart* in the normal way (read the checkpoint
and roll forward from the redo record), not to *recover*. But this
approach doubles the on-line log size requirements and doesn't help to
handle cases where pg_control was corrupted. Note that I agree that
disaster *restart* must be implemented; I just think that the
"two checkpoints" approach is not the best way to go.
From my POV, scanning the logs is much better: it doesn't require
doubling the size of the on-line logs, and it allows a *restart* if
pg_control was lost or corrupted:

If there is no pg_control, or it is corrupted, or it points to a
nonexistent/corrupted checkpoint record, then scan the logs from the
newest to the oldest until we find the last valid checkpoint record
or the oldest valid log record, and then redo from there.
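
The scan described above could be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: `LogRecord`, `find_restart_point`, the integer `lsn` field, and the `valid` flag (standing in for a CRC check) are all hypothetical names made up for the example. It also simplifies in two ways: it returns the position of the checkpoint record itself rather than following a redo pointer, and it skips invalid records without checking that the remaining log is contiguous.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LogRecord:
    lsn: int            # hypothetical log position, increasing over time
    kind: str           # "checkpoint" or any other record type
    valid: bool = True  # stands in for a per-record CRC check


def find_restart_point(control_lsn: Optional[int],
                       segments: Sequence[Sequence[LogRecord]]) -> Optional[int]:
    """Return the LSN to redo from, or None if no valid record exists.

    control_lsn is the checkpoint location read from pg_control, or None
    when pg_control is missing or corrupted. segments is ordered oldest
    to newest, and each segment holds its records oldest first.
    """
    # Normal restart: pg_control points to a valid checkpoint record.
    if control_lsn is not None:
        for seg in segments:
            for rec in seg:
                if rec.lsn == control_lsn and rec.kind == "checkpoint" and rec.valid:
                    return rec.lsn

    # Disaster restart: scan the logs from newest to oldest.
    oldest_valid = None
    for seg in reversed(segments):
        for rec in reversed(seg):
            if not rec.valid:
                continue                # assumption: skip unreadable records
            if rec.kind == "checkpoint":
                return rec.lsn          # last valid checkpoint record
            oldest_valid = rec.lsn      # remember it and keep walking back
    # No checkpoint found: fall back to the oldest valid log record.
    return oldest_valid
```

Note that the fallback path never consults pg_control, so the same routine covers both a lost pg_control and a pg_control that points to a bad checkpoint record.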

> See later discussion --- Andreas convinced me that flushing NEXTXID
> records to disk isn't really needed after all.  (I didn't 
> take the flush out of my patch yet, but will do so.) I still want
> to leave the NEXTXID records in there, though, because I think that
> XID and OID assignment ought to work as nearly alike as possible.

As I already explained briefly: with UNDO we'll be able to reuse
XIDs after restart, i.e. there will be no point in having NEXTXID
records at all. And there is no point in adding them now.
Does it fix anything? Isn't "fixing" all that we must do in beta?

Vadim

