Re: WAL replay should fdatasync() segments?

From Andres Freund
Subject Re: WAL replay should fdatasync() segments?
Date
Msg-id 20140122170828.GB30218@alap3.anarazel.de
In response to Re: WAL replay should fdatasync() segments?  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: WAL replay should fdatasync() segments?  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On 2014-01-23 02:05:48 +0900, Fujii Masao wrote:
> On Thu, Jan 23, 2014 at 1:21 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Hi,
> >
> > Currently, XLogInsert(), XLogFlush() or XLogBackgroundFlush() will
> > write() data before fdatasync()ing them (duh, kinda obvious). But I
> > think given the current recovery code that leaves a window where we can
> > get into strange inconsistencies.
> > Consider what happens if postgres (not the OS!) crashes after writing
> > WAL data to the OS, but before fdatasync()ing it. Replay will happily
> > read that record from disk and replay it, which is fine. At the end of
> > recovery we then will start inserting new records, and those will be
> > properly fsynced to disk.
> > But if the *OS* crashes in that moment we might get into the strange
> > situation where older records might be lost since they weren't
> > fsync()ed, but newer records and the control file will persist.
> >
> > I think for a primary that window is relatively small, but I think it's
> > a good bit bigger for a standby, especially if it's promoted.
> 
> In normal streaming replication case, ISTM that window is not bigger for
> the standby because basically the standby replays only the WAL data
> which walreceiver fsync'd to the disk. But if it replays the WAL file which
> was fetched from the archive, that WAL file might not have been flushed
> to the disk yet. In this case, that window might become bigger...

Yea, but if the walreceiver receives data and crashes/disconnects before
fsync(), we'll read it from pg_xlog, right? And if we promote, we'll
start inserting new records before establishing a new checkpoint.
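
To make the idea in the subject line concrete, here is a minimal sketch in Python. It is not PostgreSQL's recovery code. The helper names `write_all` and `sync_segment` are hypothetical, and the sketch assumes a POSIX-like system. It shows the two separate steps being discussed: data handed to the OS with write() is readable immediately but is not yet durable, and a reader that wants to rely on it can force the file to disk first.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write all of data to fd; the bytes may still sit only in the OS cache."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # os.write may write fewer bytes than asked
        view = view[written:]


def sync_segment(path: str) -> None:
    """Flush a segment file's data to stable storage before relying on it."""
    # os.fdatasync() is not available on every platform; fall back to os.fsync().
    sync = getattr(os, "fdatasync", os.fsync)
    fd = os.open(path, os.O_RDWR)
    try:
        sync(fd)
    finally:
        os.close(fd)
```

Under these assumptions, a replay loop would call something like `sync_segment()` on each segment it read before writing any new records after it, so that an OS crash could not lose the older records while keeping the newer ones.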

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


