WAL replay should fdatasync() segments?

From Andres Freund
Subject WAL replay should fdatasync() segments?
Msg-id 20140122162115.GL21170@alap3.anarazel.de
Replies Re: WAL replay should fdatasync() segments?  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
Hi,

Currently, XLogInsert(), XLogFlush() or XLogBackgroundFlush() will
write() data before fdatasync()ing it (duh, kinda obvious). But I
think, given the current recovery code, that leaves a window where we can
get into strange inconsistencies.

Consider what happens if postgres (not the OS!) crashes after writing
WAL data to the OS, but before fdatasync()ing it. Replay will happily
read that record from disk and replay it, which is fine. At the end of
recovery we will then start inserting new records, and those will be
properly fsync()ed to disk.

But if the *OS* crashes at that moment, we might get into the strange
situation where older records are lost because they were never
fsync()ed, while newer records and the control file persist.

I think for a primary that window is relatively small, but it's
a good bit bigger for a standby, especially if it's promoted.

I think the correct way to handle this would be to fsync() segments we
read from pg_xlog/ during recovery.
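
As a rough sketch of that idea, and only a sketch: the helper below,
sync_segment_before_replay(), is a hypothetical name and not an existing
PostgreSQL function, and the real recovery code goes through its own
file-access layer rather than raw open()/fsync(). It just shows the reader
side flushing a segment before trusting what it read from it:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: force a WAL segment file to
 * stable storage before recovery relies on records read from it.
 * Returns 0 on success, -1 on failure with errno describing the error.
 */
static int
sync_segment_before_replay(const char *path)
{
    int fd;

    assert(path != NULL);

    /* Open read-write so fsync() is accepted on platforms that need it. */
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Flush whatever an earlier, crashed writer left in the OS cache. */
    if (fsync(fd) != 0)
    {
        int save_errno = errno;

        close(fd);
        errno = save_errno;     /* report the fsync() failure, not close() */
        return -1;
    }

    return close(fd);
}
```

The O_RDWR open is a deliberate choice in this sketch, since some platforms
refuse to sync through a read-only descriptor. Whether to pay for this once
per segment or only at the end of recovery is left open here.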

Am I missing something?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


