Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
From | Craig Ringer
Subject | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date |
Msg-id | CAMsr+YEjnSyiMJ63yPDJvw8j95qrwD=pZ5u8e=QKiPX5CEm38w@mail.gmail.com
In reply to | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Thomas Munro <thomas.munro@enterprisedb.com>)
List | pgsql-hackers
On 6 April 2018 at 07:37, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> Note: as I've brought up in another thread, it turns out that PG is not
> handling fsync errors correctly even when the OS _does_ do the right
> thing (discovered by testing on FreeBSD).
Yikes. For other readers, the related thread for this is https://www.postgresql.org/message-id/87y3i1ia4w.fsf@news-spur.riddles.org.uk
Meanwhile, I've extended my test to run postgres on a deliberately faulty volume and confirmed my results there.
_____________
2018-04-06 01:11:40.555 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:40.567 UTC [58] ERROR: could not fsync file "base/12992/16386": Input/output error
2018-04-06 01:11:40.655 UTC [66] ERROR: checkpoint request failed
2018-04-06 01:11:40.655 UTC [66] HINT: Consult recent messages in the server log for details.
2018-04-06 01:11:40.655 UTC [66] STATEMENT: CHECKPOINT
Checkpoint failed with checkpoint request failed
HINT: Consult recent messages in the server log for details.
Retrying
2018-04-06 01:11:41.568 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:41.614 UTC [58] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.046 s; sync files=3, longest=0.000 s, average=0.000 s; distance=2727 kB, estimate=2779 kB
Ooops, it worked! We ignored the error and checkpointed OK.
_____________
Given your report, I now have to wonder whether we even reissued the fsync() at all this time. 'perf' time. OK, with
sudo perf record -e syscalls:sys_enter_fsync,syscalls:sys_exit_fsync -a
sudo perf script
I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.
postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
...
postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0
... and Pg continues merrily on its way without realising it lost data:
[72379.834872] XFS (dm-0): writeback error on sector 118752
[72380.324707] XFS (dm-0): writeback error on sector 118688
In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.