Checkpoint not retrying failed fsync?

From: Andrew Gierth
Subject: Checkpoint not retrying failed fsync?
Msg-id 87y3i1ia4w.fsf@news-spur.riddles.org.uk
Responses: Re: Checkpoint not retrying failed fsync?  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
This is only a preliminary report, I'm still trying to analyze what's
going on, but:

In doing testing on FreeBSD with a filesystem set up to induce errors
controllably (using gconcat+gnop), I can get this to happen (on 11devel):

(note that "mytable" is on a tablespace on the erroring filesystem,
while "x" is on a clean filesystem)

postgres=# insert into mytable values (-1);
INSERT 0 1
postgres=# checkpoint;
ERROR:  checkpoint request failed
HINT:  Consult recent messages in the server log for details.
postgres=# insert into x values (3);
INSERT 0 1
postgres=# checkpoint;
CHECKPOINT

(the message in the server log is the expected one about fsync failing)

Checking the WAL shows that there is indeed a checkpoint record for the
second checkpoint and pg_control points to it, so a crash restart at
this point would not try to replay the "mytable" write.

Furthermore, checking the trace output from the checkpointer process, it
is not even attempting an fsync of the failing file. This isn't like the
Linux fsync issue: I've confirmed that fsync will repeatedly fail on the
file until the underlying errors stop.
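
For anyone wanting to repeat that last check, here is a rough sketch of
one way to probe a file with repeated fsync() calls. It is not the tool
used above; probe_fsync is a hypothetical helper name, and it assumes
the file still has dirty data that the kernel has been unable to write
out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path`, call fsync() `attempts` times, and
 * return how many of those calls failed. Returns -1 if the file cannot
 * be opened at all.
 */
int
probe_fsync(const char *path, int attempts)
{
    int fd;
    int failures = 0;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    for (int i = 0; i < attempts; i++)
    {
        if (fsync(fd) != 0)
        {
            /* Report each failure; errno is set by fsync(). */
            fprintf(stderr, "fsync attempt %d failed: %s\n",
                    i + 1, strerror(errno));
            failures++;
        }
    }

    close(fd);
    return failures;
}
```

If every attempt fails while the errors are being injected, the failure
count equals the number of attempts; on a healthy filesystem it should
be zero.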

As far as I can tell from reading the code, if a checkpoint fails the
checkpointer is supposed to keep all the outstanding fsync requests for
next time. Am I wrong, or is there some failure in the logic to do this?
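
To make the expectation concrete, here is a toy model of the behaviour I
would expect. This is not PostgreSQL's code; PendingSync,
process_pending, and fake_sync are names invented for illustration. The
point is only that a request is dropped from the pending set after its
fsync succeeds, and never before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued fsync request. */
typedef struct PendingSync
{
    int  file_id;   /* illustrative identifier, not a real relfilenode */
    bool pending;   /* true until a sync of this file has succeeded */
} PendingSync;

/* Callback that syncs one file; returns true on success. */
typedef bool (*sync_fn) (int file_id);

/* Test double: syncing `failing_file` fails, everything else succeeds. */
static int failing_file = -1;

static bool
fake_sync(int file_id)
{
    return file_id != failing_file;
}

/*
 * Attempt every pending request. A request is cleared only after its
 * sync succeeds, so a failed one is retried on the next call. Returns
 * true only if nothing is left pending, i.e. the "checkpoint" may
 * complete.
 */
static bool
process_pending(PendingSync *queue, size_t n, sync_fn do_sync)
{
    bool all_ok = true;

    assert(queue != NULL && do_sync != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (!queue[i].pending)
            continue;
        if (do_sync(queue[i].file_id))
            queue[i].pending = false;
        else
            all_ok = false;     /* keep the request for next time */
    }
    return all_ok;
}
```

Under this model, a second checkpoint issued while the error persists
would fail again instead of completing, which is the opposite of what
the session above shows.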

-- 
Andrew (irc:RhodiumToad)

