Re: Postgres, fsync, and OSs (specifically linux)

From: Simon Riggs
Subject: Re: Postgres, fsync, and OSs (specifically linux)
Msg-id: CANP8+jJETivNC9++X6-pGbNnvx03ppZCLua+djdvOtZnsFsjiw@mail.gmail.com
In reply to: Postgres, fsync, and OSs (specifically linux)  (Andres Freund <andres@anarazel.de>)
Responses: Re: Postgres, fsync, and OSs (specifically linux)  (Andres Freund <andres@anarazel.de>)
    Re: Postgres, fsync, and OSs (specifically linux)  (Craig Ringer <craig@2ndquadrant.com>)
    Re: Postgres, fsync, and OSs (specifically linux)  (Simon Riggs <simon@2ndquadrant.com>)
List: pgsql-hackers
On 27 April 2018 at 15:28, Andres Freund <andres@anarazel.de> wrote:

> - Add a pre-checkpoint hook that checks for filesystem errors *after*
>   fsyncing all the files, but *before* logging the checkpoint completion
>   record. Operating systems, filesystems, etc. all log the error format
>   differently, but for larger installations it'd not be too hard to
>   write code that checks their specific configuration.
>
>   While I'm a bit concerned adding user-code before a checkpoint, if
>   we'd do it as a shell command it seems pretty reasonable. And useful
>   even without concern for the fsync issue itself. Checking for IO
>   errors could e.g. also include checking for read errors - it'd not be
>   unreasonable to not want to complete a checkpoint if there'd been any
>   media errors.
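
For illustration only, the kind of site-specific check such a hook might
run could look roughly like the sketch below. Nothing here is an existing
PostgreSQL interface: the function names and the two message patterns are
hypothetical, and, as the quoted text notes, the actual log format depends
on the OS and filesystem, so each installation would need its own patterns.

```python
import re

# Hypothetical patterns: real kernel and filesystem messages differ by
# OS, filesystem, and version, so these must be adapted per installation.
IO_ERROR_PATTERNS = [
    re.compile(r"I/O error"),
    re.compile(r"EXT4-fs error"),
]


def find_io_errors(log_lines):
    """Return the log lines that match any configured error pattern."""
    return [
        line
        for line in log_lines
        if any(pattern.search(line) for pattern in IO_ERROR_PATTERNS)
    ]


def hook_exit_status(log_lines):
    """Return 0 when no error lines are found, 1 otherwise.

    This follows the usual shell convention, so a wrapper script could
    pass the value to sys.exit() and let the caller refuse to complete
    the checkpoint on a non-zero status.
    """
    return 1 if find_io_errors(log_lines) else 0
```

A wrapper would feed this the kernel log lines written since the previous
checkpoint and treat a non-zero status as "do not log checkpoint
completion".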

It seems clear that we need to evaluate our compatibility not just
with an OS, as we do now, but with an OS/filesystem.

Although people have suggested some approaches, I'm more interested in
discovering how we can be certain we got it right.

And the end result seems to be that PostgreSQL will be forced, in the
short term, to declare certain combinations of OS/filesystem
unsupported, with clear warnings sent out to users.

Adding a pre-checkpoint hook encourages people to fix this themselves
without reporting issues, so I initially oppose this until we have a
clearer argument as to why we need it. The answer is not to make this
issue more obscure, but to make it more public.

> - Use direct IO. Due to architectural performance issues in PG and the
>   fact that it'd not be applicable for all installations I don't think
>   this is a reasonable fix for the issue presented here. Although it's
>   independently something we should work on.  It might be worthwhile to
>   provide a configuration that allows to force DIO to be enabled for WAL
>   even if replication is turned on.

"Use DirectIO" is roughly the same suggestion as "don't trust Linux filesystems".

It would be a major admission of defeat for us to take that as our
main route to a solution.

The people I've spoken to so far have encouraged us to continue
working with the filesystem layer, and have supported our decision
to use filesystems.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

