Re: fsync reliability

From Daniel Farina
Subject Re: fsync reliability
Date
Msg-id BANLkTimahmfL+Hefeti_Do0Kv0CMh+dCiw@mail.gmail.com
In reply to fsync reliability  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Thu, Apr 21, 2011 at 1:26 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Daniel Farina points out to me that the Linux man page for fsync() says
> "Calling fsync() does not necessarily ensure that the entry in the
> directory containing the file has also reached disk.  For that an
> explicit fsync() on a file descriptor for the directory is also needed."
> http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html

I'd also like to point out that even on ext(2|3) there is a special
mount option, 'dirsync', and a matching directory attribute (see
'chattr'), that exist mostly for the benefit of the authors of MTAs,
which perform a lot of metadata manipulation operations. They make all
directory metadata changes synchronous, to get around non-durable
metadata manipulations. Even if you use fsync(), a crash between the
rename() and the fsync() will leave you in either the pre-move or the
post-move state: the rename is atomic but not durable. Synchronous
directory modification ensures that the return of rename() coincides
with the durability of the rename itself, or so I would think.
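
To make the rename()/fsync() window concrete, here is a minimal sketch
in Python of the "fsync the file, rename, then fsync the directory"
sequence that the man page wording implies. The helper name
durable_replace is hypothetical, and the sketch assumes a Linux-like
system where a directory can be opened read-only and passed to fsync().
It is an illustration, not what PostgreSQL itself does.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Atomically replace `path` with `data`, then fsync its directory.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write the new contents to a temporary file in the same directory,
    # so that the rename() below stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # file contents reach disk
        # Atomic: readers see the old file or the new one, never a mix.
        # Not yet durable: a crash here may still lose the new name.
        os.rename(tmp_path, path)
    except BaseException:
        # The rename did not happen, so remove the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # fsync the containing directory so the directory entry is flushed.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With 'dirsync' (or the per-directory attribute) in effect, the final
directory fsync() would presumably be redundant, since rename() itself
would not return until the change is on disk; without it, a crash
between the rename() and the directory fsync() is exactly the
non-durable window described above.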

I only found this while researching how to perform a two-phase commit
between postgres and the file system and reading the kernel source.
I admit, it's a dusty and obscure corner, but it still seems to be in
use by said MTAs.

Would a reading and exploration of the kernel code at hand perhaps
help resolve this discussion, one way or another?

--
fdr

