Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From Craig Ringer
Subject Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date
Msg-id CAMsr+YH8JP-UdsGt0dLMcDRx6WQ78BZA7kMgimu8+ZuB_uzyFQ@mail.gmail.com
In response to Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On 29 March 2018 at 20:07, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> On 28 March 2018 at 11:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>
>>> Craig Ringer <craig@2ndquadrant.com> writes:
>>> > TL;DR: Pg should PANIC on fsync() EIO return.
>>>
>>> Surely you jest.
>>
>> No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as
>> well to avoid similar lost-page-write issues.
>
> I found your discussion with kernel hacker Jeff Layton at
> https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
> writeup seems to want a scheme where pages stay dirty after a
> writeback failure so that we can try to fsync them again. Note that
> that has never been the case in Linux after hard writeback failures,
> AFAIK, so programs should definitely not assume that behavior."
>
> The article above that says the same thing a couple of different ways,
> ie that writeback failure leaves you with pages that are neither
> written to disk successfully nor marked dirty.
>
> If I'm reading various articles correctly, the situation was even
> worse before his errseq_t stuff landed.  That fixed cases of
> completely unreported writeback failures due to sharing of PG_error
> for both writeback and read errors with certain filesystems, but it
> doesn't address the clean pages problem.
>
> Yeah, I see why you want to PANIC.

In more ways than one ;)
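
For illustration only, here is a minimal sketch of what "PANIC rather than retry" could look like at the call site. This is not PostgreSQL's actual code: fsync_error_is_fatal() and fsync_or_abort() are hypothetical names, and treating exactly EIO and ENOSPC as fatal is an assumption taken from the discussion above. The point is that a second fsync() may report success even though the failed pages were never written, so the only safe reaction is to stop and recover from the WAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: errno values after which the durability of earlier
 * writes is unknown. The choice of EIO and ENOSPC is an assumption based on
 * this thread, not an exhaustive or authoritative list.
 */
static bool
fsync_error_is_fatal(int err)
{
    return err == EIO || err == ENOSPC;
}

/*
 * Hypothetical wrapper: flush fd to stable storage. On a fatal error, abort
 * instead of retrying, because the kernel may already have marked the failed
 * pages clean, in which case a retry could "succeed" without writing them.
 * Other errors (e.g. EBADF) are returned to the caller with errno preserved.
 */
static int
fsync_or_abort(int fd)
{
    int saved_errno;

    if (fsync(fd) == 0)
        return 0;

    saved_errno = errno;
    if (fsync_error_is_fatal(saved_errno))
    {
        fprintf(stderr, "PANIC: fsync failed: %s\n", strerror(saved_errno));
        abort();
    }

    errno = saved_errno;
    return -1;
}
```

A caller would use fsync_or_abort(fd) wherever it currently loops on fsync() until it succeeds; the abort() stands in for whatever crash-and-recover mechanism the server really uses.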

>> I'm not seeking to defend what the kernel seems to be doing. Rather, saying
>> that we might see similar behaviour on other platforms, crazy or not. I
>> haven't looked past linux yet, though.
>
> I see no reason to think that any other operating system would behave
> that way without strong evidence...  This is openly acknowledged to be
> "a mess" and "a surprise" in the Filesystem Summit article.  I am not
> really qualified to comment, but from a cursory glance at FreeBSD's
> vfs_bio.c I think it's doing what you'd hope for... see the code near
> the comment "Failed write, redirty."

Ok, that's reassuring, but doesn't help us on the platform the great majority of users deploy on :(

"If on Linux, PANIC"

Hrm.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
