Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Joshua D. Drake
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Msg-id: 5708c218-4c0a-0690-bfbb-30c2df7845b1@commandprompt.com
In reply to: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS (Mark Dilger <hornschnorter@gmail.com>)
List: pgsql-hackers
On 04/09/2018 09:45 AM, Robert Haas wrote:
> On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> In the mean time, I propose that we fsync() on close() before we age FDs out
>> of the LRU on backends. Yes, that will hurt throughput and cause stalls, but
>> we don't seem to have many better options. At least it'll only flush what we
>> actually wrote to the OS buffers not what we may have in shared_buffers. If
>> the bgwriter does the same thing, we should be 100% safe from this problem
>> on 4.13+, and it'd be trivial to make it a GUC much like the fsync or
>> full_page_writes options that people can turn off if they know the risks /
>> know their storage is safe / don't care.
> I have a really tough time believing this is the right way to solve
> the problem.  We suffered for years because of ext3's desire to flush
> the entire page cache whenever any single file was fsync()'d, which
> was terrible.  Eventually ext4 became the norm, and the problem went
> away.  Now we're going to deliberately insert logic to do a very
> similar kind of terrible thing because the kernel developers have
> decided that fsync() doesn't have to do what it says on the tin?  I
> grant that there doesn't seem to be a better option, but I bet we're
> going to have a lot of really unhappy users if we do this.

I don't have a better option, but whatever we do, it should be an optional
(GUC) change. We have had plenty of YEARS of people not noticing this issue,
and Robert's correct: if we go back to an era of things like stalls, it is
going to look bad on us no matter how we describe the problem.

Thanks,

JD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****
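
For readers following the thread, the idea Craig Ringer proposes in the quoted text (fsync() a file descriptor before it ages out of the backend's LRU, with a setting to turn that off) can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code (PostgreSQL is written in C), and every name in it (FdCache, fsync_on_evict, demo) is hypothetical. It assumes that flushing while the descriptor is still open is what lets a write-back error be reported to the process that wrote the data.

```python
import os
import tempfile
from collections import OrderedDict


class FdCache:
    """Tiny LRU cache of open file descriptors (illustrative only)."""

    def __init__(self, capacity, fsync_on_evict=True):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Hypothetical switch standing in for the optional GUC discussed above.
        self.fsync_on_evict = fsync_on_evict
        self._fds = OrderedDict()  # path -> fd, least recently used first
        self.errors = []           # (path, OSError) pairs seen while flushing

    def get(self, path):
        """Return an open fd for path, evicting the oldest fd if full."""
        if path in self._fds:
            self._fds.move_to_end(path)  # mark as most recently used
            return self._fds[path]
        while len(self._fds) >= self.capacity:
            self._evict_oldest()
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self._fds[path] = fd
        return fd

    def _evict_oldest(self):
        path, fd = self._fds.popitem(last=False)
        try:
            if self.fsync_on_evict:
                # Flush before close so an error is seen while we still hold the fd.
                os.fsync(fd)
        except OSError as exc:
            # Record the failure instead of silently losing it.
            self.errors.append((path, exc))
        finally:
            os.close(fd)

    def close_all(self):
        while self._fds:
            self._evict_oldest()


def demo():
    """Open three files through a two-slot cache and report what stayed open."""
    with tempfile.TemporaryDirectory() as d:
        cache = FdCache(capacity=2)
        for i in range(3):
            os.write(cache.get(os.path.join(d, f"seg{i}")), b"data")
        still_open = [os.path.basename(p) for p in cache._fds]
        cache.close_all()
        return still_open, cache.errors
```

The extra os.fsync() call on every eviction is the throughput cost and stall risk that Robert Haas and this reply are worried about, which is why the sketch keeps it behind a flag.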


