Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

From Magnus Hagander
Subject Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)
Date
Msg-id CABUevEzK3jzWyOX-2GwnVC-efPJm6_2_UZ=dxEzUs-rujw=1bg@mail.gmail.com
In reply to Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers

On Fri, Sep 2, 2016 at 8:50 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Sep 2, 2016 at 2:20 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> > On 5/13/16 2:39 AM, Michael Paquier wrote:
> >> So, attached are two patches that apply on HEAD to address the problem
> >> of pg_basebackup that does not sync the data it writes. As
> >> pg_basebackup cannot use directly initdb -S because, as a client-side
> >> utility, it may be installed while initdb is not (see Fedora and
> >> RHEL), I have refactored the code so as the routines in initdb.c doing
> >> the fsync of PGDATA and other fsync stuff are in src/fe_utils/, and
> >> this is 0001.
> >
> > Why fe_utils?  initdb is not a front-end program.
>
> Thinking about that, you are right. Let's move it to src/common,
> frontend-only though.
>
> >> Patch 0002 is a set of fixes for pg_basebackup:
> >> - In plain mode, fsync_pgdata is used so as all the tablespaces are
> >> fsync'd at once. This takes care as well of the case where pg_xlog is
> >> a symlink.
> >> - In tar mode (no stdout), each tar file is synced individually, and
> >> the base directory is synced once at the end.
> >> In both cases, failures are not considered fatal.
> >
> > Maybe there should be --nosync options like initdb has?
>
> What do others think about that? I could implement that on top of 0002
> with some extra options. But to be honest that looks to be just some
> extra sugar for what is basically a bug fix... And I am feeling that
> providing such a switch to users would be a way for one to shoot
> himself badly, particularly for pg_receivexlog where a crash can cause
> segments to go missing.


Well, why do we provide a --nosync option for initdb? Wouldn't the argument basically be the same? 

I agree it kind of feels like overkill, but it would be consistent overkill? :)
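
For readers following along, the tar-mode behavior quoted above (sync each tar file, then sync the base directory once, and do not treat failures as fatal) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the 0001/0002 patches: the names sync_path and sync_tar_backup are hypothetical, the backup is assumed to be a flat directory of tar files, and failures are simply counted and reported on stderr.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush one file or directory to stable storage.
 * Returns 0 on success, -1 on failure. Errors are reported on stderr and
 * left to the caller, so a failed sync is never fatal by itself.
 */
int
sync_path(const char *path, int is_dir)
{
	int			fd;
	int			rc;

	assert(path != NULL);

	/* Directories can only be opened read-only; files are opened read-write. */
	fd = open(path, is_dir ? O_RDONLY : O_RDWR);
	if (fd < 0)
	{
		fprintf(stderr, "could not open \"%s\": %s\n", path, strerror(errno));
		return -1;
	}

	rc = fsync(fd);

	/* Some platforms reject fsync() on a directory; do not count that. */
	if (rc != 0 && is_dir && (errno == EBADF || errno == EINVAL))
		rc = 0;

	if (rc != 0)
		fprintf(stderr, "could not fsync \"%s\": %s\n", path, strerror(errno));

	close(fd);
	return rc;
}

/*
 * Hypothetical sketch of the tar-mode sequence: sync every tar file in
 * basedir, then sync basedir itself so the directory entries are durable
 * too. Returns the number of paths that could not be synced.
 */
int
sync_tar_backup(const char *basedir, const char *const *files, int nfiles)
{
	char		path[4096];
	int			failures = 0;
	int			i;

	assert(basedir != NULL);

	for (i = 0; i < nfiles; i++)
	{
		snprintf(path, sizeof(path), "%s/%s", basedir, files[i]);
		if (sync_path(path, 0) != 0)
			failures++;
	}

	/* One directory sync at the end covers all newly created entries. */
	if (sync_path(basedir, 1) != 0)
		failures++;

	return failures;
}
```

A caller would pass the backup directory and the list of tar file names it wrote, then decide what to do with a non-zero return value. A --nosync style switch, if one were added, would presumably just skip this call.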
