Re: During promotion, new master tries to archive same segment twice

Поиск
Список
Период
Сортировка
От David Steele
Тема Re: During promotion, new master tries to archive same segment twice
Дата
Msg-id eb694531-84db-f08f-4a45-74469163a8e3@pgmasters.net
обсуждение исходный текст
Ответ на Re: During promotion, new master tries to archive same segmenttwice  ("Phil Endecott" <spam_from_pgsql_lists@chezphil.org>)
Список pgsql-general
On 8/16/18 4:37 AM, Phil Endecott wrote:
> David Steele wrote:
>> On 8/15/18 4:25 PM, Phil Endecott wrote:
>>> - Should my archive_command detect the case where it is asked to
>>> write the same file again with the same contents, and report success
>>> in that case?
>>
>> Yes.
> 
>> There are a number of cases where the same WAL
>> segment can be pushed more than once, especially after failures where
>> Postgres is not sure that the command completed.  The archive command
>> should handle this gracefully.
> 
> Hmm, OK.  Here's what the current docs say:
> 
> Section 25.3.1:
> 
> "The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file. This is an important
> safety feature to preserve the integrity of your archive in case
> of administrator error (such as sending the output of two
> different servers to the same archive directory).
> 
> It is advisable to test your proposed archive command to ensure
> that it indeed does not overwrite an existing file, and that it
> returns nonzero status in this case."
> 
> And section 26.2.9:
> 
> "When continuous WAL archiving is used in a standby, there
> are two different scenarios: the WAL archive can be shared
> between the primary and the standby, or the standby can
> have its own WAL archive.  When the standby has its own WAL
> archive, set archive_mode to always, and the standby will call
> the archive command for every WAL segment it receives, whether
> it's by restoring from the archive or by streaming replication.
> The shared archive can be handled similarly, but the
> archive_command must test if the file being archived exists
> already, and if the existing file has identical contents.
> This requires more care in the archive_command, as it must be
> careful to not overwrite an existing file with different contents,
> but return success if the exactly same file is archived twice.
> And all that must be done free of race conditions, if two
> servers attempt to archive the same file at the same time."
> 
> So you're saying that that's wrong, and that I must always
> handle the case when the same WAL segment is written twice.

Seems like an omission in section 25.3.1 rather than a problem in 26.2.9.

Duplicate WAL is possible in *all* cases.  A trivial example is that
Postgres calls archive_command and it succeeds but an error happens
(e.g. network) right before Postgres is notified.  It will wait a bit
and try the same WAL segment again.

> I'll file a bug against the documentation.

OK.

>> pgBackRest has done this for years and it saves a *lot* of headaches.
> 
> The system to which I am sending the WAL files is a rsync.net
> account.  I use it because of its reliability, but methods for
> transferring files are limited largely to things like scp and
> rsync.

Rsync and scp are not good tools to use for backup because there is no
guarantee of durability, i.e. the file is not synced to disk before
success is returned.  rsync.net may have durability guarantees but you
should verify that with them.

Even so, crafting a safe archive_command using these tools is going to
be very tricky.

Regards,
-- 
-David
david@pgmasters.net


В списке pgsql-general по дате отправления:

Предыдущее
От: Andreas Joseph Krogh
Дата:
Сообщение: Logical replication from standby
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: During promotion, new master tries to archive same segment twice