Re: Streaming replication and WAL archive interactions

From: Heikki Linnakangas
Subject: Re: Streaming replication and WAL archive interactions
Msg-id: 549083D6.1000301@vmware.com
In reply to: Re: Streaming replication and WAL archive interactions (Borodin Vladimir <root@simply.name>)
Responses: Re: Streaming replication and WAL archive interactions (Fujii Masao <masao.fujii@gmail.com>)
List: pgsql-hackers

On 12/16/2014 10:24 AM, Borodin Vladimir wrote:
> On 12 Dec 2014, at 16:46, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>
>> There have been a few threads on the behavior of WAL archiving,
>> after a standby server is promoted [1] [2]. In short, it doesn't
>> work as you might expect. The standby will start archiving after
>> it's promoted, but it will not archive files that were replicated
>> from the old master via streaming replication. If those files were
>> not already archived in the master before the promotion, they are
>> not archived at all. That's not good if you wanted to restore from
>> a base backup + the WAL archive later.
>>
>> The basic setup is a master server, a standby, a WAL archive that's
>> shared by both, and streaming replication between the master and
>> standby. This should be a very common setup in the field, so how
>> are people doing it in practice? Just live with the risk that you
>> might miss some files in the archive if you promote? Don't even
>> realize there's a problem? Something else?
>
> Yes, I do live like that (with streaming replication and shared
> archive between master and replicas) and don’t even realize there’s a
> problem :( And I think I’m not the only one. Maybe at least a note
> should be added to the documentation?

Let's try to figure out a way to fix this in master, but yeah, a note in 
the documentation is in order.

>> And how would we like it to work?

Here's a plan:

Have a mechanism in the standby, to track how far the master has 
archived its WAL, and don't throw away WAL in the standby that hasn't 
been archived in the master yet. This is similar to the physical 
replication slots, which prevent the master from recycling WAL that a 
standby hasn't received yet, but in reverse. I think we can use the 
.done and .ready files for this. Whenever a file is streamed 
(completely) from the master, create a .ready file for it. When we get 
an acknowledgement from the master that it has archived it, create a 
.done file for it. To get the information from the master, add the "last 
archived WAL segment" e.g. in the streaming replication keep-alive 
message, or invent a new message type for it.
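
To make the bookkeeping concrete, here is a rough sketch in Python of what
the standby side might do. This is only an illustration, not proposed
server code: the function names mark_streamed and mark_archived_up_to are
made up, and it assumes the master reports a single "last archived" segment
name and that segments on one timeline are archived in order.

```python
import os


def mark_streamed(status_dir, segment):
    """A segment was completely streamed: flag it as not yet archived.

    Creates <segment>.ready unless the segment is already marked .done.
    """
    done = os.path.join(status_dir, segment + ".done")
    ready = os.path.join(status_dir, segment + ".ready")
    if not os.path.exists(done):
        open(ready, "w").close()


def mark_archived_up_to(status_dir, last_archived):
    """The master reported its last archived segment (hypothetical message).

    Renames .ready to .done for every segment on the same timeline up to
    and including last_archived. Segment names are 24 fixed-width uppercase
    hex digits, so plain string comparison matches their numeric order
    within one timeline.
    """
    timeline = last_archived[:8]
    for name in sorted(os.listdir(status_dir)):
        if not name.endswith(".ready"):
            continue
        segment = name[: -len(".ready")]
        # Skip timeline history and backup label entries; only plain
        # 24-character segment names on the reported timeline qualify.
        if len(segment) != 24 or segment[:8] != timeline:
            continue
        if segment <= last_archived:
            os.rename(
                os.path.join(status_dir, name),
                os.path.join(status_dir, segment + ".done"),
            )
```

Anything still marked .ready at promotion is then exactly the set of files
the standby has to archive itself.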

At promotion, archive all the WAL from the old timeline that the master 
hadn't already archived. While doing this, the archive_command can be 
called for files that have in fact already been archived in the master, 
so the command needs to return success if it's asked to archive a file 
and an identical file already exists in the archive. That's a bit 
difficult to write into a one-liner, but hopefully we can still provide 
an example of this. Or have another command, e.g. 
"promotion_archive_command", which can just assume that everything is OK 
if the file already exists.
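
As a sketch of what such an archive command would have to do, something
along these lines (Python rather than a shell one-liner, and purely an
illustration). The function name archive_segment and the ".tmp" suffix are
my own choices here, and it assumes the archive is a plain directory on a
filesystem that supports hard links.

```python
import filecmp
import os
import shutil


def archive_segment(src, archive_dir):
    """Copy src into archive_dir; return 0 on success, 1 on conflict.

    An identical file already present in the archive counts as success.
    A different file with the same name is an error and is left untouched.
    """
    dest = os.path.join(archive_dir, os.path.basename(src))
    if not os.path.exists(dest):
        tmp = dest + ".tmp"
        shutil.copyfile(src, tmp)
        try:
            # link() fails if dest has appeared in the meantime, so a
            # concurrent archiver's file is never overwritten.
            os.link(tmp, dest)
            return 0
        except FileExistsError:
            pass  # fall through and compare contents below
        finally:
            os.unlink(tmp)
    # Byte-by-byte comparison; shallow=False avoids trusting size/mtime.
    return 0 if filecmp.cmp(src, dest, shallow=False) else 1
```

A small wrapper script that calls this with the segment path and the
archive directory, and exits with the returned value, could then be used
as the archive_command.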

To enable this new mode, let's add a third option to archive_mode, 
besides on/off. Or just make this the default; I'm not sure if anyone 
would want the old behavior.

>> There was some discussion in August on enabling WAL archiving in
>> the standby, always [3]. That's a related idea, but it assumes that
>> you have a separate archive in the master and the standby. The
>> problem at promotion happens when you have a shared archive between
>> the master and standby.
>
> AFAIK most people use the scheme with shared archive.

Yeah. Anyway, we can support both scenarios.

- Heikki



