Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs

From Chris Travers
Subject Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs
Date
Msg-id CAN-RpxB1jJ5pF5G3cbTe1BmXZx_2JOZLMCHW9zEo+t2XDSRhJQ@mail.gmail.com
In response to Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers


On Mon, Oct 30, 2017 at 11:36 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Mon, Oct 30, 2017 at 10:15 AM, Chris Travers
<chris.travers@adjust.com> wrote:
> This also brings up a fairly major concern more generally about control by
> the way.  A lot of cases where pg_rewind is called, the user doesn't
> necessarily have much control on how it is called.  Moreover in many of
> these cases, the user is probably not in a position to understand the
> internals well enough to grasp what to check after.

Likely they are not.

So for now I am submitting the current patch to the commitfest.  I am open to adding a new
--include-path option to this patch, but I am worried about atomicity, and I am still
not sure what the impact is for you (I haven't heard whether you expect this file to be
there before the rewind, i.e. whether it would be on both master and slave, or just on the slave).

So there is the question of under what specific circumstances this would break for you.

> Right, but there is a use case difference between "I am taking a backup of a
> server" and "I need to get the server to rejoin replication as a
> standby."

The intersection of the first and second categories is not empty. Some
take backups and use them to deploy standbys.

Sure, but it is not complete either. 

> A really good example of where this is a big problem is with replication
> slots.  On a backup I would expect you want replication slots to be copied
> in.

I would actually expect the contrary, and please note that replication
slots are not taken in a base backup, which is what the documentation
says as well:
https://www.postgresql.org/docs/10/static/protocol-replication.html
"pg_dynshmem, pg_notify, pg_replslot, pg_serial, pg_snapshots,
pg_stat_tmp, and pg_subtrans are copied as empty directories (even if
they are symbolic links)."

Many of these are emptied on server restart, but pg_replslot is not.  What is the logic
for excluding it from backups, other than to avoid problems with replication provisioning?

I mean, if I have a replication slot for taking backups with barman (and I am not actually doing replication), why would I *not* want that in my base backup if I might have to restore to a new machine somewhere?

Some code I have for 9.6's pg_basebackup manually removes replication
slots, as this logic is new in 10 ;)

>> The pattern that base backups now use is an exclude list. In the
>> future I would rather see base backups and pg_rewind using the same
>> infrastructure for both things:
>> - pg_rewind should use the replication protocol with BASE_BACKUP.
>> Having it rely on root access now is crazy.
>> - BASE_BACKUP should have an option where it is possible to exclude
>> custom paths.
>> What you are proposing here would make both diverge more, which is in
>> my opinion not a good approach.
>
> How do repmgr or other programs using pg_rewind know what to exclude?

Good question. Answers could come from folks such as David Steele
(pgBackRest) or Marco (barman) whom I am attaching in CC.

Two points that further occur to me:

Shell globs seem to me to be footguns in this area. I think special paths should be given one path per invocation of the option, not as "--exclude=pg_w*", since that is just asking for problems as time goes on and things get renamed.

It also seems to me that adding specific paths is far safer than removing specific paths. 
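
A rough Python sketch of both points, for illustration only. The directory names `pg_wombat` and `my_extension_state` are invented, and `select_by_exclude` and `select_by_include` are hypothetical helpers, not pg_rewind code.

```python
from fnmatch import fnmatch

def top_level(path):
    """Return the first component of a relative path."""
    return path.split("/", 1)[0]

def select_by_exclude(paths, exclude_globs):
    """Deny-list: everything is synced unless a glob matches its top-level dir."""
    return [p for p in paths
            if not any(fnmatch(top_level(p), g) for g in exclude_globs)]

def select_by_include(paths, include_dirs):
    """Allow-list: only exactly named top-level directories are synced."""
    return [p for p in paths if top_level(p) in include_dirs]

paths = [
    "base/1/1259",
    "pg_wal/000000010000000000000001",
    "pg_wombat/data",          # invented: a directory added in some later release
    "my_extension_state/log",  # invented: a file the tool knows nothing about
]

# The glob meant for pg_wal also swallows pg_wombat, while the unknown
# directory is still synced because nothing excluded it.
excluded = select_by_exclude(paths, ["pg_w*"])

# Explicit paths only pick up what was named, nothing more.
included = select_by_include(paths, {"base", "pg_wal"})
```

With the exclude glob, a future directory that happens to match is silently dropped and an unknown one is silently copied. With explicit include paths, anything not named is simply left alone.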
--
Michael



--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin
