Re: Do you see any problems with this procedure for Old Master rebuild as a Slave upon switchover ?

From Kyotaro HORIGUCHI
Subject Re: Do you see any problems with this procedure for Old Master rebuild as a Slave upon switchover ?
Msg-id 20190507.124325.10578704.horiguchi.kyotaro@lab.ntt.co.jp
In reply to Do you see any problems with this procedure for Old Master rebuild as a Slave upon switchover ?  (Avinash Kumar <avinash.vallarapu@gmail.com>)
List pgsql-hackers
Hello.

At Mon, 29 Apr 2019 00:28:31 -0300, Avinash Kumar <avinash.vallarapu@gmail.com> wrote in
<CAN0Tujf0JJnC8RqbqBLX1xez9SGDSZhi6oc3JhB06WcDk1TTAQ@mail.gmail.com>
> Hi Team,
> 
> Let us say we have a Master (M1) and a Slave (S1) in replication using
> Streaming Replication.
> 
> I stopped all writes from the application, switched a WAL segment, and made
> sure it was replicated to the Slave.
> I then shut down M1 and ran a promote on S1.
> Now S1 is my new Master with a new timeline.
> 
> Now, in order to let M1 replicate changes from S1 (Master) as a Slave, I am
> able to succeed with the following approach.
> 
> Add recovery_target_timeline = 'latest' and then have the appropriate
> entries such as primary_conninfo, standby_mode in the recovery.conf and
> start the M1 using pg_ctl.
> 
> I see that M1 (Old Master) is able to catch up with S1 (New Master), and
> replication is going fine.
> Have you ever faced, or can you think of, a problem with this approach ?
> 
> Points to note are :
> 1. Master was cleanly SHUT DOWN after stopping writes. So, it has not
> diverged. (If it had diverged, I would of course need a pg_rewind like
> approach).
> 2. It was a planned switchover. During this entire process, there are no
> writes to M1 (before Switchover) or S1 (after promote).

No normal backends remain at the time of the final checkpoint,
and walsender terminates after the final checkpoint and
archiving are done. So that is assured by design, as long as
nothing goes wrong elsewhere.
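
One way to confirm that the old master really stopped cleanly is to
read the output of pg_controldata on its data directory. The sketch
below only parses that text; the helper names and the SAMPLE values
are made up for illustration, and the field labels are assumed to
match what pg_controldata prints on your version.

```python
def parse_controldata(text):
    """Turn pg_controldata output into a {label: value} dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as timestamps
        # that contain colons themselves stay intact.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def was_cleanly_shut_down(fields):
    """True if the cluster state recorded in pg_control is 'shut down'."""
    return fields.get("Database cluster state") == "shut down"


# Illustrative excerpt only; the values are invented for this example.
SAMPLE = """\
Database cluster state:               shut down
Latest checkpoint location:           0/3000028
Latest checkpoint's TimeLineID:       1
"""

fields = parse_controldata(SAMPLE)
print(was_cleanly_shut_down(fields), fields["Latest checkpoint location"])
```

A state other than "shut down" (for example "in production") would
suggest the server did not go through its final checkpoint.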

> 3. timeline history file is also accessible to the Old Master (M1) after S1
> was promoted. No transactions, so no WALs generated, may be 1 or 2
> considering timeout, etc.

Note that no transactions doesn't mean no WAL. There are WAL
records that originate from things other than transaction
activity, such as RUNNING_XACTS. (This doesn't contradict the
discussion above.)
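
To see whether WAL was written between two points in time, one can
compare two WAL positions (LSNs), for example values returned by
pg_current_wal_lsn() on PostgreSQL 10 or later. The following is a
minimal sketch that only does the arithmetic; the function names are
hypothetical and the LSN values are invented.

```python
def lsn_to_int(lsn):
    """Convert an LSN written as 'hi/lo' (two hex numbers) to an integer."""
    hi, lo = lsn.split("/")
    # The part before the slash is the high 32 bits of the 64-bit position.
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_bytes_between(earlier, later):
    """Number of WAL bytes between two LSNs (negative if 'later' is older)."""
    return lsn_to_int(later) - lsn_to_int(earlier)


# Invented sample positions, e.g. taken a minute apart with no
# application transactions running.
print(wal_bytes_between("0/3000028", "0/3000060"))
```

A non-zero result with no application writes in between would be an
instance of the point above.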

> It looks like a clean approach, but do you think there could be a problem
> with this approach of rebuilding Old Master as a Slave ? Is this approach
> still okay ?

It's just my personal view, but I don't fully trust that
assumption. pg_rewind does nothing if the two servers haven't
diverged, so I see no reason to hesitate to run pg_rewind to
make sure the new standby can safely be used as-is in this
case. (Note that until the first checkpoint after the standby's
promotion finishes, pg_rewind misdiagnoses the new master as
being on the same timeline as the old master.)
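
As a rough sketch of that order of operations: issue a CHECKPOINT on
the new master first, then run pg_rewind against the stopped old
master. The code below only builds the command lines and does not
run them. The function names, the data directory path and the
connection string are assumptions for illustration; pg_rewind also
needs wal_log_hints or data checksums enabled on the target.

```python
def checkpoint_command(source_conninfo):
    """psql invocation that forces a checkpoint on the new master."""
    return ["psql", "-d", source_conninfo, "-c", "CHECKPOINT"]


def rewind_command(target_pgdata, source_conninfo, dry_run=True):
    """pg_rewind invocation for the old master's (stopped) data directory."""
    cmd = [
        "pg_rewind",
        "--target-pgdata=" + target_pgdata,
        "--source-server=" + source_conninfo,
    ]
    if dry_run:
        # Report what would be done without modifying the target.
        cmd.append("--dry-run")
    return cmd


# Hypothetical values; adjust to the actual environment.
NEW_MASTER = "host=s1 port=5432 user=postgres dbname=postgres"
OLD_PGDATA = "/var/lib/pgsql/data"

steps = [
    checkpoint_command(NEW_MASTER),           # 1. checkpoint on new master
    rewind_command(OLD_PGDATA, NEW_MASTER),   # 2. dry run first
    rewind_command(OLD_PGDATA, NEW_MASTER, dry_run=False),
]
for step in steps:
    print(" ".join(step))
    # To actually execute: subprocess.run(step, check=True)
```

Running the dry run first shows whether pg_rewind considers a rewind
necessary before anything on the old master is touched.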

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



