Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave

From Michael Paquier
Subject Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Date
Msg-id CAB7nPqQF_F5eJ7iQM9BW-Au6061CH5osL0J4HmsHFazCwgoKQA@mail.gmail.com
In reply to Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave  (Andres Freund <andres@2ndquadrant.com>)
Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Fri, Jan 18, 2013 at 3:05 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
I encountered the problem that the timeline switch is not performed expectedly.
I set up one master, one standby and one cascade standby. All the servers
share the archive directory. restore_command is specified in the recovery.conf
in those two standbys.

I shut down the master, and then promoted the standby. In this case, the
cascade standby should switch to new timeline and replication should be
successfully restarted. But the timeline was never changed, and the following
log messages were kept outputting.

sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG:  replication terminated by primary server
sby2 DETAIL:  End of WAL reached on timeline 1
sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG:  replication terminated by primary server
sby2 DETAIL:  End of WAL reached on timeline 1
sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1
sby2 LOG:  replication terminated by primary server
sby2 DETAIL:  End of WAL reached on timeline 1
I am seeing similar issues with master at 88228e6.
This is easily reproducible: set up 2 slaves under a master, then kill the master. Promote slave 1 and reconnect slave 2 to slave 1, and you will notice that the timeline jump is not done.
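
For anyone checking a standby log for this symptom, here is a minimal sketch (not part of PostgreSQL) that counts how often streaming restarts at the same position on the same timeline. The function name find_stuck_restarts and the threshold of 3 are assumptions made for illustration; the pattern only matches the log line format quoted above.

```python
import re
from collections import Counter

# Matches the standby log line quoted in this thread, e.g.
#   "sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1"
RESTART_RE = re.compile(
    r"restarted WAL streaming at ([0-9A-Fa-f]+/[0-9A-Fa-f]+) on timeline (\d+)"
)

# Log excerpt copied from the report above.
SAMPLE_LOG = [
    "sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1",
    "sby2 LOG:  replication terminated by primary server",
    "sby2 DETAIL:  End of WAL reached on timeline 1",
    "sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1",
    "sby2 LOG:  replication terminated by primary server",
    "sby2 DETAIL:  End of WAL reached on timeline 1",
    "sby2 LOG:  restarted WAL streaming at 0/3000000 on timeline 1",
    "sby2 LOG:  replication terminated by primary server",
    "sby2 DETAIL:  End of WAL reached on timeline 1",
]


def find_stuck_restarts(log_lines, threshold=3):
    """Return {(lsn, timeline): count} for restart points seen >= threshold times.

    threshold is an arbitrary illustrative cutoff, not a PostgreSQL setting.
    """
    counts = Counter()
    for line in log_lines:
        match = RESTART_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    # A repeated (position, timeline) pair suggests the standby never
    # switched to the new timeline.
    print(find_stuck_restarts(SAMPLE_LOG))
```

A pair that keeps repeating, such as 0/3000000 on timeline 1 here, means the standby is retrying the old timeline and has not followed the promoted server.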

I don't know whether Masao tried to make the slave that reconnects to the promoted slave synchronous, but in that case slave2 gets stuck in the "potential" state. That is due to the timeline not having changed on slave2, but I thought it better to let you know...

The replication delays are still here.
--
Michael Paquier
http://michael.otacoo.com
