Re: Replication failure, slave requesting old segments

From: Phil Endecott
Subject: Re: Replication failure, slave requesting old segments
Msg-id: 1534101938762@dmwebmail.dmwebmail.chezphil.org
In reply to: Re: Replication failure, slave requesting old segments  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses: Re: Replication failure, slave requesting old segments  (Adrian Klaver <adrian.klaver@aklaver.com>)
List: pgsql-general

Hi Adrian,

Adrian Klaver wrote:
> On 08/11/2018 12:42 PM, Phil Endecott wrote:
>> Hi Adrian,
>> 
>> Adrian Klaver wrote:
>>> Looks like the master recycled the WAL's while the slave could not 
>>> connect.
>> 
>> Yes but... why is that a problem?  The master is copying the WALs to
>> the backup server using scp, where they remain forever.  The slave gets
>
> To me it looks like that did not happen:
>
> 2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
> "0000000100000007000000D0" from archive
> scp: backup/postgresql/archivedir/0000000100000007000000D1: No such file 
> or directory
> 2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from 
> primary at 7/D0000000 on timeline 1
> 2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from 
> WAL stream: ERROR:  requested WAL segment 0000000100000007000000D0 has 
> already been removed
>
> Above 0000000100000007000000D0 is gone/recycled on the master and the 
> archived version does not seem to be complete as the streaming 
> replication is trying to find it.

The files on the backup server were all 16 MB.
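
A size check like the one described above could be scripted roughly as follows. This is only a sketch: it assumes the default 16 MB WAL segment size, and the function name and the directory passed to it are hypothetical. A file of the right size is not proof that its contents are complete; it only rules out truncated copies.

```python
import os

# Assumption: the cluster uses the default WAL segment size of 16 MB.
SEGMENT_SIZE = 16 * 1024 * 1024

HEX_DIGITS = set("0123456789ABCDEF")


def wrong_sized_segments(archive_dir):
    """Return (name, size) for archived WAL segments that are not SEGMENT_SIZE bytes."""
    bad = []
    for name in sorted(os.listdir(archive_dir)):
        # WAL segment file names are exactly 24 upper-case hex digits;
        # skip anything else (for example .history or .backup files).
        if len(name) != 24 or not set(name) <= HEX_DIGITS:
            continue
        size = os.path.getsize(os.path.join(archive_dir, name))
        if size != SEGMENT_SIZE:
            bad.append((name, size))
    return bad
```

Calling `wrong_sized_segments("backup/postgresql/archivedir")` on the backup server would return an empty list if every archived segment has the expected size.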


> Below you kick the master and it coughs up the files to the archive 
> including *D0 and *D1 on up to *D4 and then the streaming picks using *D5.

When I kicked it, the master wrote D1 to D4 to the backup.  It did not
change D0 (its modification time on the backup is from before the "kick").
The slave re-read D0, again, as it had been doing throughout this period,
and then read D1 to D4.


> Best guess is the archiving did not work as expected during:
>
> "(During this time the master was also down for a shorter period.)"

Around the time the master was down, the WAL segment names were CB and CC.
Files CD to CF were written between the master coming up and the slave
coming up.  The slave had no trouble restoring those segments when it started.
The problematic segments D0 and D1 were the ones that were "current" when
the slave restarted, at which time the master was up consistently.
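
For reference, the relationship between the segment file names used in this thread and the position shown in the log ("7/D0000000") can be sketched as below. This assumes the default 16 MB segment size; the function names are made up for illustration.

```python
# Assumption: default 16 MB WAL segments, so each 4 GB "log" holds 256 segments.
SEGMENT_SIZE = 16 * 1024 * 1024


def parse_segment(name):
    """Split a 24-hex-digit WAL file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log, seg


def segment_start_lsn(name):
    """Return the LSN at which the segment starts, formatted like the server logs."""
    _, log, seg = parse_segment(name)
    offset = seg * SEGMENT_SIZE  # byte offset of the segment within its log
    return f"{log:X}/{offset:X}"


print(segment_start_lsn("0000000100000007000000D0"))  # prints 7/D0000000
```

So the "started streaming WAL from primary at 7/D0000000" message refers to the very beginning of segment D0 on timeline 1, which is the segment the master then reports as already removed.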


Regards, Phil.





