Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

From	Amit Kapila
Subject	Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly
Date	15 November 2016, 14:23:24
Msg-id	CAA4eK1LDV=MmLpvPGVTKixEnwgUoKrscA4cL6F2Aei6R4y0mXA@mail.gmail.com
In reply to	Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
Responses	Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly
List	pgsql-hackers

On Tue, Nov 15, 2016 at 7:51 AM, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Amit Kapila
>> It looks like the code in 9.3 or later version uses the recptr as the target
>> segment location
>> (targetSegmentPtr) whereas 9.2 uses recptr as beginning of segment (readOff
>> = 0;).  If above understanding is right then it will set different values
>> for latestPagePtr in 9.2 and 9.3 onwards code.
>>
>
> In 9.2, the relevant variable is not recptr but recaddr.  recaddr in 9.2
> and recptr in later releases point to the beginning of a page just read,
> which is not always the beginning of the segment (targetSegmentPtr).
>

I think it is the beginning of the segment (aka the first page of the
segment); even the comment indicates the same.

/*
* Whenever switching to a new WAL segment, we read the first page of
* the file and validate its header, even if that's not where the
* target record is. ...
..
*/

However, on looking at the code again, it seems that this part of the
code behaves similarly in both 9.2 and 9.3.
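
To make the two candidate positions concrete, here is a minimal sketch of
the arithmetic behind "beginning of the page" versus "beginning of the
segment".  It is not the actual xlog reader code: the sketch_* names are
hypothetical, and the 16MB segment size and 8kB page size are assumptions
(the usual defaults), not values taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes (common defaults); a real build may differ. */
#define SKETCH_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)
#define SKETCH_BLCKSZ   ((uint64_t) 8192)

/* Hypothetical stand-in for a WAL location (byte position in the WAL). */
typedef uint64_t SketchRecPtr;

/* Number of the segment that contains ptr. */
static uint64_t
sketch_seg_no(SketchRecPtr ptr)
{
    return ptr / SKETCH_SEG_SIZE;
}

/* Start of the page that contains ptr. */
static SketchRecPtr
sketch_page_start(SketchRecPtr ptr)
{
    return ptr - (ptr % SKETCH_BLCKSZ);
}

/* Start of the segment that contains ptr, i.e. its first page. */
static SketchRecPtr
sketch_segment_start(SketchRecPtr ptr)
{
    return ptr - (ptr % SKETCH_SEG_SIZE);
}

/*
 * True when the first page of the segment would have to be read and
 * validated before the target page: the target page lies in a segment
 * other than the one last read, and is not itself that first page.
 */
static bool
sketch_must_read_first_page(SketchRecPtr page_ptr, uint64_t last_read_seg_no)
{
    assert(page_ptr % SKETCH_BLCKSZ == 0);  /* caller passes a page start */
    return sketch_seg_no(page_ptr) != last_read_seg_no &&
           (page_ptr % SKETCH_SEG_SIZE) != 0;
}
```

With these assumptions, a location in the fourth page of segment 11 has a
page start that differs from its segment start, which is the distinction
being discussed above.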

..Because node3 found a WAL
!  * record fragment at the end of segment 10, it expects to find the
!  * remaining fragment at the beginning of WAL segment 11 streamed from
!  * node2. But there was a fragment of a different WAL record, because
!  * node2 overwrote a different WAL record at the end of segment 10 across
!  * to 11.

How does node3 ensure that the fragment of WAL in segment 11 is
different?  Isn't it possible that when node2 overwrites the last
record in WAL segment 10, it writes a record with slightly different
contents but of the same size as the original record in WAL
segment 10?
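
The scenario in question can be illustrated with a toy model.  This is
only a sketch of the concern, not the check PostgreSQL actually performs:
all toy_* names are hypothetical, and records are modelled as plain byte
strings split at a segment boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bytes of a record that still remain after the segment boundary. */
static size_t
toy_remaining_len(size_t total_len, size_t bytes_before_boundary)
{
    assert(bytes_before_boundary <= total_len);
    return total_len - bytes_before_boundary;
}

/*
 * Hypothetical check that compares only lengths: the remaining length
 * expected from the fragment in the old segment against the remaining
 * length found at the start of the next segment.
 */
static bool
toy_length_only_check(size_t expected_remaining, size_t found_remaining)
{
    return expected_remaining == found_remaining;
}

/* True when the parts of two records after the boundary are identical. */
static bool
toy_same_tail(const char *rec_a, const char *rec_b,
              size_t total_len, size_t bytes_before_boundary)
{
    size_t tail_len = toy_remaining_len(total_len, bytes_before_boundary);

    return memcmp(rec_a + bytes_before_boundary,
                  rec_b + bytes_before_boundary,
                  tail_len) == 0;
}
```

In this model, two records of the same total size split at the same
offset pass the length-only comparison even when their tails differ, so
some check on the contents would be needed to tell them apart.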

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
