Re: pg_rewind succeed but postgresql showing error when trying to make standby with common ancestor

Поиск
Список
Период
Сортировка
От niraj nandane
Тема Re: pg_rewind succeed but postgresql showing error when trying to make standby with common ancestor
Дата
Msg-id CALpWO+ARv0pzBD-RbMtPUO-XMYDN3MKnkmQZ3eCOXkeT4K4PLA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_rewind succeed but postgresql showing error when trying to make standby with common ancestor  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-bugs
<<
In this case, if you want to make node3 a standby of node2, just
disconnect (shutdown) node3 first, then promote node2, at last start
node3 as a standby of node2. No need for pg_rewind since node3 is
surely in the past of node2. (But a large replication delay might
badly affect.)  pg_rewind doesn't work also in this case since node3
is likely be at before node2's promotion LSN.
>>
Both the case with and without pg_rewind working. Here is output of pg_rewind:

pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/316DD838 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/2E000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 98 MB (total source directory size is 219 MB)
     0/100453 kB (0%) copied
100453/100453 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!

<<
Yeah. But I think pg_rewind is not needed at all in this case.
>>
When i promote node2 and node3 having common ancestor, and when joining node3 to the node2, pg_rewind should return code non-zero.
What pg_rewind does it, gives below and set return code as 0.
pg_rewind: connected to server
pg_rewind: source and target cluster are on the same timeline
pg_rewind: no rewind required

Can this be done? Can pg_rewind return code as non-zero.

Comment from Laurenz Albe at: https://dba.stackexchange.com/questions/320997/pg-rewind-succeed-but-postgresql-showing-error-when-trying-to-make-standby-with?noredirect=1#comment626006_320997

I would say that it is a bug that pg_rewind returns 0. It should not only check the time line number, but also compare the time the time line branched off. You may want to complain to the pgsql-hackers or pgsql-bugs mailing lists about that

Thanks a lot.

On Fri, Dec 16, 2022 at 7:41 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Hello.

At Thu, 15 Dec 2022 15:03:28 +0530, niraj nandane <niraj.nandane@gmail.com> wrote in
> Have 3 node setup: node1(50.2), node2(50.3) and node3(50.4). node1(50.2) is
> primary and node2(50.3) and node3(50.4) are standby. I promoted node3 first
> and then node2 to make them independent. When i try to make the node3 slave
> of node2, using pg_rewind, it says below:

pg_rewind is intended to be used for the case where new primary has
been promoted in the past of new standby (called "diverging") that has
not promoted. Thus this is not the case.

In this case, if you want to make node3 a standby of node2, just
disconnect (shutdown) node3 first, then promote node2, at last start
node3 as a standby of node2. No need for pg_rewind since node3 is
surely in the past of node2. (But a large replication delay might
badly affect.)  pg_rewind doesn't work also in this case since node3
is likely be at before node2's promotion LSN.

> After this, when i start postgresql on the node3 in standby mode, i get
> below:

...(edited)
> Dec 15 07:57:33 LOG:  started streaming WAL from primary at 0/2F000000 on timeline 2
> Dec 15 07:57:33 FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 00000002000000000000002F has already been removed
(repeats)

I'm not sure how long the interval between the promotions of node2 and
node3, but since node2 is likely to be advanced than node3, it's not
surprising if node2 has been removed a segment that node3 is at.

> There is no WAL archiving set. This is PostgreSQL 14.4. Does this means
> PostgreSQL pg_rewind can't be used when forming cluster between two
> independent primary having common ancestor?

Yes, as explained above.

> Note: If i just promote node2 and run pg_rewind on node3 to join with
> node2, then it works correctly.

Yeah. But I think pg_rewind is not needed at all in this case.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


--

Thanks,
Niraj Nandane,
Veritas LLC, Pune

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'