Thread: some questions regarding replication issues and timeline/history files

some questions regarding replication issues and timeline/history files

From
Marcin Giedz
Date:
Hello,

I've set up synchronous replication between a primary and a secondary server, managed by Pacemaker + PAF. A client is running some stress tests (switching nodes, disabling a particular node, etc.), and this is the second time this kind of problem has occurred:

2020-12-18 14:03:46.658 CET [unknown] [28787]ERROR:  requested starting point F/A2000000 on timeline 39 is not in this server's history
2020-12-18 14:03:46.658 CET [unknown] [28787]DETAIL:  This server's history forked from timeline 39 at F/A1023338.

Questions:
1. What does this mean? How can this happen? Does it mean that at some point in time both servers were primary?
2. In the xxx.history files I find the following rows:
43      F/A60000A0      no recovery target specified
44      F/A70000A0      no recovery target specified
45      F/A80000A0      no recovery target specified

Again, what does this mean?

3. A general question: can anyone suggest an in-depth explanation of timelines and history files, to help me understand how this part of PostgreSQL works?
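
For readers trying to interpret rows like the ones in question 2: each line of a timeline history file records a parent timeline ID, the WAL location (LSN) at which a new timeline branched off it, and a free-text reason. The following is a minimal sketch of reading such rows, assuming whitespace-separated columns as shown above. The names `TimelineEntry` and `parse_history` are hypothetical helpers for illustration and are not part of PostgreSQL.

```python
from typing import List, NamedTuple


class TimelineEntry(NamedTuple):
    """One row of a timeline history file (hypothetical helper type)."""
    parent_tli: int   # timeline that was left behind
    switch_lsn: str   # WAL location where the new timeline branched off
    reason: str       # free-text reason recorded at the switch


def parse_history(text: str) -> List[TimelineEntry]:
    """Parse history-file rows; blank lines and '#' comments are skipped."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split into at most three fields so the reason keeps its spaces.
        parts = line.split(None, 2)
        reason = parts[2] if len(parts) > 2 else ""
        entries.append(TimelineEntry(int(parts[0]), parts[1], reason))
    return entries


# Rows copied from the question above.
SAMPLE = """\
43      F/A60000A0      no recovery target specified
44      F/A70000A0      no recovery target specified
45      F/A80000A0      no recovery target specified
"""

for entry in parse_history(SAMPLE):
    print(f"a new timeline branched off timeline {entry.parent_tli} "
          f"at {entry.switch_lsn} ({entry.reason})")
```

Read this way, each row in the sample marks one promotion: a standby ended recovery on that timeline at the given location without a recovery target, and a new timeline started there.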



Many thx
Marcin

Re: some questions regarding replication issues and timeline/history files

From
"email2ssk247@gmail.com"
Date:
I have the same problem. It appeared when I had to recover the database
after a failed switchover.
This error is logged on the new primary server.

< 2021-06-15 16:05:02.480 CEST > ERROR:  requested starting point AF/7D000000 on timeline 1 is not in this server's history
< 2021-06-15 16:05:02.480 CEST > DETAIL:  This server's history forked from timeline 1 at AF/7C0F8D58.



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html



Re: some questions regarding replication issues and timeline/history files

From
Mateusz Henicz
Date:
Do you have "recovery_target_timeline=latest" configured in your recovery.conf or postgresql.conf? Which file applies depends on the version you are using: recovery.conf up to version 11, postgresql.conf from version 12 onward.

Cheers,
Mateusz



Re: some questions regarding replication issues and timeline/history files

From
Sudhakaran Srinivasan
Date:
Yes, it is set to latest.

I am using Postgres 9.6.

Thanks!

Sudhakaran



Re: some questions regarding replication issues and timeline/history files

From
Kyotaro Horiguchi
Date:
At Tue, 15 Jun 2021 07:05:07 -0700 (MST), "email2ssk247@gmail.com" <email2ssk247@gmail.com> wrote in 
> Even I have this problem when I had to recover the database failed
> switchover.
> This is error is new primary server. 
> 
> < 2021-06-15 16:05:02.480 CEST > ERROR:  requested starting point
> AF/7D000000 on timeline 1 is not in this server's history
> < 2021-06-15 16:05:02.480 CEST > DETAIL:  This server's history forked from
> timeline 1 at AF/7C0F8D58.

It looks like your old primary continued running past AF/7D000000
after the old standby was promoted at AF/7C0F8D58.  In short, the new
standby (the old primary) has a history that diverged from the new
primary's.

You can use pg_rewind to adjust the new standby server in that case.
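
To make the comparison above concrete, here is a small sketch that turns the two WAL locations from the log into integers and checks whether the requested starting point lies past the fork point. It relies on an LSN being two hexadecimal numbers separated by a slash (high and low 32 bits). The helpers `parse_lsn` and `diverged` are hypothetical names for illustration only, and the check is a simplified reading of the error message, not the server's actual code.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'HI/LO' (two hex numbers) to one integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def diverged(requested_start: str, fork_point: str) -> bool:
    """True if the standby asks for WAL past the point where history forked."""
    return parse_lsn(requested_start) > parse_lsn(fork_point)


# Values taken from the log lines quoted in this thread.
requested = "AF/7D000000"   # where the old primary wants to resume streaming
fork = "AF/7C0F8D58"        # where the new primary left timeline 1

print(diverged(requested, fork))
print(parse_lsn(requested) - parse_lsn(fork), "bytes of WAL past the fork")
```

The same check applied to the first report in the thread (F/A2000000 requested, fork at F/A1023338 on timeline 39) also comes out true, which fits the same explanation.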


FYI, you can reproduce the error with the following steps.

1. Create a primary (A).
2. Create a standby (B) connecting to A.
3. Promote B.

4. Connect to A and run the following commands.

   =# select pg_switch_wal(); checkpoint;

5. Stop A, then add primary_conninfo pointing to B to the conf file
   of A, then create the standby.signal file in the data directory of
   A.

6. Start A. You will get a similar error.


To recover from the situation, run pg_rewind as follows, for example.

$ pg_rewind --target-pgdata=<datadir of A> --source-server='connstr to B'
pg_rewind: servers diverged at WAL location 0/3000060 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/2000060 on timeline 1


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center