Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
From | Kumar, Devesh |
---|---|
Subject | Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1 |
Date | |
Msg-id | CACMEH=4UG9_VGefOiwizOqrmhrNaSipNwiAQKcvh-5if5BmQGg@mail.gmail.com |
In reply to | Re: DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1 (Laurenz Albe <laurenz.albe@cybertec.at>) |
List | pgsql-bugs |
Hello Laurenz,
Thanks for the response. The details are below.
Primary repmgr.conf Details
![image.png](/media/maillist_attaches/pgsql-bugs/2024/04/29/CACMEH=4UG9_VGefOiwizOqrmhrNaSipNwiAQKcvh-5if5BmQGg@mail.gmail.com/image.png)
Secondary repmgr.conf Details
![image.png](/media/maillist_attaches/pgsql-bugs/2024/04/29/CACMEH=4UG9_VGefOiwizOqrmhrNaSipNwiAQKcvh-5if5BmQGg@mail.gmail.com/image-1.png)
Failover steps:
We stopped the PostgreSQL service on the primary server. repmgrd then failed over automatically and promoted the standby to be the new primary.
The status after failover is shown below:
![image.png](/media/maillist_attaches/pgsql-bugs/2024/04/29/CACMEH=4UG9_VGefOiwizOqrmhrNaSipNwiAQKcvh-5if5BmQGg@mail.gmail.com/image-2.png)
Failback steps:
1. We executed a checkpoint on the new primary (originally the standby).
2. We ran the node rejoin command below with --dry-run:
repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d 'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind --config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v --dry-run   # check whether the original primary is eligible to rejoin
NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
INFO: replication connection to the rejoin target node was successful
INFO: local and rejoin target system identifiers match
DETAIL: system identifier is 7360952088605465701
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 2
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/9000028
INFO: prerequisites for using pg_rewind are met
INFO: file "postgresql.conf" would be copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not found, skipping
INFO: file "pg_hba.conf" would be copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: pg_rewind would now be executed
DETAIL: pg_rewind command is:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2'
INFO: prerequisites for executing NODE REJOIN are met
3. We executed the node rejoin command:
repmgr node rejoin -f /opt/postgresql/15.6/bin/repmgr.conf -d 'host=10.29.97.241 port=5432 user=repmgr dbname=repmgr' --force-rewind --config-files=postgresql.conf,postgresql.local.conf,pg_hba.conf -v
NOTICE: using provided configuration file "/opt/postgresql/15.6/bin/repmgr.conf"
DEBUG: server version number is: 150000
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_primary_node_id():
SELECT node_id FROM repmgr.nodes WHERE type = 'primary' AND active IS TRUE
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE n.node_id = 2
NOTICE: rejoin target is node "d-dba-pg-rnh9" (ID: 2)
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=10.29.97.241 port=5432 fallback_application_name=repmgr options=-csearch_path="
DEBUG: set_config():
SET synchronous_commit TO 'local'
DEBUG: get_recovery_type(): SELECT pg_catalog.pg_is_in_recovery()
DEBUG: get_node_record():
SELECT n.node_id, n.type, n.upstream_node_id, n.node_name, n.conninfo, n.repluser, n.slot_name, n.location, n.priority, n.active, n.config_file, '' AS upstream_node_name, NULL AS attached FROM repmgr.nodes n WHERE n.node_id = 1
DEBUG: local timeline: 1; rejoin target timeline: 2
DEBUG: get_timeline_history():
TIMELINE_HISTORY 2
DEBUG: local tli: 1; local_xlogpos: 0/9000028; follow_target_history->tli: 1; follow_target_history->end: 0/9000000
NOTICE: pg_rewind execution required for this node to attach to rejoin target node 2
DETAIL: rejoin target server's timeline 2 forked off current database system timeline 1 before current recovery point 0/9000028
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'full_page_writes' AND setting = 'off'
DEBUG: guc_set():
SELECT true FROM pg_catalog.pg_settings WHERE name = 'wal_log_hints' AND setting = 'on'
INFO: prerequisites for using pg_rewind are met
DEBUG: using archive directory "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
DEBUG: copying "postgresql.conf" to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/postgresql.conf"
WARNING: specified file "/pgresdata101/data/postgresql.local.conf" not found, skipping
DEBUG: copying "pg_hba.conf" to "/tmp/repmgr-config-archive-d-dba-pg-0ptt/pg_hba.conf"
INFO: 2 files copied to "/tmp/repmgr-config-archive-d-dba-pg-0ptt"
NOTICE: executing pg_rewind
DETAIL: pg_rewind command is "/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2'"
DEBUG: executing:
/opt/postgresql/pg/bin/pg_rewind -D '/pgresdata101/data' --source-server='host=10.29.97.241 port=5432 user=repmgr dbname=repmgr connect_timeout=2' 2>/tmp/repmgr_command.wgVGPS
DEBUG: result of command was 1 (256)
DEBUG: local_command(): output returned was:
pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/802B668
ERROR: pg_rewind execution failed
DETAIL: pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/802B668
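For reference, here is a minimal sketch of how the LSNs in the errors above map to WAL segment file names. It assumes the default 16 MB wal_segment_size, which the output above does not state, and the helper name `wal_segment_name` is hypothetical (it is not part of PostgreSQL or repmgr).

```shell
#!/usr/bin/env bash
# Sketch: map a PostgreSQL LSN to the name of the WAL segment file containing it.
# ASSUMPTION: default 16 MB wal_segment_size; change WAL_SEG_SIZE if initdb used another value.
WAL_SEG_SIZE=$((16 * 1024 * 1024))

# wal_segment_name TIMELINE LSN  (hypothetical helper for illustration only)
wal_segment_name() {
    local tli=$1 lsn=$2
    local hi=${lsn%%/*} lo=${lsn##*/}
    # Combine the two hexadecimal halves of the LSN into one 64-bit byte position.
    local pos=$(( (16#$hi << 32) + 16#$lo ))
    local segno=$(( pos / WAL_SEG_SIZE ))
    # Number of segments per 4 GB "xlogid" (256 with 16 MB segments).
    local segs_per_id=$(( 0x100000000 / WAL_SEG_SIZE ))
    printf '%08X%08X%08X\n' "$tli" $(( segno / segs_per_id )) $(( segno % segs_per_id ))
}

wal_segment_name 1 0/802B668   # record pg_rewind could not find
wal_segment_name 1 0/9000000   # divergence point reported by pg_rewind
```

Under that assumption, 0/802B668 falls in segment 000000010000000000000008, the file pg_rewind reports as missing from pg_wal, while the divergence point 0/9000000 is the first byte of segment 000000010000000000000009.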
___________________________
DEVESH KUMAR
Database Admin I – India
M: +91 6366843695
Address: Tridib Building Block B 5th Floor
Bagmane Tech Park CV Raman Nagar,
Bengaluru, 560093, IN
www.cmegroup.com
On Mon, Apr 29, 2024 at 3:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sat, 2024-04-27 at 00:36 +0530, Kumar, Devesh wrote:
> Currently we are working on setting up replication and testing failover scenarios
> and failback. During our testing, failover is getting successful. During Failback,
> when we are reverting the original primary instance as the new standby, we are
> getting pg_rewind errors. Kindly can someone check and let us know.
>
> pg_rewind: servers diverged at WAL location 0/9000000 on timeline 1
> pg_rewind: error: could not open file "/pgresdata101/data/pg_wal/000000010000000000000008": No such file or directory
> pg_rewind: error: could not find previous WAL record at 0/802B668
You should show the exact commands used for failover and failback.
Yours,
Laurenz Albe