PostgreSQL with Patroni not replicating to all nodes after adding 3rd node (another secondary)

From: Zb B
Msg-id: CAKwARkYUvo-tXp3EWVkUywZCVHNruooxeotLbO25g=qjJN_nww@mail.gmail.com
List: pgsql-general
Hi,
I am new to Patroni and PostgreSQL. We have set up a cluster with etcd (3 nodes), Patroni (2 nodes) and PostgreSQL (2 nodes), with synchronous replication from the primary to the secondary. It seemed to work fine. Then I added a third DB node without Patroni, just to replicate the data from the primary. To do that I:
1) added another slot in patroni.yml:
slots:
  bdc2b:
    type: physical

2) ran pg_basebackup on the new node:
pg_basebackup -v -R -h 10.17.5.211,10.17.5.83 -U replication --slot=bdc2b -D 14/data

As a result, the primary showed two replication slots, and the Patroni cluster looked healthy according to:
patronictl -c /etc/patroni/patroni.yml list

(the Leader and replica were running)

But when I started my remote test application, which executes small insert transactions, I noticed that the records are replicated only to the 3rd node (the secondary without Patroni). They are not replicated to the secondary node managed by Patroni (the Replica).
Some debugging using
journalctl -f
shows that the replica is not healthy and after a while the replication slot becomes inactive. See the log below:

Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,280 INFO: Got response from xyzd3riardb01 http://10.17.5.211:8008/patroni: {"state": "running", "postmaster_start_time": "2022-06-22 05:05:37.382607-04:00", "role": "master", "server_version": 140004, "xlog": {"location": 117558448}, "timeline": 4, "replication": [{"usename": "replication", "application_name": "test1b", "client_addr": "10.17.5.56", "state": "streaming", "sync_state": "async", "sync_priority": 0}, {"usename": "replication", "application_name": "xyzd3riardb02", "client_addr": "10.17.5.83", "state": "streaming", "sync_state": "sync", "sync_priority": 1}], "dcs_last_seen": 1655899566, "database_system_identifier": "7111967488904966919", "patroni": {"version": "2.1.4", "scope": "test1b"}}
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,375 WARNING: Master (xyzd3riardb01) is still alive
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: server signaled
Jun 22 08:06:35 xyzd3riardb02 patroni[12495]: 2022-06-22 08:06:35,400 INFO: following a different leader because i am not the healthiest node
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,279 INFO: Got response from xyzd3riardb01 http://10.17.5.211:8008/patroni: {"state": "running", "postmaster_start_time": "2022-06-22 05:05:37.382607-04:00", "role": "master", "server_version": 140004, "xlog": {"location": 117558448}, "timeline": 4, "replication": [{"usename": "replication", "application_name": "test1b", "client_addr": "10.17.5.56", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1655899596, "database_system_identifier": "7111967488904966919", "patroni": {"version": "2.1.4", "scope": "test1b"}}
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,374 WARNING: Master (xyzd3riardb01) is still alive
Jun 22 08:07:05 xyzd3riardb02 patroni[12495]: 2022-06-22 08:07:05,393 INFO: following a different leader because i am not the healthiest node
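
To make the difference between the two responses above easier to see, here is a minimal Python sketch that parses trimmed copies of the two `replication` arrays from the log and reports which standby disappeared between them. It is not part of Patroni; the helper name `replication_states` is made up for illustration, and the JSON strings keep only the fields needed for the comparison.

```python
import json


def replication_states(payload: str) -> dict:
    """Map application_name -> (state, sync_state) for a /patroni response body."""
    status = json.loads(payload)
    return {
        entry["application_name"]: (entry["state"], entry["sync_state"])
        for entry in status.get("replication", [])
    }


# Trimmed copy of the 08:06:35 response (other fields omitted for brevity).
first = """
{"role": "master", "replication": [
  {"application_name": "test1b", "client_addr": "10.17.5.56",
   "state": "streaming", "sync_state": "async"},
  {"application_name": "xyzd3riardb02", "client_addr": "10.17.5.83",
   "state": "streaming", "sync_state": "sync"}]}
"""

# Trimmed copy of the 08:07:05 response.
second = """
{"role": "master", "replication": [
  {"application_name": "test1b", "client_addr": "10.17.5.56",
   "state": "streaming", "sync_state": "async"}]}
"""

before = replication_states(first)
after = replication_states(second)

# Standbys present in the first response but missing from the second.
dropped = sorted(set(before) - set(after))
print("dropped standbys:", dropped)  # dropped standbys: ['xyzd3riardb02']
```

So within 30 seconds the Patroni-managed replica (xyzd3riardb02, 10.17.5.83) drops out of the primary's list of streaming standbys, while the standalone node at 10.17.5.56 stays connected as an async standby.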

But the Patroni cluster still looks healthy after executing
patronictl -c /etc/patroni/patroni.yml list

even though the records are not being replicated to the replica.
What could be the reason? Where should I look for the problem?

Thanks,

Zbigniew
