pg_stat_replication when standby is unreachable

From: Abhishek Rai
Subject: pg_stat_replication when standby is unreachable
Msg-id: CA+sC4q6SBUv_U9oFsLBN07XhM5XjXbvQd=BsQD7otZBzj6oo0Q@mail.gmail.com
Responses: Re: pg_stat_replication when standby is unreachable (2 replies)
List: pgsql-hackers

Hello Postgres gurus,

I'm writing a thin clustering layer on top of Postgres using the synchronous replication feature. The goal is to enable HA and survive the permanent loss of a single node. Using an external coordinator (Zookeeper), one of the nodes is elected as the primary. The primary node then picks another healthy node as its standby and starts serving. Thereafter, the cluster monitors the primary and the standby, and triggers a re-election if the primary or its standby goes down.

Detecting primary health is easy.  But what is the best way to know if the standby is live?  Since this is not a hot-standby, I cannot send queries to it.  Currently, I'm sending the following query to the primary:

  SELECT * FROM pg_stat_replication;

I've noticed that when I terminate the standby (cleanly or through kill -9), the result of the above query goes from one row to zero rows. The result goes back to one row when the standby restarts and reconnects. I was wondering whether there is any guarantee about the contents of pg_stat_replication when the standby suffers a network partition, or restarts and reconnects to the primary. Are there any parameters that control this behavior?
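For concreteness, here is a minimal sketch of the kind of check described above, written in Python against any DB-API 2.0 connection to the primary. The function names and the rule "a row in the streaming state means the standby is live" are assumptions made for illustration; whether that rule is actually safe is exactly the open question. The column names are those of the pg_stat_replication view.

```python
# Runs on the primary. Column names are taken from the pg_stat_replication view.
QUERY = "SELECT application_name, state, sync_state FROM pg_stat_replication"


def fetch_replication_rows(conn):
    """Run QUERY on a DB-API 2.0 connection and return the rows as dicts."""
    cur = conn.cursor()
    try:
        cur.execute(QUERY)
        names = [col[0] for col in cur.description]
        return [dict(zip(names, row)) for row in cur.fetchall()]
    finally:
        cur.close()


def standby_is_live(rows, standby_name=None):
    """Hypothetical health rule: some walsender row is in the 'streaming' state.

    If standby_name is given, only rows whose application_name matches it
    are considered. Zero matching rows is treated as "standby not live".
    """
    for row in rows:
        if standby_name is not None and row["application_name"] != standby_name:
            continue
        if row["state"] == "streaming":
            return True
    return False
```

The connection could come from any DB-API driver, for example psycopg2.connect(...) pointed at the primary; the clustering layer would call standby_is_live(fetch_replication_rows(conn)) on each monitoring tick.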

I tried looking at WalSndLoop() in src/backend/replication/walsender.c but am still not clear on the expected behavior.

Thanks for your time,
Abhishek
