Thank you for your suggestion. I am afraid this approach is not suitable for me. As a rule, the PostgreSQL log on the subscriber side contains many entries like the following:
ERROR: terminating logical replication worker due to timeout
00000 LOG: worker process: logical replication worker for subscription 24578 (PID 6217) exited with exit code 1

How should I handle this situation? As I understand it, this is quite a normal situation. But why is its severity ERROR?
I have another assumption. Could you correct me if I am wrong? I found in the source code that termination of the logical replication worker depends on the wal_receiver_timeout parameter. So I propose setting wal_receiver_timeout to 0. In that case I think that monitoring the views pg_stat_subscription, pg_publication and pg_stat_replication is enough. If there is a problem with the connection or with replication, pg_stat_replication will show nothing because the WAL sender will not be running; otherwise it will give some information. Am I right? Are there any weaknesses in this approach?
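A minimal sketch of what I have in mind, assuming the PostgreSQL 10 column names of these views and superuser access on the subscriber (the choice of columns is only illustrative):

```sql
-- On the subscriber: a value of 0 disables the receiver timeout.
ALTER SYSTEM SET wal_receiver_timeout = 0;
SELECT pg_reload_conf();

-- On the subscriber: is the apply worker running (pid is not null),
-- and when was the last message received from the publisher?
SELECT subname, pid, received_lsn, last_msg_receipt_time, latest_end_lsn
FROM pg_stat_subscription;

-- On the publisher: which publications exist.
SELECT pubname FROM pg_publication;

-- On the publisher: no row here would mean that no WAL sender
-- is currently serving the subscription.
SELECT application_name, state, sent_lsn, replay_lsn
FROM pg_stat_replication;
```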
Best regards, Andrei Yahorau
From: Andrei Yahorau/IBA
To: pgsql-admin@postgresql.org
Cc: Mikalai Keida/IBA@IBA
Date: 10/08/2018 13:05
Subject: Logical replication monitoring
Hello PostgreSQL Community!
I configured logical replication for PostgreSQL 10.4 on 2 machines, set wal_level to logical, created a publication on the master node and created a subscription on the standby node according to the PostgreSQL documentation. Could you please suggest an approach for monitoring the replication state?
In my experience, monitoring pg_stat_subscription, pg_publication and pg_replication_slots is unfortunately not enough for this purpose. Moreover, the standby database does not prohibit write operations by default, which can lead to inconsistency between the two databases.
For example, a chain of queries such as SELECT pg_is_in_recovery(), SELECT * FROM pg_stat_replication and SELECT * FROM pg_stat_wal_receiver provides insight into the replication state for hot_standby replication.
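For reference, the hot_standby checks I mean look roughly like this (a sketch only; which node each query runs on is noted in the comments):

```sql
-- On either node: true on a standby in recovery, false on the master.
SELECT pg_is_in_recovery();

-- On the master: one row per connected standby.
SELECT * FROM pg_stat_replication;

-- On the standby: state of the WAL receiver process.
SELECT * FROM pg_stat_wal_receiver;
```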
So is there a reliable way of replication state monitoring for logical replication?