Обсуждение: Lag clarification with Sync Replication

Поиск

Список

Период

Сортировка

Lag clarification with Sync Replication

От

Raj kumar

Дата:

22 мая 2020 г., 04:36:08

I have made a synchronous replication setup with PG12.

During data loading of 200 gig in master node, I found that there is a lag that it shows on the slave. But, ideally for sync replication, lag should be always 0 right.

Because commit never happens in master before updating the slave.
log_delay
------------
871.549671
(1 row) log_delay
------------
871.557246
(1 row) log_delay
-----------
0
(1 row) log_delay
-----------
0
(1 row)

Query Used for checking Standby Replication Lag:

psql -d perfbench -p 5432 -c "SELECT CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
THEN 0
ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
END AS log_delay;"

Setting Used in PG12

cat postgresql.conf |grep synchronous_

synchronous_commit = remote_write # synchronization level;

synchronous_standby_names = 'ANY 1 (standby1,standby2)' # standby servers that provide sync rep

cat postgresql.conf |grep primary_conn

primary_conninfo = 'user=edbrepl host= 10.12.11.101 port=5432 application_name=standby1 '

Thanks,

Raj Kumar

Re: Lag clarification with Sync Replication

От

Scott Ribe

Дата:

22 мая 2020 г., 11:57:46

Your query is unrelated to whether or not the commit has completed in the master.

Re: Lag clarification with Sync Replication

От

Keith

Дата:

22 мая 2020 г., 14:29:48

On Fri, May 22, 2020 at 12:36 AM Raj kumar <rajkumar820999@gmail.com> wrote:

I have made a synchronous replication setup with PG12.
During data loading of 200 gig in master node, I found that there is a lag that it shows on the slave. But, ideally for sync replication, lag should be always 0 right.
Because commit never happens in master before updating the slave.
log_delay
------------
871.549671
(1 row) log_delay
------------
871.557246
(1 row) log_delay
-----------
         0
(1 row) log_delay
-----------
         0
(1 row)

Query Used for checking Standby Replication Lag:
psql -d perfbench -p 5432 -c "SELECT CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
THEN 0
ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
END AS log_delay;"

Setting Used in PG12
cat postgresql.conf |grep synchronous_
synchronous_commit = remote_write                # synchronization level;
synchronous_standby_names = 'ANY 1 (standby1,standby2)'        # standby servers that provide sync rep

cat postgresql.conf |grep primary_conn
primary_conninfo = 'user=edbrepl host= 10.12.11.101 port=5432 application_name=standby1 '

Thanks,
Raj Kumar

For synchronous replication, and any replication really, I would recommend monitoring byte lag from the primary. This is much more accurate and, if the count of replicas goes down from an expected number, you'll know it's been disconnected as well. Monitoring replication from the replica can be done, but it has caveats and shouldn't be the only way it's monitored

Wrote a short post on this a while ago. See relavant updates further down - https://www.keithf4.com/monitoring_streaming_slave_lag/

Re: Lag clarification with Sync Replication

От

Rui DeSousa

Дата:

22 мая 2020 г., 16:48:01

On May 22, 2020, at 12:36 AM, Raj kumar <rajkumar820999@gmail.com> wrote:

. But, ideally for sync replication, lag should be always 0 right.

Incorrect. Synchronous replication means that a commit will not return until it has been safely written to disk on the primary and the replica. That means the transaction is written to WAL file on both primary and replica. On the primary, the transaction is also visible to transactions with a later xmin. On the replica, the transaction has been recorded in the WAL; it still needs to get applied to the database for it to become visible to read transactions.

Also, changes could occur faster on the primary where the replica could be lagging behind — either by lagging in receiving the updates or writing the information to the WAL. What will happen is when the commit is iussed; the session will hang until replica catches up and writes the all the data need safely to the WAL.

I would look at pg_stat_replication view on the primary/upstream provider. Here’s a query that make it easier.

Sent_lag — data that still needs to be sent to the replica

Flush_lag —data that still has not been fsync’d to WAL files yet

Replay_lag — data that still need to be applied to the database so that it can be queried

select pg_stat_replication.client_addr

, pg_stat_replication.application_name

, pg_stat_replication.sync_priority

, pg_stat_replication.sync_state

, pg_stat_replication.state

, case pg_is_in_recovery()

when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), sent_lsn))

else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn))

end as sent_lag

, case pg_is_in_recovery()

when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), flush_lsn))

else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn))

end as flush_lag

, case pg_is_in_recovery()

when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), replay_lsn))

else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn))

end as replay_lag

from pg_stat_replication

;

Re: Lag clarification with Sync Replication

От

Raj kumar

Дата:

24 мая 2020 г., 15:19:04

Thanks Rui, Keith and Scott for nice explanation and blog :)

Thanks,

Raj

On Fri, May 22, 2020 at 10:18 PM Rui DeSousa <rui@crazybean.net> wrote:

On May 22, 2020, at 12:36 AM, Raj kumar <rajkumar820999@gmail.com> wrote:

. But, ideally for sync replication, lag should be always 0 right.

Incorrect. Synchronous replication means that a commit will not return until it has been safely written to disk on the primary and the replica. That means the transaction is written to WAL file on both primary and replica. On the primary, the transaction is also visible to transactions with a later xmin. On the replica, the transaction has been recorded in the WAL; it still needs to get applied to the database for it to become visible to read transactions.

Also, changes could occur faster on the primary where the replica could be lagging behind — either by lagging in receiving the updates or writing the information to the WAL. What will happen is when the commit is iussed; the session will hang until replica catches up and writes the all the data need safely to the WAL.

I would look at pg_stat_replication view on the primary/upstream provider. Here’s a query that make it easier.

Sent_lag — data that still needs to be sent to the replica
Flush_lag —data that still has not been fsync’d to WAL files yet
Replay_lag — data that still need to be applied to the database so that it can be queried

select pg_stat_replication.client_addr
, pg_stat_replication.application_name
, pg_stat_replication.sync_priority
, pg_stat_replication.sync_state
, pg_stat_replication.state
, case pg_is_in_recovery()
when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), sent_lsn))
else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn))
end as sent_lag
, case pg_is_in_recovery()
when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), flush_lsn))
else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn))
end as flush_lag
, case pg_is_in_recovery()
when true then pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(), replay_lsn))
else pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn))
end as replay_lag
from pg_stat_replication
;

Re: Lag clarification with Sync Replication

От

Laurenz Albe

Дата:

25 мая 2020 г., 06:42:07

On Fri, 2020-05-22 at 12:48 -0400, Rui DeSousa wrote:
> > On May 22, 2020, at 12:36 AM, Raj kumar <rajkumar820999@gmail.com> wrote:
> > 
> > . But, ideally for sync replication, lag should be always 0 right.
> 
> Incorrect.  Synchronous replication means that a commit will not return until it has been safely written to disk on
theprimary and the replica.  That means the transaction is written to WAL file on
 
> both primary and replica.  On the primary, the transaction is also visible to transactions with a later xmin.  On the
replica,the transaction has been recorded in the WAL; it still needs to get
 
> applied to the database for it to become visible to read transactions.

If you set "synchronous_commit = remote_apply", the commit will only return
when the change has been replayed on the synchronous standby server.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Lag clarification with Sync Replication

Lag clarification with Sync Replication

Re: Lag clarification with Sync Replication

Re: Lag clarification with Sync Replication

Re: Lag clarification with Sync Replication

Re: Lag clarification with Sync Replication

Re: Lag clarification with Sync Replication