[MASSMAIL]Managing replication lag

Поиск
Список
Период
Сортировка
От Jonathan Schaeffer
Тема [MASSMAIL]Managing replication lag
Дата
Msg-id 5c154aaf-7947-4852-a1fe-f50292a755c4@univ-grenoble-alpes.fr
обсуждение исходный текст
Список pgsql-admin
Hello,

I have a postgresql instance in primary/standby mode whith pgbackrest 
asynchronous WAL shipping in between.

Last week I did an upgrade do postgresql-16 in debian 12.

Since then, the standby server has problems catching up with the 
primary. The lag is increasing.

I published the lag graph here: https://imgur.com/a/csyBWFP

I tried to search following leads without success:
- large transactions : I identified 2 last longing transactions, killed 
them, no success
- network issue: both servers are close to each other in the network 
topology, no lag detected there
- hardware configuration delta: both servers store they WAL on SSD. 
Standy server has a bit less RAM (312G vs 440G) and less CPU (20 vs 40), 
but there is no load nor high write rate in my monitoring system.

In the past I already experienced this issue on the same instance 
(postgresql 14) which went away by itself after some weeks.

Can you give me some advices on what else to check ?

Here is the postgresql configuration of the standby:

```
restore_command = 'pgbackrest --stanza=production archive-get %f "%p"'
listen_addresses = '*'
max_connections = 200
tcp_keepalives_idle = 300
shared_buffers = 64GB
temp_buffers = 16MB
work_mem = 32MB
maintenance_work_mem = 4GB
dynamic_shared_memory_type = posix
checkpoint_timeout = 900
max_wal_size = 6GB
min_wal_size = 256MB
checkpoint_completion_target = 0.9
max_wal_senders = 10
wal_keep_size = 128
max_replication_slots = 10
random_page_cost = 1.2
effective_cache_size = 48GB
log_destination = 'syslog'
logging_collector = off
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
syslog_sequence_numbers = on
syslog_split_messages = on
log_min_duration_statement = 1000
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
log_line_prefix = 'db=%d,user=%u,app=%a,client=%h '
log_timezone = 'Europe/Paris'
track_activities = on
track_counts = on
track_functions = pl
autovacuum = on
datestyle = 'iso, mdy'
timezone = 'UTC'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
shared_preload_libraries = 'timescaledb, pg_stat_statements'
archive_mode = on
archive_command = 'pgbackrest --stanza=production archive-push %p'
archive_timeout = 30
hot_standby = on
```

Kind regards,
-- 
Jonathan Schaeffer
Observatoire des Sciences de l'Univers de Grenoble
Responsable technique Résif-DC
🏢 Isterre, bureau 035, 1381 rue de la Piscine 38610 GIERES
✆ +33 4 76 63 51 37

Вложения

В списке pgsql-admin по дате отправления:

Предыдущее
От: jaya kumar
Дата:
Сообщение: [MASSMAIL]Table DML status check per day
Следующее
От: Daulat
Дата:
Сообщение: [MASSMAIL]pgBadger with Pgaudit