Logical Replication ERROR reporting issue

From: Ranjan Gajare
Subject: Logical Replication ERROR reporting issue
Msg-id: CACj5rkYQ_A6-h8TcVDuDRuuVUPzGdcTLS=1TOb-n=BB6ey2hMQ@mail.gmail.com
List: pgsql-general
Hello Folks,

We are having an issue with logical replication in a PostgreSQL 10.11 production environment that we have been unable to get around.

The production environment configuration is as follows:
PostgreSQL Version: 10.11
OS: Ubuntu 16.04.3 LTS (Xenial Xerus)


The following messages occur frequently in the logs of the subscription server:

LOG:  logical replication apply worker for subscription "<sub_name>" has started
ERROR:  terminating logical replication worker due to timeout
LOG:  background worker "logical replication worker" (PID <pid>) exited with exit code 1
LOG:  logical replication apply worker for subscription "<sub_name>" has started
ERROR:  could not start WAL streaming: ERROR:  replication slot "<slot_name>" is active for PID <pid>
LOG:  worker process: logical replication worker for subscription <sub_oid> (PID <pid>) exited with exit code 1


This fills up disk space on the master, because too much WAL is retained while waiting to be applied on the subscriber. Two distinct ERROR messages appear here: a timeout, followed by a failure to start WAL streaming.
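For anyone trying to observe the same symptom, the retained WAL and the process currently holding the slot can be inspected on the master with a query along these lines. This is only a sketch: it assumes the PostgreSQL 10 catalog view pg_replication_slots and the functions pg_current_wal_lsn() and pg_wal_lsn_diff(), and the column alias retained_wal is purely illustrative.

```sql
-- Run on the master (publisher).
-- Shows, per logical slot, whether a walsender currently holds it
-- (active / active_pid) and how much WAL is being kept back for it.
SELECT slot_name,
       active,
       active_pid,
       restart_lsn,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

If active is true and active_pid matches the PID named in the 'is active for PID' error, the slot is still held by a walsender from an earlier connection.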

Looking at the timeout ERROR, we tried simply increasing 'wal_receiver_timeout' to '2min' (the default is 1min). 'wal_sender_timeout' was already set to '2min'. This resolved the timeout ERROR and, surprisingly, the other error, 'replication slot is active for PID', also vanished after that.
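For reference, a change like this could be applied on the subscription server roughly as follows. This is a sketch under assumptions, not a record of how we actually made the change: it assumes superuser access and uses ALTER SYSTEM rather than editing postgresql.conf by hand. wal_receiver_timeout can be changed with a configuration reload, so no restart should be needed.

```sql
-- Run on the subscription server as a superuser.
-- Raise the receiver-side timeout from the 1min default to 2min.
ALTER SYSTEM SET wal_receiver_timeout = '2min';

-- Reload the configuration so the new value takes effect.
SELECT pg_reload_conf();

-- Confirm the active value.
SHOW wal_receiver_timeout;
```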

Does anyone have any idea how increasing wal_receiver_timeout relates to 'ERROR:  could not start WAL streaming: ERROR:  replication slot "<slot_name>" is active for PID <pid>', or is it just a flaw in error reporting?


Thanks for any help!

--
Regards,
Ranjan Gajare
