Re: conflict with recovery when delay is gone

From Radoslav Nedyalkov
Subject Re: conflict with recovery when delay is gone
Date
Msg-id CANhtRiY30QiOWn1AFgiiAaAiwyaAip73fzz=vGZeCHeEV18Srg@mail.gmail.com
In response to Re: conflict with recovery when delay is gone  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: conflict with recovery when delay is gone  (Radoslav Nedyalkov <rnedyalkov@gmail.com>)
List pgsql-general


On Fri, Nov 13, 2020 at 7:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Fri, 2020-11-13 at 15:24 +0200, Radoslav Nedyalkov wrote:
> > On a very busy master-standby setup that runs typical OLAP processing
> > (long-running statements with massive writes), we're getting on the standby:
> >
> >  ERROR:  canceling statement due to conflict with recovery
> >  FATAL:  terminating connection due to conflict with recovery
> >
> > The weird thing is that cancellations usually happen after the standby has experienced
> > a huge delay (2h), still below the allowed maximum (3h). Even recently started statements
> > get cancelled when the delay is already back at zero.
> >
> > Sometimes the situation eases after an hour or so.
> > Restarting the server helps instantly.
> >
> > It is PostgreSQL 11.8 on CentOS 7 with huge pages, shared_buffers 196G out of 748G.
> >
> > What phenomenon could we be facing?
>
> Hard to say.  Perhaps an unusual kind of replication conflict?
>
> What is in "pg_stat_database_conflicts" on the standby server?

db01=# select * from pg_stat_database_conflicts;
 datid |  datname  | confl_tablespace | confl_lock | confl_snapshot | confl_bufferpin | confl_deadlock
-------+-----------+------------------+------------+----------------+-----------------+----------------
 13877 | template0 |                0 |          0 |              0 |               0 |              0
 16400 | template1 |                0 |          0 |              0 |               0 |              0
 16402 | postgres  |                0 |          0 |              0 |               0 |              0
 16401 | db01      |                0 |          0 |             51 |               0 |              0
(4 rows)

On a freshly restarted standby we just saw similar behaviour after a 2-hour delay and a slow catch-up.
confl_snapshot is 51, which exactly matches the number of cancelled statements.
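
For reference, a minimal sketch of how these numbers could be cross-checked on the standby. The thread does not say which parameter the "allowed maximum (3h)" refers to; the sketch assumes it is max_standby_streaming_delay or max_standby_archive_delay. The views, function, and settings used all exist in PostgreSQL 11, but whether they explain this particular case is not established here.

```sql
-- Run on the standby.

-- Snapshot conflicts per database, listing only databases that counted any.
SELECT datname, confl_snapshot
FROM pg_stat_database_conflicts
WHERE confl_snapshot > 0;

-- Approximate replay delay as seen by the standby.
-- NULL if no transaction has been replayed yet; overstates the delay
-- when the master is idle, which should not apply to a busy master.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;

-- Settings that govern how long replay waits for conflicting queries
-- (assumed to be the relevant ones for the 3h limit mentioned above).
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_standby_streaming_delay',
               'max_standby_archive_delay',
               'hot_standby_feedback');
```

Sampling the first two queries while the standby catches up would show whether confl_snapshot keeps growing after replay_delay has dropped back to zero.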

