Logical replication & oldest XID.

From: Konstantin Knizhnik
Subject: Logical replication & oldest XID.
Msg-id: 574DA53E.1010806@postgrespro.ru
List: pgsql-hackers
Hi,

We are using logical replication in multimaster and have run into an
interesting problem with a "frozen" procArray->replication_slot_xmin.
This variable is adjusted by ProcArraySetReplicationSlotXmin, which is
invoked by ReplicationSlotsComputeRequiredXmin, which is in turn called
by LogicalConfirmReceivedLocation. If transactions are executed at all
nodes of multimaster, then everything works fine:
replication_slot_xmin is advanced. But if we send transactions to only
one multimaster node and broadcast these changes to the other nodes, then
no data is sent through the replication slots at those nodes. No data sent
means no confirmations: LogicalConfirmReceivedLocation is not called and
procArray->replication_slot_xmin keeps its original value, 599.
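
To make the stall concrete, here is a toy Python model of the behaviour described above. It is not PostgreSQL code: the `Slot` and `ProcArray` classes, their methods, and the node names are hypothetical stand-ins, XID wraparound is ignored, and the only assumption carried over from the description is that the shared xmin is recomputed solely when a slot confirms a received location.

```python
from dataclasses import dataclass, field

INVALID_XID = 0  # stand-in for "no xmin is held"


@dataclass
class Slot:
    name: str
    xmin: int


@dataclass
class ProcArray:
    replication_slot_xmin: int = INVALID_XID
    slots: list = field(default_factory=list)

    def compute_required_xmin(self):
        # Models ReplicationSlotsComputeRequiredXmin: take the oldest
        # xmin held by any slot and publish it in the shared variable.
        held = [s.xmin for s in self.slots]
        self.replication_slot_xmin = min(held) if held else INVALID_XID

    def confirm_received(self, slot, new_xmin):
        # Models LogicalConfirmReceivedLocation: in this sketch it is the
        # only path that advances a slot and triggers recomputation.
        slot.xmin = max(slot.xmin, new_xmin)
        self.compute_required_xmin()


def simulate(confirming_slots, n_txns=5, start_xid=599):
    """Run n_txns transactions; only slots named in confirming_slots
    carry data and therefore send confirmations."""
    slots = [Slot("node2", start_xid), Slot("node3", start_xid)]
    proc_array = ProcArray(slots=slots)
    proc_array.compute_required_xmin()
    for xid in range(start_xid + 1, start_xid + 1 + n_txns):
        for slot in slots:
            if slot.name in confirming_slots:
                proc_array.confirm_received(slot, xid)
    return proc_array.replication_slot_xmin


if __name__ == "__main__":
    print(simulate({"node2", "node3"}))  # every slot carries traffic
    print(simulate(set()))               # no slot carries traffic
    print(simulate({"node2"}))           # one idle slot is enough to stall
```

In this model a single idle slot pins the minimum at its starting value, which matches the symptom of replication_slot_xmin staying at 599.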

As a result, the GetOldestXmin function always returns 599, so autovacuum is
effectively blocked and our multimaster is not able to clean up its
XID->CSN map, which causes a shared memory overflow. This situation happens
only when write transactions are sent to just one node, or when there are
no write transactions at all.
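
The effect on cleanup can be sketched the same way. The following Python fragment is a simplified illustration, not the actual GetOldestXmin or multimaster code: the function names, the `next_xid` parameter, and the dictionary used for the XID->CSN map are all assumptions made for the example, and XID wraparound is ignored.

```python
INVALID_XID = 0  # stand-in for "no slot holds an xmin"


def get_oldest_xmin(next_xid, running_xmins, replication_slot_xmin):
    """Return the oldest XID that must still be kept.

    Start from the next XID to be assigned, lower it to the oldest xmin
    of any running transaction, then lower it again if replication
    slots hold an even older xmin.
    """
    result = min([next_xid, *running_xmins])
    if replication_slot_xmin != INVALID_XID and replication_slot_xmin < result:
        result = replication_slot_xmin
    return result


def prune_xid_csn_map(xid_to_csn, horizon):
    """Keep only entries at or above the horizon; older ones are dead."""
    return {xid: csn for xid, csn in xid_to_csn.items() if xid >= horizon}


if __name__ == "__main__":
    # Hypothetical map with one entry per XID from 599 to 609.
    xid_to_csn = {xid: xid * 10 for xid in range(599, 610)}

    # Slot xmin stuck at 599: the horizon never moves, nothing is pruned.
    stuck = get_oldest_xmin(610, [], 599)
    print(stuck, len(prune_xid_csn_map(xid_to_csn, stuck)))

    # No slot xmin held and one running transaction with xmin 608:
    # everything older than 608 can go.
    free = get_oldest_xmin(610, [608], INVALID_XID)
    print(free, len(prune_xid_csn_map(xid_to_csn, free)))
```

With the slot xmin pinned, the map keeps growing with every new transaction, which is consistent with the shared memory overflow reported above.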

Before implementing some workaround (for example, a forced call of
ReplicationSlotsComputeRequiredXmin), I want to understand whether this is a
real problem in logical replication or whether we are doing something wrong.
BDR should face the same problem if all updates are performed from
one node...

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



