Re: Excessive number of replication slots for 12->14 logical replication

Поиск
Список
Период
Сортировка
От hubert depesz lubaczewski
Тема Re: Excessive number of replication slots for 12->14 logical replication
Дата
Msg-id 20220718094330.GA24063@depesz.com
обсуждение исходный текст
Ответ на Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-bugs
On Mon, Jul 18, 2022 at 09:07:35AM +0530, Amit Kapila wrote:
> As per my understanding, each subscription will use three slots (one
> for apply worker and one for each of the sync workers) in your case in
> the sync phase. So, in your case, if you have 10 subscriptions then
> ideally it should be no more than 30. Can you please check and share
> the subscriber logs to see if there are any errors in the initial sync
> phase? Also, please confirm the number of subscriptions you create on
> the subscriber?

Ok. In next test I did:

@09:21:16 AM UTC, on source
All the "create publication" for focal14_1 to focal14_10 sets.

@09:21:53 (immediately after), on recipient:
create subscription focal14_1 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_1;
create subscription focal14_2 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_2;
create subscription focal14_3 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_3;
create subscription focal14_4 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_4;
create subscription focal14_5 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_5;
create subscription focal14_6 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_6;
create subscription focal14_7 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_7;
create subscription focal14_8 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_8;
create subscription focal14_9 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_9;
create subscription focal14_10 connection 'host=10.10.10.10 port=5432 user=upgrayedd dbname=dbname' publication
focal14_10;

This script has ened at 09:21:56

First error:
#v+
2022-07-18 09:22:07.046 UTC,,,4145917,,62d5263f.3f42fd,2,,2022-07-18 09:22:07 UTC,28/21641,1219146,ERROR,53400,"could
notfind free replication state slot for replication origin with OID 51",,"Increase max_replication_slots and try
again.",,,,,,,"","logicalreplication worker",,0
 
#v-

Nothing else errored out before, no warning, no fatals.

from the first ERROR I was getting them in the range of 40-70 per minute.

At the same time I was logging data from `select now(), * from pg_replication_slots`, every 2 seconds.

Then, i grouped it into 4 groups:
slots with name focal* with active = true
slots with name focal* with active = false
slots with name *sync* with active = true
slots with name *sync* with active = false

How it looked in time:
09:21:54:  5 total; focal:  1 active,  2 inactive; sync:  1 active,  1 inactive
09:21:56: 11 total; focal:  1 active,  9 inactive; sync:  1 active,  0 inactive
09:21:58: 11 total; focal:  1 active,  8 inactive; sync:  1 active,  1 inactive
09:22:00: 23 total; focal: 10 active,  0 inactive; sync:  0 active, 13 inactive
09:22:03: 28 total; focal: 10 active,  0 inactive; sync:  9 active,  9 inactive
09:22:05: 21 total; focal: 10 active,  0 inactive; sync:  7 active,  4 inactive
09:22:07: 20 total; focal: 10 active,  0 inactive; sync:  7 active,  3 inactive
09:22:09: 27 total; focal: 10 active,  0 inactive; sync:  9 active,  8 inactive
09:22:11: 30 total; focal: 10 active,  0 inactive; sync: 10 active, 10 inactive
09:22:13: 49 total; focal: 10 active,  0 inactive; sync:  5 active, 34 inactive
09:22:15: 50 total; focal: 10 active,  0 inactive; sync:  0 active, 40 inactive
09:22:17: 49 total; focal: 10 active,  0 inactive; sync:  0 active, 39 inactive
09:22:19: 44 total; focal: 10 active,  0 inactive; sync:  1 active, 33 inactive
09:22:21: 43 total; focal: 10 active,  0 inactive; sync:  2 active, 31 inactive
09:22:23: 44 total; focal: 10 active,  0 inactive; sync:  1 active, 33 inactive
09:22:25: 43 total; focal: 10 active,  0 inactive; sync:  1 active, 32 inactive

So, it looks that there are up to 10 focal slots, all active, and then there are sync slots with weirdly high counts
forinactive ones.
 

At most, I had 11 active sync slots.

Looks like some kind of timing issue, which would be inline with what
Kyotaro Horiguchi wrote initially.

The thing is that after some time, the data *gets* replicated, and it
seems to be full on publication side, but the errors are unnerving :)

depesz



В списке pgsql-bugs по дате отправления:

Предыдущее
От: hubert depesz lubaczewski
Дата:
Сообщение: Re: Excessive number of replication slots for 12->14 logical replication
Следующее
От: PG Bug reporting form
Дата:
Сообщение: BUG #17554: when i use rule on table which have serial column, the nextval exec twice.