subscriptioncheck failure

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема subscriptioncheck failure
Дата
Msg-id CAA4eK1+uW1UGDHDz-HWMHMen76mKP7NJebOTZN4uwbyMjaYVww@mail.gmail.com
обсуждение исходный текст
Ответы Re: subscriptioncheck failure  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
I noticed $SUBJECT in curculio.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2021-05-11%2018%3A30%3A23

The failure test report:
t/020_messages.pl                (Wstat: 7424 Tests: 1 Failed: 0)
  Non-zero exit status: 29
  Parse errors: Bad plan.  You planned 5 tests but ran 1.
Files=21, Tests=149, 79 wallclock secs ( 0.03 usr  0.03 sys + 22.36
cusr  9.47 csys = 31.89 CPU)
Result: FAIL
Makefile:21: recipe for target 'check' failed
gmake: *** [check] Error 1

As far as I can understand the failure happens due to error:
2021-05-11 20:41:38.356 CEST [31177:4] 020_messages.pl ERROR:
replication slot "tap_sub" is active for PID 7354
2021-05-11 20:41:38.356 CEST [31177:5] 020_messages.pl STATEMENT:
SELECT get_byte(data, 1), encode(substr(data, 11, 8), 'escape')
FROM pg_logical_slot_peek_binary_changes('tap_sub', NULL, NULL,
'proto_version', '1',
'publication_names', 'tap_pub',
'messages', 'true')
OFFSET 1 LIMIT 1

As per my initial analysis, this happens because in the test after
creating a subscription, we are not waiting for the subscriber to
catch up. I think there is a narrow window where it is possible that
after creating a subscription, the apply worker starts replication
even after we have done Alter Subscription .. DISABLE. This is also
visible in buildfarm logs:

2021-05-11 20:41:38.345 CEST [4046:9] 020_messages.pl STATEMENT:
SELECT get_byte(data, 0)
FROM pg_logical_slot_peek_binary_changes('tap_sub', NULL, NULL,
'proto_version', '1',
'publication_names', 'tap_pub',
'messages', 'true')
...

2021-05-11 20:41:38.354 CEST [7354:6] tap_sub LOG:  received
replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/0
(proto_version '2', publication_names '"tap_pub"')

Notice that we have checked the messages before receiving the start
replication command from the subscriber. Now, if we would have waited
for the subscription to catch up then our check in pg_stat_replication
wouldn't have given a false positive. I have noticed that we do have
such wait in other subscription tests but somehow missed it here.

Thoughts?

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bharath Rupireddy
Дата:
Сообщение: Re: RFC: Logging plan of the running query
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: RFC: Logging plan of the running query