Test failures of 100_bugs.pl

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Test failures of 100_bugs.pl
Дата
Msg-id 20230124032327.fakewkdgrzmkrr2l@awork3.anarazel.de
обсуждение исходный текст
Ответы Re: Test failures of 100_bugs.pl  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
Hi,

cfbot, the buildfarm and locally I have seen 100_bugs.pl fail
occasionally. Just rarely enough that I never got around to looking into it
for real.

Just now there was another failure on master:
https://cirrus-ci.com/task/5279589287591936

[01:00:49.441] ok 1 - index predicates do not cause crash
[01:00:49.441] ok 2 - update to temporary table without replica identity with FOR ALL TABLES publication
[01:00:49.441] ok 3 - update to unlogged table without replica identity with FOR ALL TABLES publication
[01:00:49.441] # test failed
[01:00:49.441] --- stderr ---
[01:00:49.441] # poll_query_until timed out executing this query:
[01:00:49.441] # SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');
[01:00:49.441] # expecting this output:
[01:00:49.441] # t
[01:00:49.441] # last actual query output:
[01:00:49.441] # f
[01:00:49.441] # with stderr:
[01:00:49.441] # Tests were run but no plan was declared and done_testing() was not seen.
[01:00:49.441] # Looks like your test exited with 29 just after 3.
[01:00:49.441] 
[01:00:49.441] (test program exited with status code 29)

the regress log:

https://api.cirrus-ci.com/v1/artifact/task/5279589287591936/testrun/build-32/testrun/subscription/100_bugs/log/regress_log_100_bugs
and twoway's log:

https://api.cirrus-ci.com/v1/artifact/task/5279589287591936/testrun/build-32/testrun/subscription/100_bugs/log/100_bugs_twoways.log



We see t2 added to the publication:
2023-01-24 00:57:30.099 UTC [73654][client backend] [100_bugs.pl][7/5:0] LOG:  statement: ALTER PUBLICATION testpub ADD
TABLEt2
 

And that *then* "t" was synchronized:
2023-01-24 00:57:30.102 UTC [73640][logical replication worker] LOG:  logical replication table synchronization worker
forsubscription "testsub", table "t" has finished
 

and then that the refresh was issued:
2023-01-24 00:57:30.128 UTC [73657][client backend] [100_bugs.pl][5/10:0] LOG:  statement: ALTER SUBSCRIPTION testsub
REFRESHPUBLICATION
 

And we see a walsender starting and the query to get the new tables being executed:
2023-01-24 00:57:30.139 UTC [73660][walsender] [testsub][6/8:0] LOG:  statement: SELECT DISTINCT t.schemaname,
t.tablename
 
    , t.attnames
    FROM pg_catalog.pg_publication_tables t
     WHERE t.pubname IN ('testpub')


And that's it, the rest of the time is just polling.


Perhaps wait_for_subscription_sync() should dump the set of rel states to make
something like this more debuggable?


The fact that the synchronization for t finished just before the refresh makes
me wonder if a wakeup or a cache invalidation got lost?

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Следующее
От: Zheng Li
Дата:
Сообщение: Re: Support logical replication of DDLs