deadlock-hard flakiness

From: Andres Freund
Subject: deadlock-hard flakiness
Msg-id: 20230208011021.winlfnypdbzpr3ic@awork3.anarazel.de
Replies: Re: deadlock-hard flakiness  (Andres Freund <andres@anarazel.de>)
         Re: deadlock-hard flakiness  (Thomas Munro <thomas.munro@gmail.com>)
         Re: deadlock-hard flakiness  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

Hi,

On cfbot / CI, we've recently seen a lot of spurious test failures due to
src/test/isolation/specs/deadlock-hard.spec changing output. So far this has
always been on FreeBSD, when running the tests against a pre-existing instance.

I'm fairly sure I've seen this failure on the buildfarm as well, but I'm too
impatient to wait for the buildfarm database query (it really should be
updated to use lz4 toast compression).

Example failures:

1)
https://cirrus-ci.com/task/5307793230528512?logs=test_running#L211

https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/runningcheck.log

2)
https://cirrus-ci.com/task/6137098198056960?logs=test_running#L212

https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/runningcheck.log

So far the diff has always been:

diff -U3 /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out /tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out
--- /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out    2023-02-07 05:32:34.536429000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out    2023-02-07 05:40:33.833908000 +0000
@@ -25,10 +25,11 @@
 step s6a7: <... completed>
 step s6c: COMMIT;
 step s5a6: <... completed>
-step s5c: COMMIT;
+step s5c: COMMIT; <waiting ...>
 step s4a5: <... completed>
 step s4c: COMMIT;
 step s3a4: <... completed>
+step s5c: <... completed>
 step s3c: COMMIT;
 step s2a3: <... completed>
 step s2c: COMMIT;


Commit 741d7f1047f fixed a similar issue in deadlock-hard, but it looks like
we need something more. Then again, perhaps this isn't an output ordering issue:

How can we end up with s5c getting reported as waiting? I don't see how s5c
could end up blocking on anything?
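
To check whether further failures show the same pattern, a small script along
these lines could list, for each step reported as waiting, which steps were
printed before it completed. This is only a sketch: the function name
waiting_spans and the SAMPLE input are illustrative (SAMPLE is copied from the
"results" side of the hunk above), and it only looks at isolationtester's
"step ..." lines.

```python
import re

# Matches isolationtester output lines such as "step s5c: COMMIT; <waiting ...>".
STEP_RE = re.compile(r"^step (\w+): (.*)$")


def waiting_spans(lines):
    """Map each step reported as waiting to the steps printed before it completed."""
    open_waits = {}  # step name -> steps seen since it was reported waiting
    spans = {}
    for line in lines:
        m = STEP_RE.match(line.strip())
        if m is None:
            continue
        name, rest = m.group(1), m.group(2)
        if rest == "<... completed>" and name in open_waits:
            # The waiting step finished; record what was printed in between.
            spans[name] = open_waits.pop(name)
            continue
        # Any other step line counts as "seen" for all still-waiting steps.
        for seen in open_waits.values():
            seen.append(name)
        if rest.endswith("<waiting ...>"):
            open_waits[name] = []
    return spans


# Hypothetical input: the "results" side of the hunk quoted in this mail.
SAMPLE = """\
step s6a7: <... completed>
step s6c: COMMIT;
step s5a6: <... completed>
step s5c: COMMIT; <waiting ...>
step s4a5: <... completed>
step s4c: COMMIT;
step s3a4: <... completed>
step s5c: <... completed>
step s3c: COMMIT;
""".splitlines()

result = waiting_spans(SAMPLE)
print(result)
```

For the hunk above this reports s5c as waiting while s4a5, s4c and s3a4 were
printed, which is the part that looks odd for a plain COMMIT.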

Greetings,

Andres Freund


