Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

Поиск
Список
Период
Сортировка
От Alexander Lakhin
Тема Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Дата
Msg-id 68e3771a-2842-14bb-b115-bbd9702b7cbd@gmail.com
обсуждение исходный текст
Ответ на Re: Why is src/test/modules/committs/t/002_standby.pl flaky?  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hello Andres,
15.01.2022 01:44, Andres Freund wrote:
> What about instead giving WalReceiverConn an internal WaitEventSet, and using
> that consistently? I've attached a draft for that.
>
> Alexander, could you test with that patch applied?
Unfortunately, this patch doesn't fix the issue. The test
commit_ts/002_standby still fails (on iterations 3, 1, 8 for me).
With the debugging logging added:
...
elog(LOG, "libpqrcv_wait() before WaitEventSetWait(%p)", conn->wes);
    nevents = WaitEventSetWait(conn->wes, timeout, &event, 1,
wait_event_info);
elog(LOG, "libpqrcv_wait() after WaitEventSetWait");
...
elog(LOG, "WaitEventSetWaitBlock before WaitForMultipleObjects(%p)",
set->handles);
    rc = WaitForMultipleObjects(set->nevents + 1, set->handles, FALSE,
                                cur_timeout);
elog(LOG, "WaitEventSetWaitBlock aftet WaitForMultipleObjects");
...
and so on.

I see in 002_standby_standby.log
...
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 145
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 0
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() after
WaitEventSetWait
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: 0
2022-01-16 13:31:36.244 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.244 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(0000000000649FB8)
2022-01-16 13:31:36.244 MSK [2820] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000064A020)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_wait() after
WaitEventSetWait
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive PQgetCopyData
returned: -1
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive before
libpqrcv_PQgetResult(1)
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive
libpqrcv_PQgetResult(1) returned 0000000000692400
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_receive before
libpqrcv_PQgetResult(2)
2022-01-16 13:31:36.247 MSK [1336] LOG:  libpqrcv_wait() before
WaitEventSetWait(000000000063ABA8)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWait before
WaitEventSetWaitBlock(000000000063ABA8)
2022-01-16 13:31:36.247 MSK [1336] LOG:  WaitEventSetWaitBlock before
WaitForMultipleObjects(000000000063AC30)
2022-01-16 13:31:36.368 MSK [984] LOG:  WaitEventSetWaitBlock aftet
WaitForMultipleObjects
2022-01-16 13:31:36.368 MSK [984] LOG:  WaitEventSetWait after
WaitEventSetWaitBlock
...
After that, the process 1336 hangs till shutdown.

Best regards,
Alexander



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: Isn't wait_for_catchup slightly broken?
Следующее
От: Magnus Hagander
Дата:
Сообщение: Re: pg_basebackup WAL streamer shutdown is bogus - leading to slow tests