Re: DROP DATABASE deadlocks with logical replication worker in PG 15.1

From Andres Freund
Subject Re: DROP DATABASE deadlocks with logical replication worker in PG 15.1
Date
Msg-id 20230117200432.xaoenn7ni7srb2l2@awork3.anarazel.de
In reply to Re: DROP DATABASE deadlocks with logical replication worker in PG 15.1  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: DROP DATABASE deadlocks with logical replication worker in PG 15.1
List pgsql-bugs
Hi,

On 2023-01-17 06:23:45 +0530, Amit Kapila wrote:
> As per my initial analysis, I have added this code to hold/resume
> interrupts during slot creation due to the test failure (in buildfarm)
> reported in the email [1]. It is clearly a wrong fix as per the report
> and discussion in this thread.

Yea. You really can never hold interrupts across something that could
block indefinitely. A HOLD_INTERRUPTS() while doing error recovery (as in
DisableSubscriptionAndExit()) is fine, that's basically a finite amount of
work. But doing so while issuing SQL commands to another node, or anything
else that could just block indefinitely, isn't.
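
To illustrate why, here is a small standalone model of the holdoff
mechanism. It is only a sketch, not PostgreSQL source: the lowercase
functions stand in for HOLD_INTERRUPTS(), RESUME_INTERRUPTS() and
CHECK_FOR_INTERRUPTS(), and wait_for_remote() is a hypothetical wait loop
that is bounded by max_polls only so the example terminates. In the real
backend, servicing the interrupt would raise an error instead of simply
leaving the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the backend's interrupt state (assumed model). */
static int  holdoff_count = 0;          /* models InterruptHoldoffCount */
static bool interrupt_pending = false;  /* models InterruptPending */
static int  interrupts_serviced = 0;    /* counts serviced interrupts */

/* Models HOLD_INTERRUPTS(): defer servicing of interrupts. */
static void hold_interrupts(void)
{
    holdoff_count++;
}

/* Models RESUME_INTERRUPTS(): must pair with an earlier hold. */
static void resume_interrupts(void)
{
    assert(holdoff_count > 0);
    holdoff_count--;
}

/*
 * Models CHECK_FOR_INTERRUPTS(): a pending interrupt is serviced only
 * when no holdoff is active.
 */
static void check_for_interrupts(void)
{
    if (interrupt_pending && holdoff_count == 0)
    {
        interrupt_pending = false;
        interrupts_serviced++;
    }
}

/*
 * Hypothetical wait on a remote node. A cancel request arrives at poll
 * number cancel_at. Returns how many polls ran before the loop exited.
 */
static int wait_for_remote(int max_polls, int cancel_at, bool hold)
{
    int polls = 0;
    int serviced_before = interrupts_serviced;

    if (hold)
        hold_interrupts();

    while (polls < max_polls)
    {
        polls++;
        if (polls == cancel_at)
            interrupt_pending = true;   /* e.g. cancel/terminate request */

        check_for_interrupts();
        if (interrupts_serviced > serviced_before)
            break;                      /* interrupt serviced, stop waiting */
    }

    if (hold)
        resume_interrupts();

    return polls;
}
```

Without the hold, the loop exits on the poll where the request arrives.
With the hold, the request stays pending for as long as the wait lasts, and
if the remote side never answers, that is forever.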


> There is an analysis of the test
> failure in the email [2] which explains the race condition that leads
> to test failure. Thinking again about the failure, I feel we can
> instead change the failed test (t/004_sync.pl) to either ensure that
> both walsenders (corresponding to the sync worker and apply worker)
> exit after dropping the subscription and before checking the
> remaining slots on the publisher, or wait for the slots to become zero
> in the test.

How about waiting for the table to start to be synced (and thus the slot to be
created) before issuing the drop subscription? If the slot hadn't yet been
created, the test doesn't prove that we successfully clean up...

Greetings,

Andres Freund


