Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)

From Thomas Munro
Subject Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)
Date
Msg-id CA+hUKGLG--QGeCNd4+ur4eXnWcM3TD0ap1yH4akeBNHwg+CCSA@mail.gmail.com
In reply to Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)  (Andres Freund <andres@anarazel.de>)
Responses Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)  (Andres Freund <andres@anarazel.de>)
Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sat, Aug 12, 2023 at 5:51 AM Andres Freund <andres@anarazel.de> wrote:
> On 2023-08-11 15:31:43 +0200, Tomas Vondra wrote:
> > It seems to me the issue is in WalSndWait, which was reworked to use
> > ConditionVariableCancelSleep() in bc971f4025c. The walsenders end up
> > waking each other in a busy loop, until the timing changes just enough
> > to break the cycle.
>
> IMO ConditionVariableCancelSleep()'s behaviour of waking up additional
> processes can nearly be considered a bug, at least when combined with
> ConditionVariableBroadcast(). In that case the wakeup is never needed, and it
> can cause situations like this, where condition variables basically
> deteriorate to a busy loop.
>
> I hit this with AIO as well. I've been "solving" it by adding a
> ConditionVariableCancelSleepEx(), which has an only_broadcasts argument.
>
> I'm inclined to think that any code that needs the forwarding
> behaviour is pretty much buggy.

Oh, I see what's happening.  Maybe commit b91dd9de wasn't the best
idea, but bc971f4025c broke an assumption, since it doesn't use
ConditionVariableSleep().  That confuses the signal forwarding
logic, which expects to find our entry in the wait list in the common
case.

What do you think about this patch?
