Re: BF animal malleefowl reported an failure in 001_password.pl

From Thomas Munro
Subject Re: BF animal malleefowl reported an failure in 001_password.pl
Msg-id CA+hUKGKykFAoj3Ydyi84aXyQc-mFgPKPadQ2ppsGMqhzcAxDNA@mail.gmail.com
In reply to Re: BF animal malleefowl reported an failure in 001_password.pl  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BF animal malleefowl reported an failure in 001_password.pl  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Sun, Jan 15, 2023 at 12:35 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Here's a sketch of the first idea.

To hit the problem case, the signal needs to arrive in between the
latch->is_set check and the epoll_wait() call, and the handler needs
to take a while to get started.  (If it arrives before the
latch->is_set check we report WL_LATCH_SET immediately, and if it
arrives after the epoll_wait() call begins, we get EINTR and go back
around to the latch->is_set check.)  With some carefully placed sleeps
to simulate a CPU-starved system (see attached) I managed to get a
kill-then-connect sequence to produce:

2023-01-17 10:48:32.508 NZDT [555849] LOG:  nevents = 2
2023-01-17 10:48:32.508 NZDT [555849] LOG:  events[0] = WL_SOCKET_ACCEPT
2023-01-17 10:48:32.508 NZDT [555849] LOG:  events[1] = WL_LATCH_SET
2023-01-17 10:48:32.508 NZDT [555849] LOG:  received SIGHUP, reloading configuration files

With the patch I posted, we process that in the order we want:

2023-01-17 11:06:31.340 NZDT [562262] LOG:  nevents = 2
2023-01-17 11:06:31.340 NZDT [562262] LOG:  events[1] = WL_LATCH_SET
2023-01-17 11:06:31.340 NZDT [562262] LOG:  received SIGHUP, reloading configuration files
2023-01-17 11:06:31.344 NZDT [562262] LOG:  events[0] = WL_SOCKET_ACCEPT
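
For readers without the patch to hand, the ordering shown in that second
log can be illustrated with a small standalone sketch. This is not the
patch itself: DemoEvent, the DEMO_WL_* flags and
demo_latch_first_order() are hypothetical stand-ins invented for the
example, not PostgreSQL's WaitEvent API.  It only computes the order in
which a caller might walk the returned events so that latch events are
handled before socket events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WL_* flags; the values are illustrative. */
enum
{
    DEMO_WL_LATCH_SET = 1 << 0,
    DEMO_WL_SOCKET_ACCEPT = 1 << 1
};

typedef struct DemoEvent
{
    int         events;     /* bitmask of DEMO_WL_* flags */
    int         fd;         /* socket, or -1 for the latch */
} DemoEvent;

/*
 * Fill order[] with the indexes of evs[] in the order they should be
 * handled: every latch event first, then everything else in the order
 * it was reported.  Returns the number of entries written.
 */
int
demo_latch_first_order(const DemoEvent *evs, int nevents, int *order)
{
    int         n = 0;

    assert(nevents >= 0);
    assert(nevents == 0 || (evs != NULL && order != NULL));

    /* First pass: latch events, i.e. pending signal work. */
    for (int i = 0; i < nevents; i++)
        if (evs[i].events & DEMO_WL_LATCH_SET)
            order[n++] = i;

    /* Second pass: the remaining events, e.g. new connections. */
    for (int i = 0; i < nevents; i++)
        if (!(evs[i].events & DEMO_WL_LATCH_SET))
            order[n++] = i;

    return n;
}
```

Given the two events from the first log (socket at index 0, latch at
index 1), this yields the order 1, 0, matching the second log.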

Other thoughts:

Another idea would be to teach the latch infrastructure itself to
magically swap latch events to position 0.  Latches are usually
prioritised; it's only in this rare race case that they are not.
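
A rough sketch of what such a swap could look like, purely for
illustration.  SketchEvent, the SKETCH_WL_* flags and
sketch_promote_latch() are made-up names, and nothing here is taken from
latch.c; the real infrastructure would presumably do this while filling
in the caller's event array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WL_* flags; the values are illustrative. */
enum
{
    SKETCH_WL_LATCH_SET = 1 << 0,
    SKETCH_WL_SOCKET_ACCEPT = 1 << 1
};

typedef struct SketchEvent
{
    int         events;     /* bitmask of SKETCH_WL_* flags */
    int         fd;         /* socket, or -1 for the latch */
} SketchEvent;

/*
 * If evs[] contains a latch event, swap it into position 0.  Returns the
 * index where the latch event was found, or -1 if there was none.
 */
int
sketch_promote_latch(SketchEvent *evs, int nevents)
{
    assert(nevents >= 0);
    assert(nevents == 0 || evs != NULL);

    for (int i = 0; i < nevents; i++)
    {
        if (evs[i].events & SKETCH_WL_LATCH_SET)
        {
            if (i != 0)
            {
                SketchEvent tmp = evs[0];

                evs[0] = evs[i];
                evs[i] = tmp;
            }
            return i;
        }
    }
    return -1;
}
```

Note that a plain swap moves whatever was at position 0 into the latch
event's old slot, so the relative order of the non-latch events is not
preserved.  Whether that matters to any caller is an open question.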

Or going the other way, I realise that we're lacking a "wait for
reload" mechanism as discussed in other threads (usually people want
this if they care about its effects on backends other than the
postmaster, where all bets are off and Andres once suggested the
ProcSignalBarrier hammer), and if we ever got something like that it
might be another solution to this particular problem.

