Thread: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay


[PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Craig Ringer
Date:
Hi folks

I recently tracked down a race in shutdown of logical walsenders that can cause PostgreSQL shutdown to hang for wal_sender_timeout/2 before it continues to a normal shutdown. With a long timeout that can be quite disruptive.

TL;DR: The logical walsender may be signalled to stop, then read the last WAL record before the shutdown checkpoint is due to be written and go to sleep. The checkpointer will wait for it to acknowledge the shutdown and the walsender will wait for new WAL. The deadlock is eventually broken by the walsender timeout keepalive timer.

Patch attached.

The issue arises from the difference between logical and physical walsender shutdown as introduced by commit c6c3334364 "Prevent possibility of panics during shutdown checkpoint". It's probably fairly hard to trigger. I ran into a case where it happened regularly only because of an unrelated patch that caused some WAL to be written just before the checkpointer issued walsender shutdown requests. But it's still a legit bug.

If you hit the issue you'll see that the walsender(s) are sleeping in WaitLatchOrSocket in WalSndLoop. They'll keep sleeping until woken by the keepalive timeout. The checkpointer will be waiting in WalSndWaitStopping() for the walsenders to enter WALSNDSTATE_STOPPING or exit, whichever happens first. The postmaster will be waiting in ServerLoop for the checkpointer to finish the shutdown checkpoint.

The checkpointer waits in WalSndWaitStopping() for all walsenders to either exit or enter WALSNDSTATE_STOPPING state. Logical walsenders never enter WALSNDSTATE_STOPPING; they go straight to exiting, so the checkpointer can't finish WalSndWaitStopping() and write the shutdown checkpoint. A logical walsender usually notices the shutdown request and exits as soon as it has flushed all WAL up to the server's flush point, while physical walsenders enter WALSNDSTATE_STOPPING.
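
For reference, the checkpointer's side of this has roughly the following shape (a simplified sketch of WalSndWaitStopping(), paraphrased rather than quoted; the bookkeeping in the real walsender.c differs a little):

        /*
         * Keep scanning the walsender array until every slot has either
         * exited (pid == 0) or reported WALSNDSTATE_STOPPING, re-checking
         * every 10ms.
         */
        for (;;)
        {
            int         i;
            bool        all_stopped = true;

            for (i = 0; i < max_wal_senders; i++)
            {
                WalSnd     *walsnd = &WalSndCtl->walsnds[i];

                SpinLockAcquire(&walsnd->mutex);
                if (walsnd->pid != 0 && walsnd->state != WALSNDSTATE_STOPPING)
                    all_stopped = false;
                SpinLockRelease(&walsnd->mutex);
            }

            if (all_stopped)
                break;

            pg_usleep(10000L);      /* sleep 10ms between scans */
        }

Since a logical walsender never reports WALSNDSTATE_STOPPING, the only way this loop can make progress on its slot is for that walsender to exit.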
    
But there's a race where a logical walsender may read the final available record and notice it has caught up, but never notice that it has reached end-of-WAL and so never check whether it should exit. This happens on the following (simplified) code path in XLogSendLogical:
   
        if (record != NULL)
        {
            XLogRecPtr      flushPtr = GetFlushRecPtr();
            LogicalDecodingProcessRecord(...);
            sentPtr = ...;
            if (sentPtr >= flushPtr)
                WalSndCaughtUp = true;    // <-- HERE
        }
    
because the test for got_STOPPING that sets got_SIGUSR2 exists only on the other branch, where reading a record returns `NULL`. On the branch shown above the walsender can go back to sleep without ever checking whether a shutdown was requested.
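
For contrast, the `record == NULL` branch looks roughly like this (paraphrased and simplified the same way as the snippet above, not quoted verbatim from walsender.c):

        else
        {
            /* No record was available: caught up once we reach the flush point. */
            if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr())
            {
                WalSndCaughtUp = true;

                /*
                 * Only here does a pending stop request (got_STOPPING) get
                 * turned into an orderly exit; the record != NULL branch
                 * above never does this.
                 */
                if (got_STOPPING)
                    got_SIGUSR2 = true;
            }
        }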
   
So if the walsender reads the last available WAL record, finds its position is >= the flush pointer, and has already consumed the SIGUSR1 latch wakeup for that WAL write, it can go back to sleep and not wake up until the keepalive timeout.
   
The checkpointer already sent PROCSIG_WALSND_INIT_STOPPING to the walsenders in the prior WalSndInitStopping() call, so the walsender won't be woken by a signal from the checkpointer. No new WAL will be written, because the walsender just consumed the final record written before the checkpointer went to sleep and the checkpointer won't write anything more until the walsender exits. The client might not be due a keepalive for some time. The only reason this doesn't turn into a total deadlock is that keepalive wakeup.
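
The fix in the attached patch boils down to making the `record != NULL` branch honour a pending stop request too, the same way the `NULL` branch already does. A sketch of the idea (simplified; not the literal patch text):

        if (record != NULL)
        {
            XLogRecPtr      flushPtr = GetFlushRecPtr();

            LogicalDecodingProcessRecord(logical_decoding_ctx,
                                         logical_decoding_ctx->reader);
            sentPtr = logical_decoding_ctx->reader->EndRecPtr;

            if (sentPtr >= flushPtr)
            {
                WalSndCaughtUp = true;

                /*
                 * We just consumed the last record before the flush point
                 * while a stop was pending, so arrange an orderly exit
                 * instead of sleeping until the keepalive timeout.
                 */
                if (got_STOPPING)
                    got_SIGUSR2 = true;
            }
        }

With that in place the walsender notices the pending shutdown as soon as it catches up, regardless of which branch it took.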

An alternative fix would be to have the logical walsender set WALSNDSTATE_STOPPING instead of faking got_SIGUSR2, then go to sleep waiting for more WAL. Logical decoding would need to check if it was running during shutdown and Assert(...) then ERROR if it saw any WAL records that result in output plugin calls or snapshot management calls. I avoided this approach as it's more intrusive and I'm not confident I can concoct a reliable test to trigger it.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
Attachments

Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Alvaro Herrera
Date:
On 2019-Jul-25, Craig Ringer wrote:

> Patch attached.

Here's a non-broken version of this patch.  I have not done anything
other than reflowing the new comment.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Alvaro Herrera
Date:
On 2019-Sep-03, Alvaro Herrera wrote:

> On 2019-Jul-25, Craig Ringer wrote:
> 
> > Patch attached.
> 
> Here's a non-broken version of this patch.  I have not done anything
> other than reflowing the new comment.

Reading over this code, I noticed that the detection of the catch-up
state ends up being duplicate code, so I would rework that function as
in the attached patch.
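
In rough terms the reworked shape is: read the record, process it if there is one, then decide the caught-up state and handle a pending stop request once, on the common path. A simplified sketch of that shape (the attached patch is what actually counts):

        record = XLogReadRecord(logical_decoding_ctx->reader,
                                logical_startptr, &errm);
        logical_startptr = InvalidXLogRecPtr;

        if (errm != NULL)
            elog(ERROR, "%s", errm);

        if (record != NULL)
        {
            LogicalDecodingProcessRecord(logical_decoding_ctx,
                                         logical_decoding_ctx->reader);
            sentPtr = logical_decoding_ctx->reader->EndRecPtr;
        }

        /*
         * Decide the caught-up state in one place, whichever branch was
         * taken above, and honour a pending stop request here as well.
         */
        if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr())
        {
            WalSndCaughtUp = true;

            if (got_STOPPING)
                got_SIGUSR2 = true;
        }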

The naming of those flags (got_SIGUSR2, got_STOPPING) is terrible, but
I'm not going to change that in a backpatchable bug fix.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Jeff Janes
Date:
On Wed, Sep 11, 2019 at 3:52 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Reading over this code, I noticed that the detection of the catch-up
> state ends up being duplicate code, so I would rework that function as
> in the attached patch.
>
> The naming of those flags (got_SIGUSR2, got_STOPPING) is terrible, but
> I'm not going to change that in a backpatchable bug fix.

Hi Alvaro, does this count as a review?  And Craig, do you agree with Alvaro's patch as a replacement for your own?
 
Thanks,

Jeff

Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Alvaro Herrera
Date:
On 2019-Sep-26, Jeff Janes wrote:

> On Wed, Sep 11, 2019 at 3:52 PM Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
> 
> > Reading over this code, I noticed that the detection of the catch-up
> > state ends up being duplicate code, so I would rework that function as
> > in the attached patch.
> >
> > The naming of those flags (got_SIGUSR2, got_STOPPING) is terrible, but
> > I'm not going to change that in a backpatchable bug fix.
> 
> Hi Alvaro, does this count as a review?

Well, I'm already a second pair of eyes for Craig's code, so I think it
does :-)  I would have liked confirmation from Craig that my change
looks okay to him too, but maybe we'll have to go without that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Alvaro Herrera
Date:
On 2019-Sep-26, Alvaro Herrera wrote:

> On 2019-Sep-26, Jeff Janes wrote:

> > Hi Alvaro, does this count as a review?
> 
> Well, I'm already a second pair of eyes for Craig's code, so I think it
> does :-)  I would have liked confirmation from Craig that my change
> looks okay to him too, but maybe we'll have to go without that.

There not being a third pair of eyes, I have pushed this.

Thanks!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] Race condition in logical walsender causes long postgresql shutdown delay

From:
Craig Ringer
Date:



On Thu, 17 Oct 2019 at 21:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2019-Sep-26, Alvaro Herrera wrote:
>
> > On 2019-Sep-26, Jeff Janes wrote:
>
> > > Hi Alvaro, does this count as a review?
> >
> > Well, I'm already a second pair of eyes for Craig's code, so I think it
> > does :-)  I would have liked confirmation from Craig that my change
> > looks okay to him too, but maybe we'll have to go without that.
>
> There not being a third pair of eyes, I have pushed this.
>
> Thanks!


Thanks.

I'm struggling to keep up with my own threads right now... 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise