Обсуждение: SR standby hangs

Поиск
Список
Период
Сортировка

SR standby hangs

От
Andrew Dunstan
Дата:
PostgreSQL Experts Inc has a client with a 9.0.2 streaming replication 
server that somehow becomes wedged after running for some time.

The server is running as a warm standby, and the client's application 
tries to connect to both the master and the slave, accepting whichever 
lets it connect (hence hot standby is not turned on).

Archive files are being shipped as well as WAL streaming.

The symptom is that the recovery process blocks forever on a semaphore. 
We've crashed it and got the following backtrace:
   #0  0x0000003493ed5337 in semop () from /lib64/libc.so.6   #1  0x00000000005bd103 in PGSemaphoreLock
(sema=0x2b14986aec38,interruptOK=1   '\001') at pg_sema.c:420   #2  0x00000000005de645 in LockBufferForCleanup () at
bufmgr.c:2432  #3  0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>,   record=0x1787e1c0) at
heapam.c:4168  #4  heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858   #5  0x0000000000488780
inStartupXLOG () at xlog.c:6250   #6  0x000000000048a888 in StartupProcessMain () at xlog.c:9254   #7
0x00000000004a11efin AuxiliaryProcessMain (argc=2, argv=<value optimized   out>) at bootstrap.c:412   #8
0x00000000005c66c9in StartChildProcess (type=StartupProcess) at   postmaster.c:4427   #9  0x00000000005c8ab7 in
PostmasterMain(argc=1, argv=0x17858bb0) at   postmaster.c:1088   #10 0x00000000005725fe in main (argc=1, argv=<value
optimizedout>) at main.c:188
 


The platform is CentOS 5.5 x86-64, kernel version 2.6.18-194.11.4.el5

I'm not quite sure where to start digging. Has anyone else seen 
something similar? Our consultant reports having seen a similar problem 
elsewhere, at a client who was running hot standby on 9.0.1, but the 
problem did not recur, as it does fairly regularly with this client.

cheers

andrew



Re: SR standby hangs

От
Greg Stark
Дата:
On Fri, Feb 18, 2011 at 6:59 PM, Andrew Dunstan <amdunstan@nc.rr.com> wrote:
> The server is running as a warm standby, and the client's application tries
> to connect to both the master and the slave, accepting whichever lets it
> connect (hence hot standby is not turned on).
>...
>   #2  0x00000000005de645 in LockBufferForCleanup () at bufmgr.c:2432
>   #3  0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>,

Hm, if hot standby isn't turned on why are we bothering with locks at
all? But that said is there *anything* else happening in the database
aside from recovery? Are there any errors in the database log?

But this still shouldn't block. It's either blocked locking the buffer
or blocked waiting for the buffer to become unpinned. It would be nice
to get a backtrace from a debugging build which wouldn't have some of
the functions inlined. It would be really nice to see the pin count on
the buffer in question -- perhaps it has gotten out of sync or
underflowed?

Hmm.... I wonder, does Tom's boot strapping hack to allow the
postmaster to use the real user and database tables cause pages in
those tables to become pinned? What happens if it plays a cleanup
recovery record on one of those tables while someone is trying to log
in?


--
greg


Re: SR standby hangs

От
Robert Haas
Дата:
On Sun, Feb 20, 2011 at 12:39 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Fri, Feb 18, 2011 at 6:59 PM, Andrew Dunstan <amdunstan@nc.rr.com> wrote:
>> The server is running as a warm standby, and the client's application tries
>> to connect to both the master and the slave, accepting whichever lets it
>> connect (hence hot standby is not turned on).
>>...
>>   #2  0x00000000005de645 in LockBufferForCleanup () at bufmgr.c:2432
>>   #3  0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>,
>
> Hm, if hot standby isn't turned on why are we bothering with locks at
> all? But that said is there *anything* else happening in the database
> aside from recovery? Are there any errors in the database log?
>
> But this still shouldn't block. It's either blocked locking the buffer
> or blocked waiting for the buffer to become unpinned. It would be nice
> to get a backtrace from a debugging build which wouldn't have some of
> the functions inlined. It would be really nice to see the pin count on
> the buffer in question -- perhaps it has gotten out of sync or
> underflowed?

I realize now that it would also be useful to see the state of the
LWLock on the buffer.

A little OT, but ISTM that the buffer pin mechanism by its nature is
prone to lock upgrade hazards.  A cleanup lock is essentially an
access exclusive lock on the buffer, while a buffer pin is an access
share lock.  In the middle, we have the usual share and exclusive
(content) locks.  We regularly try to upgrade buffer pins to any of
the higher locking levels, which is quite unlike what we do with
regular (heavyweight) locks, where we take some pains to avoid lock
upgrades.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SR standby hangs

От
Greg Stark
Дата:
On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> A little OT, but ISTM that the buffer pin mechanism by its nature is
> prone to lock upgrade hazards.  A cleanup lock is essentially an
> access exclusive lock on the buffer, while a buffer pin is an access
> share lock.  In the middle, we have the usual share and exclusive
> (content) locks.  We regularly try to upgrade buffer pins to any of
> the higher locking levels, which is quite unlike what we do with
> regular (heavyweight) locks, where we take some pains to avoid lock
> upgrades.

Except that pins don't block exclusive locks so there's no deadlock risk.

The oddity here is on Vacuums super-exclusive "lock" which is the real
equivalent of an "exclusive lock". However there's the added bonus
that there can only be one vacuum on a table at a time. That makes it
safe

It might be interesting to have autovacuum skip a block it finds
pinned for too long. Since it's willing to abort entirely if someone
gets a conflicting lock skipping a block doesn't seem like a big deal.
It won't let us advance the frozenxid but it'll speed up a subsequent
vacuum.

At some point I wonder if we'll find ourselves wanting to keep track
of frozenxid per block and keep a kind of priority queue of blocks to
vacuum based on their age. I continue to wonder whether we're
reinventing standard garbage collection processes.

Incidentally, even if we allowed multiple vacuum processes per table I
think it could be coded to be safe as long as each vacuum only needs
to acquire the super exclusive lock on a single block at a time and
doesn't try to acquire other locks while holding it.



--
greg


Re: SR standby hangs

От
Tom Lane
Дата:
Greg Stark <gsstark@mit.edu> writes:
> On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> A little OT, but ISTM that the buffer pin mechanism by its nature is
>> prone to lock upgrade hazards.

> Except that pins don't block exclusive locks so there's no deadlock risk.

> The oddity here is on Vacuums super-exclusive "lock" which is the real
> equivalent of an "exclusive lock". However there's the added bonus
> that there can only be one vacuum on a table at a time. That makes it
> safe

We have seen deadlocks arising from this type of scenario:
autovac has vacuum lock on table Xautovac blocks waiting for cleanup lock on buffer B in Xprocess P has pin on B due to
asuspended query (eg cursor)P tries to get exclusive lock on X, is blocked by autovac's lock
 

The heavyweight-lock manager fails to recognize deadlock because it
doesn't know about the buffer-level LWLock.

> It might be interesting to have autovacuum skip a block it finds
> pinned for too long.

+1, although as somebody pointed out nearby, this will only be legal if
it's not a vacuum-to-prevent-wraparound situation.

> Incidentally, even if we allowed multiple vacuum processes per table I
> think it could be coded to be safe as long as each vacuum only needs
> to acquire the super exclusive lock on a single block at a time and
> doesn't try to acquire other locks while holding it.

IIRC, it's cleaning the indexes that is problematic.
        regards, tom lane


Re: SR standby hangs

От
Robert Haas
Дата:
On Tue, Feb 22, 2011 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> A little OT, but ISTM that the buffer pin mechanism by its nature is
>>> prone to lock upgrade hazards.
>
>> Except that pins don't block exclusive locks so there's no deadlock risk.
>
>> The oddity here is on Vacuums super-exclusive "lock" which is the real
>> equivalent of an "exclusive lock". However there's the added bonus
>> that there can only be one vacuum on a table at a time. That makes it
>> safe
>
> We have seen deadlocks arising from this type of scenario:
>
>        autovac has vacuum lock on table X
>        autovac blocks waiting for cleanup lock on buffer B in X
>        process P has pin on B due to a suspended query (eg cursor)
>        P tries to get exclusive lock on X, is blocked by autovac's lock
>
> The heavyweight-lock manager fails to recognize deadlock because it
> doesn't know about the buffer-level LWLock.
>
>> It might be interesting to have autovacuum skip a block it finds
>> pinned for too long.
>
> +1, although as somebody pointed out nearby, this will only be legal if
> it's not a vacuum-to-prevent-wraparound situation.

Another approach to this problem would be to jigger things so that the
query doesn't hold a buffer pin while suspended.  I'm not quite sure
how to make that work, but maybe it's possible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SR standby hangs

От
Andrew Dunstan
Дата:

On 02/22/2011 07:55 AM, Robert Haas wrote:
> On Sun, Feb 20, 2011 at 12:39 PM, Greg Stark<gsstark@mit.edu>  wrote:
>> On Fri, Feb 18, 2011 at 6:59 PM, Andrew Dunstan<amdunstan@nc.rr.com>  wrote:
>>> The server is running as a warm standby, and the client's application tries
>>> to connect to both the master and the slave, accepting whichever lets it
>>> connect (hence hot standby is not turned on).
>>> ...
>>>    #2  0x00000000005de645 in LockBufferForCleanup () at bufmgr.c:2432
>>>    #3  0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>,
>> Hm, if hot standby isn't turned on why are we bothering with locks at
>> all? But that said is there *anything* else happening in the database
>> aside from recovery? Are there any errors in the database log?
>>
>> But this still shouldn't block. It's either blocked locking the buffer
>> or blocked waiting for the buffer to become unpinned. It would be nice
>> to get a backtrace from a debugging build which wouldn't have some of
>> the functions inlined. It would be really nice to see the pin count on
>> the buffer in question -- perhaps it has gotten out of sync or
>> underflowed?
> I realize now that it would also be useful to see the state of the
> LWLock on the buffer.


This has happened again. This time we have some debug info available, 
and can possible get more, if people tell me what will be helpful:
   (gdb) bt   #0  0x0000003493ed5337 in semop () from /lib64/libc.so.6   #1  0x00000000005bd1e3 in PGSemaphoreLock
(sema=0x2b1a6014ac38,  interruptOK=1 '\001') at pg_sema.c:420   #2  0x00000000005de735 in LockBufferForCleanup
(buffer=310163)at   bufmgr.c:2432   #3  0x000000000048b896 in RestoreBkpBlocks (lsn=<value optimized   out>,
record=0x2b1a6191b010,cleanup=1 '\001') at xlog.c:3540   #4  0x0000000000463788 in heap_xlog_clean (lsn=<value
optimized  out>, record=0x2b1a6191b010) at heapam.c:4160   #5  heap2_redo (lsn=<value optimized out>,
record=0x2b1a6191b010)at   heapam.c:4858   (gdb) f 2   #2  0x00000000005de735 in LockBufferForCleanup (buffer=310163)
at  bufmgr.c:2432   2432                ProcWaitForSignal();   (gdb) p *bufHdr   $2 = {tag = {rnode = {spcNode = 16393,
dbNode= 40475, relNode =   41880}, forkNum = MAIN_FORKNUM, blockNum = 18913}, flags = 6,   usage_count = 1, refcount =
1,wait_backend_pid = 9111,      buf_hdr_lock = 0 '\000', buf_id = 310162, freeNext = -2,   io_in_progress_lock =
620448,content_lock = 620449}
 

Strangely, I don't even see ProcWaitForSignal() in the frame list - I 
shouldn't have thought it was a candidate to be optimised away.

cheers

andrew



Re: SR standby hangs

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> This has happened again. This time we have some debug info available, 
> and can possible get more, if people tell me what will be helpful:

>     (gdb) f 2
>     #2  0x00000000005de735 in LockBufferForCleanup (buffer=310163) at
>     bufmgr.c:2432
>     2432                ProcWaitForSignal();
>     (gdb) p *bufHdr
>     $2 = {tag = {rnode = {spcNode = 16393, dbNode = 40475, relNode =
>     41880}, forkNum = MAIN_FORKNUM, blockNum = 18913}, flags = 6,
>     usage_count = 1, refcount = 1, wait_backend_pid = 9111,
>        buf_hdr_lock = 0 '\000', buf_id = 310162, freeNext = -2,
>     io_in_progress_lock = 620448, content_lock = 620449}

Well, that's pretty interesting: refcount is only 1, and the
BM_PIN_COUNT_WAITER flag is not set.  AFAICS this *must* mean that the
buffer had been pinned and whoever had it (presumably bgwriter) did
UnpinBuffer().  So it appears that the signal just plain got lost :-(,
which suggests a kernel bug.  What platform is this on, again?

> Strangely, I don't even see ProcWaitForSignal() in the frame list - I 
> shouldn't have thought it was a candidate to be optimised away.

It has a tail call to PGSemaphoreLock, so not totally surprising.
        regards, tom lane


Re: SR standby hangs

От
Andrew Dunstan
Дата:

On 04/26/2011 04:28 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> This has happened again. This time we have some debug info available,
>> and can possible get more, if people tell me what will be helpful:
>>      (gdb) f 2
>>      #2  0x00000000005de735 in LockBufferForCleanup (buffer=310163) at
>>      bufmgr.c:2432
>>      2432                ProcWaitForSignal();
>>      (gdb) p *bufHdr
>>      $2 = {tag = {rnode = {spcNode = 16393, dbNode = 40475, relNode =
>>      41880}, forkNum = MAIN_FORKNUM, blockNum = 18913}, flags = 6,
>>      usage_count = 1, refcount = 1, wait_backend_pid = 9111,
>>         buf_hdr_lock = 0 '\000', buf_id = 310162, freeNext = -2,
>>      io_in_progress_lock = 620448, content_lock = 620449}
> Well, that's pretty interesting: refcount is only 1, and the
> BM_PIN_COUNT_WAITER flag is not set.

I noticed that.

> AFAICS this *must* mean that the
> buffer had been pinned and whoever had it (presumably bgwriter) did
> UnpinBuffer().  So it appears that the signal just plain got lost :-(,
> which suggests a kernel bug.  What platform is this on, again?



CentOS 5.5, x86_64, kernel 2.6.18-194.32.1.el5

cheers

andrew





Re: SR standby hangs

От
Tom Lane
Дата:
I wrote:
> Well, that's pretty interesting: refcount is only 1, and the
> BM_PIN_COUNT_WAITER flag is not set.  AFAICS this *must* mean that the
> buffer had been pinned and whoever had it (presumably bgwriter) did
> UnpinBuffer().  So it appears that the signal just plain got lost :-(,
> which suggests a kernel bug.  What platform is this on, again?

BTW, could you confirm that the startup process's PID is 9111 as the
bufhdr suggests?  Also, it'd be good to confirm that
procglobal->startupProcPid and procglobal->startupProc point to the
startup process.  I notice that ProcSendSignal will silently do nothing
if it doesn't find the target process's PGPROC, which might have
something to do with this ...
        regards, tom lane


Re: SR standby hangs

От
Andrew Dunstan
Дата:

On 04/26/2011 04:45 PM, Tom Lane wrote:
> I wrote:
>> Well, that's pretty interesting: refcount is only 1, and the
>> BM_PIN_COUNT_WAITER flag is not set.  AFAICS this *must* mean that the
>> buffer had been pinned and whoever had it (presumably bgwriter) did
>> UnpinBuffer().  So it appears that the signal just plain got lost :-(,
>> which suggests a kernel bug.  What platform is this on, again?
> BTW, could you confirm that the startup process's PID is 9111 as the
> bufhdr suggests?

postgres  9111  9105  0 Apr02 ?        00:37:38 postgres: startup 
process   recovering 000000040000036D00000033

> Also, it'd be good to confirm that
> procglobal->startupProcPid and procglobal->startupProc point to the
> startup process.  I notice that ProcSendSignal will silently do nothing
> if it doesn't find the target process's PGPROC, which might have
> something to do with this ...

Not sure if I'm looking in the right spot, but this looks suspicious:

(gdb) p 'postmaster.c'::ProcGlobal->startupProcPid
$1 = 0
(gdb) p 'postmaster.c'::ProcGlobal->startupProc
$2 = (PGPROC *) 0x0


cheers

andrew




Re: SR standby hangs

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> (gdb) p 'postmaster.c'::ProcGlobal->startupProcPid
> $1 = 0
> (gdb) p 'postmaster.c'::ProcGlobal->startupProc
> $2 = (PGPROC *) 0x0

Oh ... you need this patch:

http://git.postgresql.org/gitweb?p=postgresql.git&a=commitdiff&h=9bb1ddec4cd998c5fbac278a54d8ad5a5011e4e1

Update to 9.0.4 ...
        regards, tom lane


Re: SR standby hangs

От
Andrew Dunstan
Дата:

On 04/26/2011 05:31 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> (gdb) p 'postmaster.c'::ProcGlobal->startupProcPid
>> $1 = 0
>> (gdb) p 'postmaster.c'::ProcGlobal->startupProc
>> $2 = (PGPROC *) 0x0
> Oh ... you need this patch:
>
> http://git.postgresql.org/gitweb?p=postgresql.git&a=commitdiff&h=9bb1ddec4cd998c5fbac278a54d8ad5a5011e4e1
>
> Update to 9.0.4 ...
>
>             

Oh, I missed that one. Thanks for the diagnosis!

cheers

andrew