Re: reindex and copy - deadlock?

От: Tom Lane
Тема: Re: reindex and copy - deadlock?
Дата: ,
Msg-id: 12457.1086969920@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: Re: reindex and copy - deadlock?  (Litao Wu)
Ответы: Re: reindex and copy - deadlock?  (Litao Wu)
Список: pgsql-performance

Скрыть дерево обсуждения

Join slow on "large" tables  (Josué Maldonado, )
 Re: Join slow on "large" tables  (Josh Berkus, )
  Re: Join slow on "large" tables  (Josué Maldonado, )
   Re: Join slow on "large" tables  (Josh Berkus, )
    Re: Join slow on "large" tables  (Josué Maldonado, )
   Re: Join slow on "large" tables  (Tom Lane, )
 Re: Join slow on "large" tables  ("Scott Marlowe", )
  Re: Join slow on "large" tables  (Josué Maldonado, )
   reindex and copy - deadlock?  (Litao Wu, )
    Re: reindex and copy - deadlock?  (Tom Lane, )
     Re: reindex and copy - deadlock?  (Litao Wu, )
      Re: reindex and copy - deadlock?  (Tom Lane, )
       Re: reindex and copy - deadlock?  (Litao Wu, )
        Re: reindex and copy - deadlock?  (Tom Lane, )
         Re: reindex and copy - deadlock?  (Litao Wu, )
          Re: reindex and copy - deadlock?  (Tom Lane, )
           Re: reindex and copy - deadlock?  (Litao Wu, )
            Re: reindex and copy - deadlock?  (Tom Lane, )
           Re: reindex and copy - deadlock?  (Litao Wu, )
            Re: reindex and copy - deadlock?  (Tom Lane, )
             Re: reindex and copy - deadlock?  (Litao Wu, )
              Re: reindex and copy - deadlock?  (Tom Lane, )
               Re: reindex and copy - deadlock?  (Litao Wu, )
                Re: reindex and copy - deadlock?  (Tom Lane, )
                 network address query  (Litao Wu, )
   Re: Join slow on "large" tables  ("Scott Marlowe", )

Litao Wu <> writes:
> (gdb) bt
> #0  0x420e8bb2 in semop () from /lib/i686/libc.so.6
> #1  0x080ffa28 in PGSemaphoreLock ()
> #2  0x08116432 in LWLockAcquire ()
> #3  0x0810f572 in LockBuffer ()
> #4  0x0807dea3 in _bt_getbuf ()
> #5  0x080813ec in _bt_leafbuild ()
> #6  0x080816a6 in _bt_leafbuild ()
> #7  0x08081b8b in _bt_leafbuild ()
> #8  0x080813cc in _bt_leafbuild ()
> #9  0x0807e1d0 in btbuild ()
> #10 0x081631c3 in OidFunctionCall3 ()
> #11 0x080920a7 in index_build ()
> #12 0x08092593 in reindex_index ()

Hmm.  I don't think I believe this backtrace.  It's obviously wrong at
lines 5-7 - _bt_leafbuild doesn't call itself nor call _bt_getbuf.
It's possible that you don't have any local symbols in this executable
and what we're seeing is the nearest global symbol, so let's ignore
that; but if we take lines 0-4 at face value, what it says is that the
REINDEX is stuck waiting for buffer lock on a buffer for a new empty
page it has just added to the new index.  This is flatly impossible.
There is no other process that could possibly be interested in that
buffer, or for that matter even be able to name it (since the new index
has a new relfilenode value that isn't even visible to any other process
yet).  I thought for a little bit that a background CHECKPOINT might be
trying to write out the new buffer, but that theory holds no water
either, because at this point in the _bt_getbuf sequence, the buffer is
not marked dirty (I just verified this by stepping through it in 7.4.2).

I can think of lots of reasons why the REINDEX might block at the
previous step of the sequence, namely acquiring a fresh buffer ... but
once it's got the buffer there is surely no reason to block.

What I'm inclined to think is that the backtrace isn't right at all.
Would it be possible for you to install a backend built with
--enable-debug and get a more reliable backtrace?

            regards, tom lane


В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: in-transaction insert performance in 7.5devel
От: Vitaly Belman
Дата:
Сообщение: Additional select fields in a GROUP BY