Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

From: Andres Freund
Subject: Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Msg-id: 20140211130757.GE31598@awork2.anarazel.de
In reply to: Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  ("MauMau" <maumau307@gmail.com>)
Responses: Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  ("MauMau" <maumau307@gmail.com>)
List: pgsql-hackers

On 2014-02-11 21:46:04 +0900, MauMau wrote:
> From: "Andres Freund" <andres@2ndquadrant.com>
> >which means they manipulate the lwWaitLink queue without
> >protection. That's done intentionally. The code tries to protect against
> >corruption of the list due to a woken up backend acquiring a lock (this
> >or an independent one) by only continuing when the lwWaiting flag is set
> >to false. Unfortunately there's absolutely no guarantee that a) the
> >assignment to lwWaitLink and lwWaiting are done in that order b) that
> >the stores are done in-order from the POV of other backends.
> >So what we need to do is to acquire a write barrier between the
> >assignments to lwWaitLink and lwWaiting, i.e.
> >       proc->lwWaitLink = NULL;
> >       pg_write_barrier();
> >       proc->lwWaiting = false;
> >the reader side already uses an implicit barrier by using spinlocks.
> 
> I've got a report from one customer that they encountered a hang during
> performance benchmarking.  They were using PostgreSQL 9.2.4.  I remember
> that the stack trace showed many backends blocked forever at LWLockAcquire()
> during btree insert operation.  I'm not sure this has something to do with
> what you are raising, but the release notes for 9.2.5/6 don't suggest any
> fixes for this.  So I felt there is something wrong with lwlocks.
> 
> Do you think that your question could cause my customer's problem --
> backends block at lwlock forever?

It's x86, right? Then it's unlikely to be actual unordered memory
accesses, but if the compiler reordered:

       LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "release waiter");
       proc = head;
       head = proc->lwWaitLink;
       proc->lwWaitLink = NULL;
       proc->lwWaiting = false;
       PGSemaphoreUnlock(&proc->sem);

to

       LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "release waiter");
       proc = head;
       proc->lwWaiting = false;
       head = proc->lwWaitLink;
       proc->lwWaitLink = NULL;
       PGSemaphoreUnlock(&proc->sem);

which it is permitted to do, yes, that could cause symptoms like you
describe.
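
A minimal sketch of the ordering the quoted proposal asks for, assuming C11
atomics as a stand-in for pg_write_barrier(). The Waiter struct and the
function names are hypothetical and only mirror the lwWaitLink/lwWaiting
fields; this is not the PostgreSQL code itself. The release fence keeps both
the compiler and the CPU from moving the lwWaitLink store past the lwWaiting
store:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two PGPROC fields discussed in the thread. */
typedef struct Waiter
{
    struct Waiter *lwWaitLink;  /* next waiter in the queue */
    atomic_bool    lwWaiting;   /* true while the waiter is still queued */
} Waiter;

/*
 * Detach the first waiter and mark it as woken; returns the new queue head.
 * The fence sits between the two stores, where the quoted proposal places
 * pg_write_barrier().
 */
static Waiter *
wake_first_waiter(Waiter *head)
{
    Waiter *proc = head;

    assert(proc != NULL);
    head = proc->lwWaitLink;
    proc->lwWaitLink = NULL;

    /* Stores above may not be reordered after the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&proc->lwWaiting, false, memory_order_relaxed);

    return head;
}

/* Waiter side: an acquire load pairs with the release fence above. */
static bool
still_waiting(Waiter *proc)
{
    return atomic_load_explicit(&proc->lwWaiting, memory_order_acquire);
}
```

Once still_waiting() returns false for a waiter, that waiter is also guaranteed
to see its lwWaitLink as NULL, so it can safely queue itself on another lock.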

Any chance you have the binaries the customer ran back then around?
Disassembling that piece of code might give you a hint whether that's a
possible cause.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


