Re: anole: assorted stability problems

From Robert Haas
Subject Re: anole: assorted stability problems
Date
Msg-id CA+TgmoaaeRv=1120hQdTjF++Sd4G2zMA-U2-UKzJMD1vMF+CWg@mail.gmail.com
In response to Re: anole: assorted stability problems  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: anole: assorted stability problems
List pgsql-hackers
On Sun, Jun 28, 2015 at 9:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> That sucks.  It was easy to see that the old fallback barrier
>> implementation wasn't re-entrant, but this one should be.  And now
>> that I look at it again, doesn't the failure message indicate that's
>> not the problem anyway?
>
>> ! PANIC:  stuck spinlock (c00000000d6f4140) detected at lwlock.c:816
>> ! PANIC:  stuck spinlock (c00000000d72f6e0) detected at lwlock.c:770
>
> I was assuming that a leaky memory barrier was allowing the spinlock
> state to become inconsistent, or at least to be perceived as inconsistent.
> But I'm not too clear on how the barrier changes you and Andres have been
> making have affected the spinlock code.

For the most part, they haven't.  Andres did a bunch of work to add
atomics support, and overhauled the barrier implementation that I
committed to 9.2 along the way.  But that had minimal impact on
s_lock.h.

What we did do that touched s_lock.h was attempt to ensure that
SpinLockAcquire() and SpinLockRelease() function as compiler barriers,
so that it should no longer be necessary to litter the code with
"volatile" in every function that uses those.  It is possible that
this could be broken on HP-UX.  If _Asm_sched_fence() doesn't
constrain the compiler appropriately, that could explain the problems
we're seeing here.  But we're not the only ones using that incantation,
so I'm left scratching my head.
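
To illustrate what "function as compiler barriers" means above, here is a
minimal sketch, not the actual s_lock.h code.  It assumes GCC or Clang
(the __sync builtins and the empty asm with a "memory" clobber), and every
sketch_* name is hypothetical.  On HP-UX's compiler the barrier would be
spelled with _Asm_sched_fence() instead, which is exactly the piece in
question here.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not PostgreSQL's s_lock.h. */
typedef int sketch_slock_t;

/*
 * Compiler barrier (GCC/Clang assumed): the "memory" clobber tells the
 * compiler it may not move loads or stores across this point or keep
 * memory values cached in registers across it.
 */
#define sketch_compiler_barrier() __asm__ __volatile__("" ::: "memory")

static void
sketch_spin_acquire(sketch_slock_t *lock)
{
	/* Atomic test-and-set with acquire semantics; spin until we get 0. */
	while (__sync_lock_test_and_set(lock, 1))
		;
	/* Keep accesses in the critical section from being hoisted above. */
	sketch_compiler_barrier();
}

static void
sketch_spin_release(sketch_slock_t *lock)
{
	/* Keep accesses in the critical section from sinking below. */
	sketch_compiler_barrier();
	/* Atomic store of 0 with release semantics. */
	__sync_lock_release(lock);
}

typedef struct
{
	sketch_slock_t mutex;
	int			counter;
} sketch_shared;

/*
 * Because acquire and release are themselves barriers, the caller can use
 * a plain pointer here rather than a volatile-qualified one.
 */
static int
sketch_increment(sketch_shared *s)
{
	int			result;

	sketch_spin_acquire(&s->mutex);
	result = ++s->counter;
	sketch_spin_release(&s->mutex);
	return result;
}
```

If the barrier in the acquire or release path did not actually constrain
the compiler, the counter update could legally be moved outside the locked
region, and the lock state could appear inconsistent to other processes.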

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


