Re: problems on Solaris

From Andres Freund
Subject Re: problems on Solaris
Msg-id 20150527225528.GP5310@alap3.anarazel.de
In reply to Re: problems on Solaris  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: problems on Solaris  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2015-05-27 15:39:14 -0400, Robert Haas wrote:
> On Mon, May 25, 2015 at 10:05 PM, Andres Freund <andres@anarazel.de> wrote:
> > Hm. So we have an *occasional* stack size exceeded failure and an
> > occasional spinlock error in test_shm_mq. I'm inclined to think that
> > this is a shm_mq problem, and not a more general locking problem - it
> > seems likely, but not guaranteed, that that'd have materialized
> > elsewhere.
>
> I think the problem might be that the spinlock-based memory barrier is
> not re-entrant.  Suppose some kind of barrier operation is in progress,
> and we've acquired the dummy spinlock but not yet released it.  Just
> then, we receive a signal.  Since the shm_mq code sets
> set_latch_on_sigusr1, procsignal_sigusr1_handler will set MyLatch.
> SetLatch now includes barrier operations, so we'll try to acquire and
> release the spinlock despite already holding it.  Oops.

Oh wow, that's bad, and could explain a couple of the problems we're
seeing. One possible way to fix it is to replace the sequence with if
(!TAS(spin)) S_UNLOCK(spin);. But that'd mean TAS() has to be a barrier,
even if the lock isn't free - which e.g. isn't the case for PowerPC's
implementation :(
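
To make the hazard and the proposed fix concrete, here is a minimal sketch in standalone C11. It is not PostgreSQL code: dummy_lock, tas(), s_unlock(), barrier_blocking(), barrier_trylock() and interrupted_barrier() are all hypothetical stand-ins, and the "signal handler" is simulated by a plain function call made while the lock is held. The blocking variant is given a bounded spin count only so that the demonstration terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dummy spinlock behind the fallback barrier. */
static atomic_flag dummy_lock = ATOMIC_FLAG_INIT;

/*
 * Test-and-set. Returns false if we acquired the lock and true if it was
 * already held, i.e. the same "nonzero means busy" convention as TAS().
 * In C11 this call is a seq_cst read-modify-write whether or not the flag
 * was already set. The concern in the mail is that a platform's TAS()
 * need not give that guarantee when the lock is not free.
 */
static bool tas(atomic_flag *lock)
{
    return atomic_flag_test_and_set(lock);
}

static void s_unlock(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}

/*
 * Barrier as "acquire, then release". Real code would spin until the lock
 * is free; max_spins exists only so this sketch can report failure instead
 * of hanging. Returns true if the barrier completed.
 */
static bool barrier_blocking(atomic_flag *lock, int max_spins)
{
    for (int i = 0; i < max_spins; i++)
    {
        if (!tas(lock))
        {
            s_unlock(lock);
            return true;
        }
    }
    return false;               /* lock never became free */
}

/* Proposed variant: a single TAS, released only if we actually got it. */
static void barrier_trylock(atomic_flag *lock)
{
    if (!tas(lock))
        s_unlock(lock);
}

/*
 * Simulates a signal arriving after the outer barrier has taken the lock
 * but before it has released it. Returns true if the nested barrier in the
 * simulated handler was able to finish.
 */
static bool interrupted_barrier(bool use_trylock)
{
    bool acquired = !tas(&dummy_lock);  /* outer barrier takes the lock */
    bool handler_finished;

    assert(acquired);

    /* --- simulated signal handler runs here, lock still held --- */
    if (use_trylock)
    {
        barrier_trylock(&dummy_lock);
        handler_finished = true;
    }
    else
        handler_finished = barrier_blocking(&dummy_lock, 1000);

    s_unlock(&dummy_lock);              /* outer barrier releases */
    return handler_finished;
}
```

Under these assumptions, interrupted_barrier(false) returns false: the nested barrier can never obtain the lock its own process already holds, which in real code means spinning until the stuck-spinlock check fires. interrupted_barrier(true) returns true because the nested call does one test-and-set and moves on, leaving the outer holder's lock untouched. The sketch only shows the re-entrancy side; whether the try-lock form is a valid barrier still depends on TAS() ordering memory even when it fails, as noted above.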


