Re: FATAL: stuck spinlock

From Jakub Ouhrabka
Subject Re: FATAL: stuck spinlock
Date
Msg-id Pine.LNX.4.33.0205012258550.2649-100000@u-pl1.ms.mff.cuni.cz
In response to FATAL: stuck spinlock  (Jakub Ouhrabka <jouh8664@ss1000.ms.mff.cuni.cz>)
List pgsql-general
> > FATAL: s_lock(0x40361030) at lwlock.c:236, stuck spinlock. Aborting.
>
> Oh?  That shouldn't happen.  More details please?

I don't know what would be interesting for you, but I'll try...

There is one central database and 10 other databases; let's call them
the application databases. There are 4 daemons which receive messages
from outside and insert them into the central database. They then check
(select) whether there are messages for them to send and, if so, send
them and update the sent rows. All 4 daemons work with the same two
tables: one for received messages and one for messages to be sent. There
is one field in each table which determines which daemon a row is for or
from. These daemons run in infinite loops: try to receive at most 100
messages from outside, then try to send at most 100, so there are no
notifications, triggers, etc...
Then there are 10 other daemons, one for each application database,
which do nearly the same thing: look in the central database for
received messages for the application, insert them into the application
database, look in the application database for messages to be sent, and
insert them into the central database... Also in an infinite loop...
When the error happened there was very low traffic, with no concurrent or
nearly concurrent messages. In fact, around that time, more precisely
after the error message (after the daemons reconnected to the databases),
there was one message received...
Because of concurrent access to tables in the central database I had to
remove all foreign key constraints in that database. With foreign key
constraints, deadlocks were detected very often... I know that this is a
known issue...

I was running this setup, but with only 5-6 applications, for months
without problems, even under heavy traffic... Exactly this setup had been
running for approx. 10 days without problems.

Is it possible that this is not a postgres issue but a hardware issue?
Recently I had problems with the memory in this server, but it has been
replaced. Maybe something else is also damaged... This server is an
Athlon with 2.4.16, running postgres 7.2.1 from the Debian package. There
is nothing else running on this server.

Is there anything else you would like to know?

thanks,        kuba


