Re: Quite strange crash

From Tom Lane
Subject Re: Quite strange crash
Date
Msg-id 18163.978974498@sss.pgh.pa.us
In response to Re: Quite strange crash  (Denis Perchine <dyp@perchine.com>)
Responses Re: Quite strange crash  (ncm@zembu.com (Nathan Myers))
Re: Quite strange crash  (Denis Perchine <dyp@perchine.com>)
List pgsql-hackers
Denis Perchine <dyp@perchine.com> writes:
>>>>>>> FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
>>>>> 
>>>>> Were there any errors before that?

> Actually you can have a look on the logs yourself.

Well, I found a smoking gun:

Jan  7 04:27:51 mx postgres[2501]: FATAL 1:  The system is shutting down

PID 2501 had been running:

Jan  7 04:25:44 mx postgres[2501]: query: vacuum verbose lazy;

What seems to have happened is that 2501 curled up and died, leaving
one or more buffer spinlocks locked.  Roughly one spinlock timeout
later, at 04:29:07, we have 1008 complaining of a stuck spinlock.
So that fits.
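
To illustrate the failure mode described above, here is a minimal sketch of a spinlock whose waiter gives up after a bounded number of attempts. It is not PostgreSQL's s_lock code: the names demo_spinlock, demo_spin_acquire, demo_spin_release and demo_stuck_after_holder_dies are hypothetical, and the spin-count limit stands in for the real timeout. The point is only that a holder which exits without releasing leaves every later waiter to time out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical illustration only; not PostgreSQL's s_lock implementation. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock, giving up after max_spins failed attempts.
 * Returns true if acquired, false if the lock appears to be stuck. */
bool demo_spin_acquire(demo_spinlock *lock, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        /* test_and_set returns the previous value: false means we got it. */
        if (!atomic_flag_test_and_set_explicit(&lock->held,
                                               memory_order_acquire))
            return true;
    }
    return false;
}

/* Release the lock so that other waiters can proceed. */
void demo_spin_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/* Simulate the sequence from the log: one "backend" takes the lock and
 * dies without releasing it, then a second one spins until it gives up.
 * Returns true if the second acquirer saw a stuck lock. */
bool demo_stuck_after_holder_dies(void)
{
    demo_spinlock lock = DEMO_SPINLOCK_INIT;

    /* First backend: acquires, then "dies" (no release is ever issued). */
    if (!demo_spin_acquire(&lock, 1))
        return false;

    /* Second backend: spins, then reports the lock as stuck. */
    bool acquired = demo_spin_acquire(&lock, 1000000UL);
    if (!acquired)
        fprintf(stderr, "demo: stuck spinlock, giving up\n");
    return !acquired;
}
```

Calling demo_stuck_after_holder_dies() from a small main() should report the stuck lock, since nothing ever clears the flag once the first holder is gone.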

The real question is what happened to 2501?  None of the other backends
reported a SIGTERM signal, so the signal did not come from the
postmaster.

Another interesting datapoint: there is a second place in this logfile
where one single backend reports SIGTERM while its brethren keep running:

Jan  7 04:30:47 mx postgres[4269]: query: vacuum verbose;
...
Jan  7 04:38:16 mx postgres[4269]: FATAL 1:  The system is shutting down

There is something pretty fishy about this.  You aren't by any chance
running the postmaster under a ulimit setting that might cut off
individual backends after a certain amount of CPU time, are you?
What signal does a ulimit violation deliver on your machine, anyway?
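
One way to check the signal question on a given machine is a small test program. The sketch below assumes a POSIX system with setrlimit() and RLIMIT_CPU; the function name demo_cpu_limit_signal is hypothetical. POSIX specifies SIGXCPU when the soft CPU limit is exceeded, but the program reports whatever the local kernel actually delivers rather than assuming it.

```c
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, lower its soft CPU-time limit to soft_secs seconds, and
 * let it burn CPU until the kernel stops it.
 * Returns the number of the signal that terminated the child,
 * 0 if the child exited without being signalled, or -1 on error. */
int demo_cpu_limit_signal(unsigned soft_secs)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit lim;

        /* Keep the existing hard limit; change only the soft limit. */
        if (getrlimit(RLIMIT_CPU, &lim) != 0)
            _exit(2);
        lim.rlim_cur = soft_secs;
        if (setrlimit(RLIMIT_CPU, &lim) != 0)
            _exit(2);

        /* Make sure the signal is not ignored via an inherited setting. */
        signal(SIGXCPU, SIG_DFL);

        /* Consume CPU time until the limit is hit. */
        volatile unsigned long spin = 0;
        for (;;)
            spin++;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the return value of demo_cpu_limit_signal(1) against SIGXCPU, SIGTERM and SIGKILL shows which one the local system uses. If the hard limit is already at or below the requested soft limit, the result may differ, so treat the output as a data point rather than proof.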
        regards, tom lane

