Re: random backend crashes - how to debug ( Is crash dump handler released ? )

From: Craig Ringer
Subject: Re: random backend crashes - how to debug ( Is crash dump handler released ? )
Msg-id: 4DFDC258.2040006@postnewspapers.com.au
In response to: random backend crashes - how to debug ( Is crash dump handler released ? )  (BangarRaju Vadapalli <BangarRaju.Vadapalli@infor.com>)
Responses: Re: random backend crashes - how to debug ( Is crash dump handler released ? )  (BangarRaju Vadapalli <BangarRaju.Vadapalli@infor.com>)
List: pgsql-general
On 06/14/2011 10:26 PM, BangarRaju Vadapalli wrote:
> Hi Everybody,
>
> We are using PostGRE 8.4 version and experiencing random backend
> crashes. We have enabled logging and are able to see some logging
> happening in pg_log directory but not of much use. Here are the logs.

Examination of the full-length logs sent off-list shows these lines
leading up to the crash:

(crash1) 2011-06-15 13:55:59 IST postgres epimart ERROR:  XX000: could
not open relation base/2850136/3344343_vm: A blocking operation was
interrupted by a call to WSACancelBlockingCall.

(crash2) 2011-06-15 14:22:40 IST postgres epimart ERROR:  XX000: could
not open relation base/2850136/3352537_fsm: A blocking operation was
interrupted by a call to WSACancelBlockingCall.

... in both cases followed by:

XX000: cannot abort transaction 19859931, it was already committed

then a Windows runtime message reporting that the backend crashed.
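
For anyone scanning their own logs for the same pattern, here is a minimal sketch in Python that pulls the relation path out of "could not open relation" errors and names the fork involved. It relies on the standard on-disk layout (base/<database OID>/<relfilenode>, with a `_vm` suffix for the visibility map and `_fsm` for the free space map). The function name and regex are my own illustration, not part of any PostgreSQL tool.

```python
import re

# Matches the path in messages such as
#   "could not open relation base/2850136/3344343_vm: ..."
# \s+ is used between words because mail clients may wrap the line.
RELATION_RE = re.compile(
    r"could\s+not\s+open\s+relation\s+base/(\d+)/(\d+)(?:_(vm|fsm))?"
)

# An unmatched optional group comes back from findall() as "".
FORK_NAMES = {"": "main", "vm": "visibility map", "fsm": "free space map"}


def parse_relation_errors(log_text):
    """Return a list of (database_oid, relfilenode, fork_name) tuples."""
    return [
        (int(db_oid), int(relfilenode), FORK_NAMES[fork])
        for db_oid, relfilenode, fork in RELATION_RE.findall(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "2011-06-15 13:55:59 IST postgres epimart ERROR:  XX000: could\n"
        "not open relation base/2850136/3344343_vm: A blocking operation was\n"
        "interrupted by a call to WSACancelBlockingCall.\n"
    )
    for entry in parse_relation_errors(sample):
        print(entry)
```

The relfilenode can then be looked up in the affected database with something like `SELECT relname FROM pg_class WHERE relfilenode = 3344343;`. Bear in mind that a relation's relfilenode changes when it is rewritten (TRUNCATE, CLUSTER, VACUUM FULL), so the lookup is only meaningful soon after the error.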

Ideas?



After some off-list conversation, we've established that:

- The crash is part of a batch process where the OP loads data from
external sources. It's the 11th stage of a 12 stage process, and cannot
be easily separated into a small self-contained test case. If the OP
runs just the 11th stage standalone, without having just run the prior
stages, the crash does not occur. It only crashes if the whole process
is run in one go.

(OP: please confirm that my summary is accurate, as it's condensed from
several emails).

- The crash is reproducible on 9.0. It hasn't yet been reproduced on
9.1 because the OP is having some problems with views on 9.1 that he'll
be posting about separately.

- We can't seem to get a crash dump or attach a debugger to get a
backtrace. I built a copy of an early version of the crash dump
handler, from before it was integrated into 9.1, so he could load it as
a DLL into 8.4. It works when the backend is intentionally crashed but
doesn't capture the crash that's causing the problem.

- I still haven't been able to confirm how the batch process in question
works. Does it all run in a single connection with a single transaction?
Or is it a multi-connection affair with multiple scripts / programs
involved? If Bangar Raju could describe this part in more detail that
would be very helpful.

--
Craig Ringer
