Re: Backends dying due to memory exhaustion--I'm stonkered

From: Tom Lane
Subject: Re: Backends dying due to memory exhaustion--I'm stonkered
Msg-id 13908.980557417@sss.pgh.pa.us
In reply to: Re: Backends dying due to memory exhaustion--I'm stonkered  (Doug McNaught <doug@wireboard.com>)
List: pgsql-general

Doug McNaught <doug@wireboard.com> writes:
> I'm running VACUUM, then VACUUM ANALYZE (the docs seem to suggest that
> you need both).  Basically my script is:

VACUUM ANALYZE is a superset of VACUUM; you do not need both.

> The example I sent was a crash during VACUUM.

Hm.  Another perfectly good theory shot to heck ;-).  It seems unlikely
that VACUUM would fail because of corrupted data inside a tuple ...
although corrupted tuple headers could kill it.  Again, though, one
would think such a crash would be repeatable.

> Another thing that springs to mind--once the crash happens, the
> database doesn't respond (or gives fatal errors) to new connections
> and to queries on existing connections.  Killing the postmaster does
> nothing--I have to send SIGTERM to all backends and the postmaster in
> order to get it to exit.  I don't know if this helps...

Now *this* is interesting.  Normally the system recovers quite nicely
from an elog(FATAL), or even from a backend coredump.  I now suspect
something must be getting corrupted in shared memory.  The next time
it happens, would you proceed as follows:
    1. kill -INT the postmaster.
    2. The backends *should* exit in response to the SIGTERM the
       postmaster will have sent them.  Any backend that survives
       more than a fraction of a second is stuck somehow.  For each
       stuck backend, in turn:
    3. kill -ABRT the backend, to create a corefile, and collect
       a gdb backtrace from the corefile.  Be careful to get the
       right corefile, if you are dealing with more than one
       database.

That should give us some idea of what's stuck (especially if you compile
with -g).
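
For anyone who would rather script the procedure above than type it
by hand, here is a minimal sketch in Python.  It is an illustration
under assumptions, not a tested tool: the function names, the
one-second grace period, and the idea of collecting the backend PIDs
beforehand (for example from ps) are all hypothetical choices, not
anything the server provides.

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def diagnose(postmaster_pid, backend_pids, send=os.kill, alive=pid_alive,
             grace_seconds=1.0):
    """Run steps 1-3 and return the PIDs of the backends that were stuck.

    backend_pids is assumed to have been collected by the caller before
    the postmaster is signalled.  send and alive are injectable so the
    logic can be exercised without touching real processes.
    """
    # Step 1: SIGINT the postmaster, which should SIGTERM its backends.
    send(postmaster_pid, signal.SIGINT)

    # Step 2: anything still alive after the grace period counts as stuck.
    time.sleep(grace_seconds)
    stuck = [pid for pid in backend_pids if alive(pid)]

    # Step 3: SIGABRT each stuck backend so that it leaves a corefile.
    for pid in stuck:
        send(pid, signal.SIGABRT)
    return stuck
```

The backtrace itself still has to be pulled out by hand: start gdb
with the postgres executable and the corefile as its two arguments,
then type bt at the gdb prompt.  Whether a corefile gets written at
all depends on the core size limit in effect for the backend
(ulimit -c).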

BTW, which version did you say you were running?  If it's less than
7.0.3 I'd recommend an update before we pursue this much further ...

            regards, tom lane
