Re: What to do when dynamic shared memory control segment is corrupt

From: Tom Lane
Subject: Re: What to do when dynamic shared memory control segment is corrupt
Msg-id: 28565.1529339413@sss.pgh.pa.us
In response to: What to do when dynamic shared memory control segment is corrupt (Sherrylyn Branchaw <sbranchaw@gmail.com>)
Responses: Re: What to do when dynamic shared memory control segment is corrupt
List: pgsql-general

Sherrylyn Branchaw <sbranchaw@gmail.com> writes:
> We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> We recently experienced two similar outages on two different prod
> databases. The error messages from the logs were as follows:
> LOG:  server process (PID 138529) was terminated by signal 6: Aborted

Hm ... were these installations built with --enable-cassert?  If not,
an abort trap seems pretty odd.

> In one case, the logs recorded
> LOG:  all server processes terminated; reinitializing
> LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...

> In the other case, the logs recorded
> LOG:  all server processes terminated; reinitializing
> LOG:  dynamic shared memory control segment is corrupt
> LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...

Those "incomplete data" messages are quite unexpected and disturbing.
I don't know of any mechanism within Postgres proper that would result
in corruption of the postmaster.pid file that way.  (I wondered briefly
if trying to start a conflicting postmaster would result in such a
situation, but experimentation here says not.)  I'm suspicious that
this may indicate a bug or unwarranted assumption in whatever scripts
you use to start/stop the postmaster.  Whether that is at all related
to your crash issue is hard to say, but it bears looking into.
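
For anyone auditing such scripts, here is a minimal Python sketch of the
kind of check the log message seems to describe. It rests on an assumption
drawn only from the message wording: adding line N to postmaster.pid
requires the N-1 lines before it to be complete, that is, at least N-1
newlines already in the file. The function names are hypothetical and this
is not the Postgres source.

```python
from pathlib import Path


def newline_count(contents: str) -> int:
    """Count the newline-terminated lines present in the file contents."""
    return contents.count("\n")


def can_add_line(contents: str, target_line: int) -> bool:
    """Assumed rule: writing line N needs N-1 complete lines before it."""
    return newline_count(contents) >= target_line - 1


def inspect_pid_file(path: str, target_line: int = 7) -> tuple[int, bool]:
    """Read a pid file and report (newlines found, enough for target_line)."""
    contents = Path(path).read_text()
    return newline_count(contents), can_add_line(contents, target_line)
```

Running something like inspect_pid_file() against the pid file before and
after each step of a start/stop script may show which step truncates or
rewrites it. A file holding only the PID line would report one newline,
matching the "found only 1 newlines" text above.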

> My question is whether the corrupt shared memory control segment, and the
> failure of Postgres to automatically restart, mean the database should not
> be automatically started up, and if there's something we should be doing
> before restarting.

No, that looks like fairly typical crash recovery to me: corrupt shared
memory contents are expected and recovered from after a crash.  However,
we don't expect postmaster.pid to get mucked with.

            regards, tom lane

