Re: [HACKERS] Problem after removal of exec(), help

From Goran Thyni
Subject Re: [HACKERS] Problem after removal of exec(), help
Date
Msg-id 358F8D5E.7EE08AC6@bildbasen.se
In response to Problem after removal of exec(), help  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] Problem after removal of exec(), help
List pgsql-hackers
Bruce Momjian wrote:
>
> Since the removal of exec(), Thomas has seen, and I have confirmed that
> if a backend crashes, and the postmaster must reset the shared memory,
> no backends can connect anymore.  One way to reproduce it is to run the
> regression tests, which on their last test will crash for an un-related
> reason.  However, it will not allow you to restart any more backends.
>
> The error it gets is:
>
> Failed Assertion("!((((unsigned long)nextElem) > ShmemBase)):", File: "shmqueue.c", Line: 83)
> !((((unsigned long)nextElem) > ShmemBase)) (0) [No such file or directory]
>
> In this case nextElem = ShmemBase, so it is not greater.  Removing the
> Assert() still does not make things work, so there must be something
> else.
>
> Now, the problem is probably not at that exact spot, but somewhere
> deeper.  There are two differences between the old non-exec() behavior
> and new behavior.  In the old setup, the backend had all its global
> variables initialized, while in the new no-exec case, they take the
> global variable values from the postmaster.  Second, the old setup had
> each backend attaching to the shared memory, while the new setup has
> them inheriting the shared memory from the fork().
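
The first difference described above (a forked child that never calls exec() starts with whatever global values the parent held at fork() time) can be illustrated with a small sketch. This is not PostgreSQL code: demo_state and value_seen_by_forked_child() are hypothetical names made up for the illustration, and the child reports the value through its exit status, so only 0..255 can be passed.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a global that the parent process has modified. */
static int demo_state = 0;

/*
 * Set a global in the parent, fork() without exec(), and return the value
 * the child observed (passed back via its exit status, so 0..255 only).
 * Returns -1 on failure.
 */
int value_seen_by_forked_child(int parent_value)
{
    pid_t pid;
    int status;

    demo_state = parent_value;      /* parent changes the global */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(demo_state);          /* child reports what it inherited */

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling value_seen_by_forked_child(42) should give back 42, because the child's copy of demo_state is whatever the parent had at the moment of the fork(). A child that went through exec() would instead start again from the initial value 0.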

Bruce,
I have not looked into the specifics yet,
but I suggest looking into what is done when
the child process exits.
This (pg_exit() et al.) caused some bugs
when we introduced Unix domain sockets, and
it is not the first place one looks. :-(
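
To make the suspicion concrete, here is a minimal sketch of why exit-time code deserves a look once children are created by fork() alone: handlers registered with atexit() in the parent are inherited, so they also run when the child calls exit(). This is only an illustration under assumed names (demo_cleanup, inherited_handler_runs), not the actual PostgreSQL exit path; the handler simply writes one byte to a pipe so the parent can observe that it ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;          /* write end of a pipe; -1 when inactive */
static int handler_installed = 0;

/* Hypothetical cleanup hook registered by the parent process. */
static void demo_cleanup(void)
{
    if (report_fd >= 0) {
        if (write(report_fd, "x", 1) < 0) {
            /* nothing useful to do on failure in this sketch */
        }
    }
}

/*
 * Fork a child that ends with exit() (call_plain_exit != 0) or _exit().
 * Returns how many times the inherited handler ran in the child,
 * or -1 on failure.
 */
int inherited_handler_runs(int call_plain_exit)
{
    int fds[2];
    pid_t pid;
    char buf[8];
    ssize_t n;

    if (!handler_installed) {
        if (atexit(demo_cleanup) != 0)
            return -1;
        handler_installed = 1;
    }
    if (pipe(fds) < 0)
        return -1;
    report_fd = fds[1];

    pid = fork();
    if (pid < 0) {
        report_fd = -1;
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        if (call_plain_exit)
            exit(0);                /* runs the handler inherited from the parent */
        _exit(0);                   /* skips atexit handlers entirely */
    }

    report_fd = -1;                 /* parent's own exit must not write */
    close(fds[1]);
    n = read(fds[0], buf, sizeof buf);  /* 0 means EOF: the handler never ran */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int) n;
}
```

With exit() the function should report 1, with _exit() it should report 0. If a real handler tore down state shared with the parent instead of writing a byte, a crashing or exiting child could leave the parent in a bad state, which is the kind of thing worth checking here.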

    regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
