Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session

From: Bruce Momjian
Subject: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session
Msg-id: 201008241257.o7OCvYt12456@momjian.us
In reply to: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [BUGS] BUG #5305: Postgres service stops when closing Windows session  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Robert Haas wrote:
> [moving to -hackers]
> 
> On Thu, Aug 19, 2010 at 9:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > I suspect this is the same problem as bug #4897, and probably also the
> > same problem as this:
> > http://archives.postgresql.org/pgsql-bugs/2009-08/msg00114.php
> >
> > and maybe also this and this:
> > http://archives.postgresql.org/pgsql-bugs/2010-02/msg00179.php
> > http://archives.postgresql.org/pgsql-admin/2009-05/msg00105.php
> >
> > Unfortunately, it seems that no one has been able to get a stack trace yet.
> 
> Bruce pointed out yet another report of this problem to me:
> 
> http://archives.postgresql.org/pgsql-general/2010-08/msg00550.php
> 
> After some discussion with Magnus, I think what is going on here is
> that the postmaster kicks off a new child process, which terminates
> before it actually starts running our code, either in OS-supplied code
> or some sort of "filter" like anti-spam or anti-virus software.  It's
> presumably NOT dying in our code because - at least AFAICS - we don't
> exit(128) anywhere.  One way we could possibly improve the situation
> is to not treat this as a child crash - that is, don't do a
> crash-and-restart cycle; just treat that backend as having done
> elog(FATAL).  The trick is that you need a reliable way to distinguish
> between a regular child crash and an "early" child crash.  Magnus
> suggested perhaps we could create a mutex that the child grabs before
> mapping shared memory; the postmaster could check whether the mutex
> had been taken.  If so, we handle the crash normally; if not, we just
> chalk it up to experience and continue on.
> 
> This isn't really a "fix" for the bug in the sense that the nicest
> thing of all would be to prevent the child from exiting abnormally in
> the first place.  But it's far from clear that we can control that.
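
To make the proposal above concrete, here is a minimal sketch of the decision the postmaster would have to make when a child exits. It is only an illustration in Python, not PostgreSQL code: the names ChildRecord, startup_marker_taken, and classify_child_exit are hypothetical, and the boolean flag stands in for "the child grabbed the mutex before mapping shared memory".

```python
from dataclasses import dataclass
from enum import Enum


class ChildExit(Enum):
    CLEAN = "clean"        # child exited with status 0
    EARLY_EXIT = "early"   # child died before it could touch shared memory
    CRASH = "crash"        # child died after startup; existing crash handling applies


@dataclass
class ChildRecord:
    pid: int
    # Hypothetical stand-in for the proposed mutex: set once the child has
    # taken it, which it would do just before mapping shared memory.
    startup_marker_taken: bool = False


def classify_child_exit(child: ChildRecord, exit_code: int) -> ChildExit:
    """Decide how a parent process might treat a child's exit status."""
    if exit_code == 0:
        return ChildExit.CLEAN
    if not child.startup_marker_taken:
        # The child never reached the point of mapping shared memory, so it
        # cannot have corrupted it: log the failure and keep running.
        return ChildExit.EARLY_EXIT
    # The child may have been attached to shared memory when it died.
    return ChildExit.CRASH
```

Under these assumptions, a child that exits with 128 without ever taking the marker is classified as EARLY_EXIT, while the same exit code from a child that did take it still goes down the normal crash path.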

This URL has some interesting details on our problem:
http://stackoverflow.com/questions/139090/getexitcodeprocess-returns-128

Error code 128 is identified as:
ERROR_WAIT_NO_CHILDREN  128 (0x80)  There are no child processes to wait for.

and the suggested cause is:
Have a look at Desktop Heap memory. Essentially the desktop heap issue
comes down to exhausted resources (e.g. starting too many processes).
When your app runs out of these resources, one of the symptoms is that
you won't be able to start a new process, and the call to CreateProcess
will fail with code 128.
 

My guess is that at the time of CreateProcess(), there is enough desktop
heap memory, but at some later time, perhaps caused by a logout, there
isn't and the process never gets started.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +

