8.4-vintage problem in postmaster.c

From Alvaro Herrera
Subject 8.4-vintage problem in postmaster.c
Msg-id 1289575843-sup-9048@alvh.no-ip.org
Responses Re: 8.4-vintage problem in postmaster.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

Stefan Kaltenbrunner reported a postmaster problem to me via IM.  I
thought I had nailed down the bug, but after a more careful reading of
the code, it turns out I was wrong.

The reported problem is that postmaster shuts itself down with this
error message:

2010-11-12 10:19:05 CET FATAL:  no free slots in PMChildFlags array

I thought that canAcceptConnections() was confused about what
the result of CountChildren() meant, but apparently not.

This is a change from the 8.3 code that didn't have the ChildSlots
stuff -- previously, if canAcceptConnections failed to report
CAC_TOOMANY, it would just fail later when trying to add the backend to
the shared-inval queue, as stated in the comment therein.  In the new
code, however, failure to keep an accurate count means that we fail
later in AssignPostmasterChildSlot with a FATAL error, leading to overall
shutdown.
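
To make the failure mode concrete, here is a toy model of the two-step
sequence described above: an admission check based on a child count,
followed by a slot assignment from a fixed-size array.  This is only a
sketch, not the postmaster code.  NUM_SLOTS, can_accept(), assign_slot(),
simulate() and the "miscount" parameter are all made-up names, and the
relationship between the limit and the array size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_SLOTS 4             /* hypothetical size of the slot array */

bool slot_used[NUM_SLOTS];

/* Admission check, loosely in the spirit of canAcceptConnections():
 * refuse politely once the counted children reach the limit. */
bool can_accept(int counted_children)
{
    return counted_children < NUM_SLOTS;
}

/* Slot assignment: returns a 1-based slot number, or -1 if the array is
 * full.  The real code reports that situation as a FATAL error. */
int assign_slot(void)
{
    for (int i = 0; i < NUM_SLOTS; i++)
    {
        if (!slot_used[i])
        {
            slot_used[i] = true;
            return i + 1;
        }
    }
    return -1;
}

/* Admit up to `attempts` connections, none of which ever exits.
 * `miscount` is subtracted from the true number of live children before
 * the admission check, modelling a count that disagrees with slot usage.
 * Returns the 1-based attempt at which slot assignment failed, or 0 if
 * every attempt was either admitted or refused by the admission check. */
int simulate(int attempts, int miscount)
{
    int live = 0;

    assert(attempts >= 0);
    memset(slot_used, 0, sizeof(slot_used));

    for (int n = 1; n <= attempts; n++)
    {
        if (!can_accept(live - miscount))
            continue;           /* the "too many clients" path */
        if (assign_slot() < 0)
            return n;           /* the FATAL path */
        live++;
    }
    return 0;
}
```

With an accurate count, simulate(10, 0) returns 0: the fifth and later
attempts are refused by the admission check and the array never overflows.
With the count off by one, simulate(10, 1) returns 5: the check lets a
fifth child through and the slot assignment is what fails.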

In postmaster.c, this all happens before forking, so I see no way for
the system to be confused due to multiple processes starting in
parallel.


If you suspect that this may have to do with some race condition on
starting many backends quickly, you would probably be right.  The
evidence from the log (which thankfully is set to DEBUG3, though most
other settings about it seem to be rather broken) says that there were
many backends starting just before the FATAL message:

2010-11-12 10:18:55 CET DEBUG:  forked new backend, pid=2632 socket=348
2010-11-12 10:18:55 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:18:55 CET DEBUG:  forked new backend, pid=2972 socket=348
2010-11-12 10:18:55 CET DEBUG:  forked new backend, pid=2724 socket=348
2010-11-12 10:18:57 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:18:57 CET DEBUG:  forked new backend, pid=2724 socket=348
2010-11-12 10:18:57 CET DEBUG:  forked new backend, pid=2632 socket=348
2010-11-12 10:19:00 CET DEBUG:  forked new backend, pid=2724 socket=348
2010-11-12 10:19:01 CET DEBUG:  forked new backend, pid=2972 socket=348
2010-11-12 10:19:01 CET DEBUG:  forked new backend, pid=2724 socket=348
2010-11-12 10:19:02 CET DEBUG:  forked new backend, pid=2984 socket=348
2010-11-12 10:19:02 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2984 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2984 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2972 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2724 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2972 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2904 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=1280 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2984 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=2904 socket=348
2010-11-12 10:19:04 CET DEBUG:  forked new backend, pid=840 socket=348
2010-11-12 10:19:05 CET DEBUG:  forked new backend, pid=2724 socket=348

This is Windows 2000 Server --- I guess the PIDs being reused rather
quickly is not something to worry particularly about.  (Also note that
log_line_prefix does not include the PID, so it's not easy to learn much
more from the log, according to Stefan.)
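
For reference, a prefix that would carry the PID on every line could look
like the fragment below.  This is a suggestion only, not the setting on
Stefan's machine; %t is the timestamp escape and %p the process ID escape.

```
# postgresql.conf (sketch): timestamp plus process ID on each log line
log_line_prefix = '%t [%p] '
```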

-- 
Álvaro Herrera <alvherre@alvh.no-ip.org>

