Order of operations in SubPostmasterMain()

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Order of operations in SubPostmasterMain()
Дата
Msg-id 4157.1475178360@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Order of operations in SubPostmasterMain()  (Greg Stark <stark@mit.edu>)
Re: Order of operations in SubPostmasterMain()  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
I noticed that buildfarm member culicidae, which is running an
EXEC_BACKEND build on Linux, occasionally falls over like this:

FATAL:  could not reattach to shared memory (key=6280001, addr=0x7fa9df845000): Invalid argument

That's probably because Andres failed to disable ASLR on that machine,
but the exact cause isn't too important right now.  What I'm finding
distressing is that that message doesn't show up on the client side,
making it look like a postmaster crash:

*** /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/contrib/intarray/expected/_int.out    2016-09-15
22:02:39.512951557+0000
 
--- /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/contrib/intarray/results/_int.out    2016-09-29
17:52:56.921948612+0000
 
***************
*** 1,568 ****
! .... lots ...
--- 1,3 ----
! psql: server closed the connection unexpectedly
!     This probably means the server terminated abnormally
!     before or while processing the request.

The reason we don't see anything is that backend libpq hasn't been
initialized yet, so it won't send the ereport message to the client.

The original design of SubPostmasterMain() intended that such cases
would be reported, because it runs BackendInitialize (which calls
pq_init()) before CreateSharedMemoryAndSemaphores (which is where
the reattach used to happen --- the comments at postmaster.c 4730ff
and 4747ff reflect this expectation).  We broke it when we added
the earlier reattach call at lines 4669ff.

We could probably refactor things enough so that we do pq_init()
before PGSharedMemoryReAttach().  It would be a little bit ugly,
and it would fractionally increase the chance of a reattach failure
because pq_init() palloc's a few KB worth of buffers.  I'm not quite
sure if it's worth it; thoughts?  In any case the mentioned comments
are obsolete and need to be moved/rewritten.

Another thing that I'm finding distressing while I look at this is
the placement of ClosePostmasterPorts().  That needs to happen *early*,
because as long as the child process is holding open the postmaster side
of the postmaster death pipe, we are risking unhappiness.  Not only is
it not particularly early, it's after process_shared_preload_libraries()
which could invoke arbitrarily stupid stuff.  Whether or not we do
anything about moving pq_init(), I'm thinking we'd better move the
ClosePostmasterPorts() call so that it's done as soon as possible,
probably right after read_backend_variables() which loads the data
it needs.

Comments?
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: pageinspect: Hash index support
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: pageinspect: Hash index support