On 7 May 2011 18:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The aspect of this that *is* relevant is that if you haven't
> deliberately defeated the interlock (and thereby put your data at risk),
> you won't be able to start a new postmaster until all the old
> shmem-attached children are gone. And that's why having a child with a
> very long reaction time for parent death represents a denial of service.

Alright. I don't suppose it would be acceptable to have the startup
process signal any auxiliary process that it finds (through ps) with
init as its parent? Within the handler for that signal (I suppose it's
SIGUSR2), each auxiliary process would take appropriate action,
typically just waking itself up through a SetLatch() call, once it has
independently verified that it is in fact orphaned.

If we find orphans, we could perform a "nap and check" loop within the
startup process (probably tighter than 1 second per iteration), until
the shmem-attached children that are liable to block us from starting
a new postmaster have exited.

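To make the "nap and check" idea a bit more concrete, here is a rough
sketch of what I have in mind. It is only an illustration: am_orphaned(),
process_exists() and wait_for_orphans_to_exit() are made-up names, not
existing PostgreSQL functions. It assumes that the PIDs of the orphans
have already been collected somehow, and that an orphan is reparented to
init with pid 1, which need not hold on every platform.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L		/* for kill() and nanosleep() */
#endif

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical check an auxiliary process could run in its signal
 * handler: assumes orphans are reparented to init (pid 1).
 */
static bool
am_orphaned(void)
{
	return getppid() == 1;
}

/*
 * Hypothetical probe: signal 0 delivers nothing, it only tests whether
 * the pid exists. EPERM means it exists but we may not signal it.
 */
static bool
process_exists(pid_t pid)
{
	return kill(pid, 0) == 0 || errno == EPERM;
}

/*
 * Hypothetical nap-and-check loop for the startup process. Returns true
 * once none of the given pids exist any more, or false if some are still
 * around after max_naps naps of nap_ms milliseconds each.
 */
static bool
wait_for_orphans_to_exit(const pid_t *orphans, size_t n,
						 long nap_ms, int max_naps)
{
	struct timespec nap = {
		.tv_sec = nap_ms / 1000,
		.tv_nsec = (nap_ms % 1000) * 1000000L
	};

	for (int naps = 0;; naps++)
	{
		bool		any_left = false;

		for (size_t i = 0; i < n; i++)
		{
			if (process_exists(orphans[i]))
				any_left = true;
		}

		if (!any_left)
			return true;		/* nothing left to block startup */
		if (naps >= max_naps)
			return false;		/* give up; caller decides what to do */

		nanosleep(&nap, NULL);
	}
}
```

Something like wait_for_orphans_to_exit(pids, npids, 100, 50) would check
every 100ms for up to about five seconds. Those numbers are arbitrary.
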
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services