Обсуждение: SIGQUIT on archiver child processes maybe not such a hot idea?
My buildfarm animal dromedary ran out of disk space yesterday, which
I found rather surprising because the last time I'd looked it had
tens of GB to spare.  On investigation, the problem was lots and lots
of core images in /cores, which is where macOS drops them (by default
at least).  It looked like I was getting one new core image per
buildfarm run, even successful runs.  Even odder, they mostly seemed
to be images from /bin/cp, not Postgres.
After investigation, the mechanism that's causing that is that the
src/test/recovery/t/010_logical_decoding_timelines.pl test shuts
down its replica server with a mode-immediate stop, which causes
that postmaster to shut down all its children with SIGQUIT, and
in particular that signal propagates to a "cp" command that the
archiver process is executing.  The "cp" is unsurprisingly running
with default SIGQUIT handling, which per the signal man page
includes dumping core.
This makes me wonder whether we shouldn't be using some other signal
to shut down archiver subprocesses.  It's not real cool if we're
spewing cores all over the place.  Admittedly, production servers
are likely running with "ulimit -c 0" on most modern platforms,
so this might not be a huge problem in the field; but accumulation
of core files could be a problem anywhere that's configured to allow
server core dumps.
I suspect the reason we've not noticed this in the buildfarm is that most
of those platforms are configured to dump core into the data directory,
where it'll be thrown away when we clean up after the run.  But aside
from macOS, the machines running recent systemd releases are likely
accumulating cores somewhere behind the scenes, since systemd has seen
fit to insert itself into core-handling along with everything else.
Ideally, perhaps, we'd be using SIGINT not SIGQUIT to shut down
non-Postgres child processes.  But redesigning the system's signal
handling to make that possible seems like a bit of a mess.
Thoughts?
            regards, tom lane
			
		From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > After investigation, the mechanism that's causing that is that the > src/test/recovery/t/010_logical_decoding_timelines.pl test shuts > down its replica server with a mode-immediate stop, which causes > that postmaster to shut down all its children with SIGQUIT, and > in particular that signal propagates to a "cp" command that the > archiver process is executing. The "cp" is unsurprisingly running > with default SIGQUIT handling, which per the signal man page > includes dumping core. We've experienced this (core dump in the data directory by an archive command) years ago. Related to this, the example ofusing cp in the PostgreSQL manual is misleading, because cp doesn't reliably persist the WAL archive file. > This makes me wonder whether we shouldn't be using some other signal > to shut down archiver subprocesses. It's not real cool if we're > spewing cores all over the place. Admittedly, production servers > are likely running with "ulimit -c 0" on most modern platforms, > so this might not be a huge problem in the field; but accumulation > of core files could be a problem anywhere that's configured to allow > server core dumps. We enable the core dump in production to help the investigation just in case. > Ideally, perhaps, we'd be using SIGINT not SIGQUIT to shut down > non-Postgres child processes. But redesigning the system's signal > handling to make that possible seems like a bit of a mess. > > Thoughts? We're using a shell script and a command that's called in the shell script. That is: archive_command = 'call some_shell_script.sh ...' [some_shell_script.sh] ulimit -c 0 trap SIGQUIT to just exit on the receipt of the signal call some_command to copy file some_command also catches SIGQUIT just exit. It copies and syncs the file. I proposed something in this line as below, but I couldn't respond to Peter's review comments due to other tasks. Does anyonethink it's worth resuming this? https://www.postgresql.org/message-id/7E37040CF3804EA5B018D7A022822984@maumau Regards Takayuki Tsunakawa
On Mon, Sep 02, 2019 at 12:27:09AM +0000, Tsunakawa, Takayuki wrote: > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] >> After investigation, the mechanism that's causing that is that the >> src/test/recovery/t/010_logical_decoding_timelines.pl test shuts >> down its replica server with a mode-immediate stop, which causes >> that postmaster to shut down all its children with SIGQUIT, and >> in particular that signal propagates to a "cp" command that the >> archiver process is executing. The "cp" is unsurprisingly running >> with default SIGQUIT handling, which per the signal man page >> includes dumping core. > > We've experienced this (core dump in the data directory by an > archive command) years ago. Related to this, the example of using > cp in the PostgreSQL manual is misleading, because cp doesn't > reliably persist the WAL archive file. The previous talks about having pg_copy are still where they were a couple of years ago as we did not agree on which semantics it should have. If we could move forward with that and update the documentation from its insanity that would be great and... The signal handling is something else we could customize in a more favorable way with the archiver. Anyway, switching from something else than SIGQUIT to stop the archiver will not prevent any other tools from generating core dumps with this other signal. > We enable the core dump in production to help the investigation just in case. So do I in some of the stuff I work on. > some_command also catches SIGQUIT just exit. It copies and syncs the file. > > I proposed something in this line as below, but I couldn't respond to Peter's review comments due to other tasks. Doesanyone think it's worth resuming this? > > https://www.postgresql.org/message-id/7E37040CF3804EA5B018D7A022822984@maumau And I was looking for this thread a couple of lines ago :) Thanks. -- Michael
Вложения
At Mon, 2 Sep 2019 15:51:34 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20190902065134.GE1841@paquier.xyz> > On Mon, Sep 02, 2019 at 12:27:09AM +0000, Tsunakawa, Takayuki wrote: > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > >> After investigation, the mechanism that's causing that is that the > >> src/test/recovery/t/010_logical_decoding_timelines.pl test shuts > >> down its replica server with a mode-immediate stop, which causes > >> that postmaster to shut down all its children with SIGQUIT, and > >> in particular that signal propagates to a "cp" command that the > >> archiver process is executing. The "cp" is unsurprisingly running > >> with default SIGQUIT handling, which per the signal man page > >> includes dumping core. > > > > We've experienced this (core dump in the data directory by an > > archive command) years ago. Related to this, the example of using > > cp in the PostgreSQL manual is misleading, because cp doesn't > > reliably persist the WAL archive file. > > The previous talks about having pg_copy are still where they were a > couple of years ago as we did not agree on which semantics it should > have. If we could move forward with that and update the documentation > from its insanity that would be great and... The signal handling is > something else we could customize in a more favorable way with the > archiver. Anyway, switching from something else than SIGQUIT to stop > the archiver will not prevent any other tools from generating core > dumps with this other signal. Since we are allowing OPs to use arbitrary command as archive_command, providing a replacement with non-standard signal handling for a specific command doesn't seem a general solution to me. Couldn't we have pg_system(a tentative name), which intercepts SIGQUIT then sends SIGINT to children? Might be need to resend SIGQUIT after some interval, though.. > > We enable the core dump in production to help the investigation just in case. > > So do I in some of the stuff I work on. > > > some_command also catches SIGQUIT just exit. It copies and syncs the file. > > > > I proposed something in this line as below, but I couldn't respond to Peter's review comments due to other tasks. Doesanyone think it's worth resuming this? > > > > https://www.postgresql.org/message-id/7E37040CF3804EA5B018D7A022822984@maumau > > And I was looking for this thread a couple of lines ago :) > Thanks. # Is there any means to view the whole of a thread from archive? # I'm a kind of reluctant to wander among messages like a rat in # a maze:p regards. -- Kyotaro Horiguchi NTT Open Source Software Center
From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com] > Since we are allowing OPs to use arbitrary command as > archive_command, providing a replacement with non-standard signal > handling for a specific command doesn't seem a general solution > to me. Couldn't we have pg_system(a tentative name), which > intercepts SIGQUIT then sends SIGINT to children? Might be need > to resend SIGQUIT after some interval, though.. The same idea that you referred to as pg_system occurred to me, too. But I wondered if the archiver process can get thepid of its child (shell? archive_command?), while keeping the capabilities of system() (= the shell). Even if we fork()and then system(), doesn't the OS send SIGQUIT to any descendents of the archiver when postmaster sends SIGQUIT tothe child process group? > # Is there any means to view the whole of a thread from archive? > # I'm a kind of reluctant to wander among messages like a rat in > # a maze:p You can see the whole thread by clicking the "Whole Thread" link. Regards Takayuki Tsunakawa
On Tue, Sep 3, 2019 at 2:43 PM Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com]
> > Since we are allowing OPs to use arbitrary command as
> > archive_command, providing a replacement with non-standard signal
> > handling for a specific command doesn't seem a general solution
> > to me. Couldn't we have pg_system(a tentative name), which
> > intercepts SIGQUIT then sends SIGINT to children? Might be need
> > to resend SIGQUIT after some interval, though..
>
> The same idea that you referred to as pg_system occurred to me, too.  But I wondered if the archiver process can get
thepid of its child (shell? archive_command?), while keeping the capabilities of system() (= the shell).  Even if we
fork()and then system(), doesn't the OS send SIGQUIT to any descendents of the archiver when postmaster sends SIGQUIT
tothe child process group? 
So, to recap what's happening here, we have a tree of processes like this:
postmaster
-> archiver
   -> sh
      -> cp [user-supplied archiving command]
The archiver is a process group leader, because it called setsid().
The postmaster's signal_child() does kill(pid, ...) and also
kill(-pid, ...), so the kernel sends SIGQUIT to archiver (twice), sh
and cp.  As for what they do with the signal, it depends on timing:
1.  The archiver normally exits immediately in pgarch_exit(), but
while system() is running, SIGQUIT and SIGINT are ignored (see POSIX).
2.  sh normally uses SIGINT to break out of loops etc, but while it's
waiting for a subprocess, it also ignores SIGQUIT and SIGINT (see
POSIX).
3.  cp inherits the default disposition and (unless it handles it
specially) dumps core.
I think the general idea here is that interactive shells and similar
things want to ignore signals from users typing ^C (SIGINT) or ^\
(SIGQUIT) so they can affect  just the thing that's actually running
at this moment, not the tree of processes waiting.
Yeah, I guess we could have our own pg_system() function that does
roughly what system() does, namely fork(), then execl() in the child
and waitpid() in the parent, but the child could begin a new process
group with setsid() before running execl() (so that it no longer gets
SIGQUIT with the postmaster signals the archiver), and the parent
could record pg_system_child_pid when forking, and install a QUIT
handler that does kill(-pg_system_child_pid, SIGTERM), as well as
setting a flag that will cause its main loop to exit (but not before
it has run waitpid()).  With some carefully placed blocks and unblocks
and ignores, to avoid races.
That all sounds like a lot of work though, and it might be easier to
just make an exception and use SIGTERM to shut down the archiver, as I
think Tom was suggesting.  Unfortunately we have the same problem
elsewhere, where we use popen().  I just wrote a C program that does
just "sleep(60)", ran it with COPY FROM PROGRAM, then sent SIGQUIT to
the postmaster, and got a dumped core.
--
Thomas Munro
https://enterprisedb.com
			
		At Tue, 3 Sep 2019 02:42:19 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote in <0A3221C70F24FB45833433255569204D1FD1609C@G01JPEXMBYT05> > From: Kyotaro Horiguchi [mailto:horikyota.ntt@gmail.com] > > Since we are allowing OPs to use arbitrary command as > > archive_command, providing a replacement with non-standard signal > > handling for a specific command doesn't seem a general solution > > to me. Couldn't we have pg_system(a tentative name), which > > intercepts SIGQUIT then sends SIGINT to children? Might be need > > to resend SIGQUIT after some interval, though.. > > The same idea that you referred to as pg_system occurred to me, > too. But I wondered if the archiver process can get the pid of > its child (shell? archive_command?), while keeping the > capabilities of system() (= the shell). Even if we fork() and > then system(), doesn't the OS send SIGQUIT to any descendents I imagnined that we don't use system(3) at all. It would be almost simplified version of system(3), but doesn't use it. > of the archiver when postmaster sends SIGQUIT to the child > process group? Mmmm.. That's bad. The commit 3ad0728c81 intends to send SIGQUIT to the programs under system() for archiver process. I found the discusion for the patch. https://www.postgresql.org/message-id/flat/24849.1163826231%40sss.pgh.pa.us#c987fc18a4b9113c6276ae8b5d85abba > Reimplementing system() seems pretty ugly, but maybe we have no choice. > It strikes me that system() has a race condition as defined anyway, > because if a signal arrives between blocking the handler and issuing the > fork(), it'll disappear into the ether; and the same at the end of the > routine. .. >> Alternatively perhaps Postgres really ought to be using USR1/USR2 or other >> signals that library routines won't think they have any business rearranging. > > The existing signal assignments were all picked for what seem to me > to be good reasons; I'm disinclined to change them. In any case, the > important point here is that we'd really like an archive or recovery > script, or for that matter any command executed via system() from a > backend, to abort when the parent backend is SIGINT'd or SIGQUIT'd. But now we know that sending it to grand-children is wrong in a sense that that leads to left-alone unwanted core files. But the behavior is already knwon at the time. So, Now I know that we need to revert that in certain extent if we want to stop the core-dumping behavior... I'm inclined to think that we should just document that behavior.. > > # Is there any means to view the whole of a thread from archive? > > # I'm a kind of reluctant to wander among messages like a rat in > > # a maze:p > > You can see the whole thread by clicking the "Whole Thread" link. What??? Oops!!! I found it among "Whole Mssage", "Download mbox" and "Resend email". Sorry for my slipperly eyes and thanks! regrds. -- Kyotaro Horiguchi NTT Open Source Software Center
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> But now we know that sending it to grand-children is wrong in a
> sense that that leads to left-alone unwanted core files. But the
> behavior is already knwon at the time.
> So, Now I know that we need to revert that in certain extent if
> we want to stop the core-dumping behavior...
Yeah.  After thinking about this more, I'm inclined to propose that
we just change what the postmaster does, as per attached patch.
A couple of questions arise:
* Would it be better to substitute SIGTERM instead of SIGINT?
The POSIX default handling is the same for both, but some programs
might interpret them differently.
* With this patch, our own processes would see SIGQUIT then
SIGINT (or SIGTERM).  Would any of them misbehave?  I think not
(and this patch does pass check-world) but it might be a good
idea to double-check.
            regards, tom lane
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 62dc93d..e4736a1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -3956,6 +3956,10 @@ PostmasterStateMachine(void)
  * archive_recovery script, or SIGINT a script being run by a backend via
  * system().
  *
+ * SIGQUIT is a special case, in that the POSIX-spec default handling for
+ * it includes dumping core, which we don't really want for non-Postgres
+ * processes.  Hence, we send SIGINT not SIGQUIT to the process group.
+ *
  * There is a race condition for recently-forked children: they might not
  * have executed setsid() yet.  So we signal the child directly as well as
  * the group.  We assume such a child will handle the signal before trying
@@ -3972,12 +3976,15 @@ signal_child(pid_t pid, int signal)
     {
         case SIGINT:
         case SIGTERM:
-        case SIGQUIT:
         case SIGSTOP:
         case SIGKILL:
             if (kill(-pid, signal) < 0)
                 elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
             break;
+        case SIGQUIT:
+            if (kill(-pid, SIGINT) < 0)
+                elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), SIGINT);
+            break;
         default:
             break;
     }
			
		On 9/3/19 12:04 PM, Tom Lane wrote: > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: >> But now we know that sending it to grand-children is wrong in a >> sense that that leads to left-alone unwanted core files. But the >> behavior is already knwon at the time. > >> So, Now I know that we need to revert that in certain extent if >> we want to stop the core-dumping behavior... > > Yeah. After thinking about this more, I'm inclined to propose that > we just change what the postmaster does, as per attached patch. > > A couple of questions arise: > > * Would it be better to substitute SIGTERM instead of SIGINT? > The POSIX default handling is the same for both, but some programs > might interpret them differently. I prefer SIGTERM, but FWIW pgBackRest handles them both the same way. -- -David david@pgmasters.net
Greetings, * Michael Paquier (michael@paquier.xyz) wrote: > On Mon, Sep 02, 2019 at 12:27:09AM +0000, Tsunakawa, Takayuki wrote: > > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > >> After investigation, the mechanism that's causing that is that the > >> src/test/recovery/t/010_logical_decoding_timelines.pl test shuts > >> down its replica server with a mode-immediate stop, which causes > >> that postmaster to shut down all its children with SIGQUIT, and > >> in particular that signal propagates to a "cp" command that the > >> archiver process is executing. The "cp" is unsurprisingly running > >> with default SIGQUIT handling, which per the signal man page > >> includes dumping core. > > > > We've experienced this (core dump in the data directory by an > > archive command) years ago. Related to this, the example of using > > cp in the PostgreSQL manual is misleading, because cp doesn't > > reliably persist the WAL archive file. > > The previous talks about having pg_copy are still where they were a > couple of years ago as we did not agree on which semantics it should > have. If we could move forward with that and update the documentation > from its insanity that would be great and... The signal handling is > something else we could customize in a more favorable way with the > archiver. Anyway, switching from something else than SIGQUIT to stop > the archiver will not prevent any other tools from generating core > dumps with this other signal. Any tools being used for archive command (which should basically be things designed to be used as such and certainly not cp...) should be prepared to handle what PG ends up doing here. I don't think we should change to a different signal because it'll make 'cp' do something different. If there's a good reason to use a different signal, great. In other words, I think I agree with Tom that maybe we should be using SIGINT here, but not so much because of exactly what cp does but rather because that's a more appropriate signal, as shown by what the default handling for those signals is. Thanks, Stephen
Вложения
Greetings, * David Steele (david@pgmasters.net) wrote: > On 9/3/19 12:04 PM, Tom Lane wrote: > > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes: > >> But now we know that sending it to grand-children is wrong in a > >> sense that that leads to left-alone unwanted core files. But the > >> behavior is already knwon at the time. > > > >> So, Now I know that we need to revert that in certain extent if > >> we want to stop the core-dumping behavior... > > > > Yeah. After thinking about this more, I'm inclined to propose that > > we just change what the postmaster does, as per attached patch. > > > > A couple of questions arise: > > > > * Would it be better to substitute SIGTERM instead of SIGINT? > > The POSIX default handling is the same for both, but some programs > > might interpret them differently. > > I prefer SIGTERM, but FWIW pgBackRest handles them both the same way. Yeah, I wondered about that too, perhaps SIGTERM is better. I'm not really particular either way. Thanks, Stephen
Вложения
Stephen Frost <sfrost@snowman.net> writes:
> * David Steele (david@pgmasters.net) wrote:
>> On 9/3/19 12:04 PM, Tom Lane wrote:
>>> * Would it be better to substitute SIGTERM instead of SIGINT?
>>> The POSIX default handling is the same for both, but some programs
>>> might interpret them differently.
>> I prefer SIGTERM, but FWIW pgBackRest handles them both the same way.
> Yeah, I wondered about that too, perhaps SIGTERM is better.  I'm not
> really particular either way.
Yeah, after thinking about it, SIGTERM is probably what to use.
It's less likely that some random program would think it should
maybe not terminate in response.  Catching SIGINT is pretty common
among "interactive" programs, and while there doesn't seem an
obvious reason to be using something like that below a PG process,
you never know.
I looked around for potential problems arising from my other
point about our own processes seeing SIGQUIT then SIGTERM.
For the most part this shouldn't be an issue because the
SIGQUIT handlers will just do _exit(2) anyway.  However,
pgarch.c is a bit out of step.  For one thing, its SIGTERM
handler has a comment saying the postmaster will never send
SIGTERM, which needs to be adjusted.  For another, its
SIGQUIT handler does exit(1) not _exit(2), which seems rather
dubious ... should we make it more like the rest?  I think
the reasoning there might've been that if some DBA decides to
SIGQUIT the archiver, we don't need to force a database-wide
reset; but why exactly should we tolerate that?
            regards, tom lane
			
		From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > SIGTERM, which needs to be adjusted. For another, its > SIGQUIT handler does exit(1) not _exit(2), which seems rather > dubious ... should we make it more like the rest? I think > the reasoning there might've been that if some DBA decides to > SIGQUIT the archiver, we don't need to force a database-wide > reset; but why exactly should we tolerate that? postmaster doesn't distinguish return codes other than 0 for the archiver, and just starts the archiver unless postmasteris shutting down. So we can use _exit(2) like the other children. Can't we use SIGKILL instead of SIGINT/SIGTERM to stop the grandchildren, just in case they are slow to respond to or ignoreSIGINT/SIGTERM? That matches the idea of pg_ctl's immediate shutdown. (Windows cannot stop grandchildren because kill() in src/port/kill.c doesn't support the process group... That's a separatetopic.) Regards Takayuki Tsunakawa
At Wed, 11 Sep 2019 01:36:15 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote in <0A3221C70F24FB45833433255569204D1FD33579@G01JPEXMBYT05> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > > SIGTERM, which needs to be adjusted. For another, its > > SIGQUIT handler does exit(1) not _exit(2), which seems rather > > dubious ... should we make it more like the rest? I think > > the reasoning there might've been that if some DBA decides to > > SIGQUIT the archiver, we don't need to force a database-wide > > reset; but why exactly should we tolerate that? > > postmaster doesn't distinguish return codes other than 0 for the archiver, and just starts the archiver unless postmasteris shutting down. So we can use _exit(2) like the other children. > > Can't we use SIGKILL instead of SIGINT/SIGTERM to stop the grandchildren, just in case they are slow to respond to or ignoreSIGINT/SIGTERM? That matches the idea of pg_ctl's immediate shutdown. Perhaps +1.. immediate -> SIGKILL fast -> SIGTERM? > (Windows cannot stop grandchildren because kill() in src/port/kill.c doesn't support the process group... That's a separatetopic.) reagards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 9/10/19 9:36 PM, Tsunakawa, Takayuki wrote: > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] >> SIGTERM, which needs to be adjusted. For another, its >> SIGQUIT handler does exit(1) not _exit(2), which seems rather >> dubious ... should we make it more like the rest? I think >> the reasoning there might've been that if some DBA decides to >> SIGQUIT the archiver, we don't need to force a database-wide >> reset; but why exactly should we tolerate that? > > Can't we use SIGKILL instead of SIGINT/SIGTERM to stop the grandchildren, just in case they are slow to respond to or ignoreSIGINT/SIGTERM? That matches the idea of pg_ctl's immediate shutdown. -1, at least not immediately. Archivers can be complex processes and they should be given the chance to do a graceful shutdown. Regards, -- -David david@pgmasters.net
From: David Steele [mailto:david@pgmasters.net] > > Can't we use SIGKILL instead of SIGINT/SIGTERM to stop the grandchildren, > just in case they are slow to respond to or ignore SIGINT/SIGTERM? That > matches the idea of pg_ctl's immediate shutdown. > > -1, at least not immediately. Archivers can be complex processes and > they should be given the chance to do a graceful shutdown. I agree that the user's archiver program should receive the chance for graceful stop in smart or fast shutdown. But I thinkin immediate shutdown, all should stop immediately. That's what I expect from the word "immediate." If the grandchildren left running don't disturb the cleanup of PostgreSQL's resources (shared memory, file/directory access,etc.) or restart of PostgreSQL, we may well be able to just advice the grandchildren to stop immediately with SIGINT/SIGTERM. However, for example, in the failover of shared-disk HA clustering, when the clustering software stops PostgreSQLwith "pg_ctl stop -m immediate" and then tries to unmount the file systems for $PGDATA and archived WAL, the unmount may take time or fail due to the access from PostgreSQL's grandchildren. Regards Takayuki Tsunakawa
At Wed, 11 Sep 2019 11:01:24 +0900 (Tokyo Standard Time), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
<20190911.110124.96874741.horikyota.ntt@gmail.com>
> At Wed, 11 Sep 2019 01:36:15 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote in
<0A3221C70F24FB45833433255569204D1FD33579@G01JPEXMBYT05>
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > > SIGTERM, which needs to be adjusted.  For another, its
> > > SIGQUIT handler does exit(1) not _exit(2), which seems rather
> > > dubious ... should we make it more like the rest?  I think
> > > the reasoning there might've been that if some DBA decides to
> > > SIGQUIT the archiver, we don't need to force a database-wide
> > > reset; but why exactly should we tolerate that?
> > 
> > postmaster doesn't distinguish return codes other than 0 for the archiver, and just starts the archiver unless
postmasteris shutting down.  So we can use _exit(2) like the other children.
 
> > 
> > Can't we use SIGKILL instead of SIGINT/SIGTERM to stop the grandchildren, just in case they are slow to respond to
orignore SIGINT/SIGTERM?  That matches the idea of pg_ctl's immediate shutdown.
 
> 
> Perhaps +1..  immediate -> SIGKILL  fast -> SIGTERM?
We send SIGUSR2 to archive while fast shutdown. It would be
enough for pg_system to do signal(SNGINT, DIG_DFL) andsignal
signal(SIGQUIT, SIG_IGN) then remember the child's pid somewhere.
Then each process send SIGKILL to the remembered process in
sigquit handler.  (I'm not sure what happens if kill(0,
SIGKILL)).
regards.
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index f84f882c4c..1fe7b5dd0a 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -108,6 +108,50 @@ static bool pgarch_archiveXlog(char *xlog);
 static bool pgarch_readyXlog(char *xlog);
 static void pgarch_archiveDone(char *xlog);
 
+pid_t system_pid = -1;
+
+static void
+pg_system_kill()
+{
+    if (system_pid >= 0)
+        kill(system_pid, SIGKILL);
+}
+
+static int
+pg_system(char *cmd)
+{
+  int status;
+  long result;
+
+
+  system_pid = fork();
+  if (system_pid == 0)
+  {
+      char *new_argv[4] = {"sh", "-c", cmd, NULL};
+
+      pqsignal(SIGINT, SIG_DFL);
+      pqsignal(SIGQUIT, SIG_IGN);
+      (void) execve ("/bin/sh", (char * const *) new_argv, __environ);
+      _exit(127);
+  }
+
+  if (system_pid < 0)
+      return -1;
+
+  do
+  {
+      result = waitpid(system_pid, &status, 0);
+  } while (result == -1L && errno == EINTR);
+
+  if (result != system_pid)
+  {
+      system_pid = -1;
+      return -1;
+  }
+
+  system_pid = -1;
+  return status;
+}
 
 /* ------------------------------------------------------------
  * Public functions called from postmaster follow
@@ -255,6 +299,9 @@ PgArchiverMain(int argc, char *argv[])
 static void
 pgarch_exit(SIGNAL_ARGS)
 {
+    if (postgres_signal_arg == SIGQUIT)
+        pg_system_kill();
+
     /* SIGQUIT means curl up and die ... */
     exit(1);
 }
@@ -620,7 +667,7 @@ pgarch_archiveXlog(char *xlog)
     snprintf(activitymsg, sizeof(activitymsg), "archiving %s", xlog);
     set_ps_display(activitymsg, false);
 
-    rc = system(xlogarchcmd);
+    rc = pg_system(xlogarchcmd);
     if (rc != 0)
     {
         /*
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index a5446d54bb..d9b85a5228 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -3964,6 +3964,8 @@ PostmasterStateMachine(void)
 static void
 signal_child(pid_t pid, int signal)
 {
+    if (pid == PgArchPID)
+        elog(LOG, "Send %d to archiver", signal);
     if (kill(pid, signal) < 0)
         elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) pid, signal);
 #ifdef HAVE_SETSID