Re: common signal handler protection

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: common signal handler protection
Дата
Msg-id 20231122225945.3kgclsgz5lqmtnan@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: common signal handler protection  (Nathan Bossart <nathandbossart@gmail.com>)
Ответы Re: common signal handler protection  (Nathan Bossart <nathandbossart@gmail.com>)
Список pgsql-hackers
Hi,

On 2023-11-22 15:59:44 -0600, Nathan Bossart wrote:
> Subject: [PATCH v2 1/3] Check that MyProcPid == getpid() in all signal
>  handlers.
> 
> In commit 97550c0711, we added a similar check to the SIGTERM
> handler for the startup process.  This commit adds this check to
> all signal handlers installed with pqsignal().  This is done by
> using a wrapper function that performs the check before calling the
> actual handler.
> 
> The hope is that this will offer more general protection against
> grandchildren processes inadvertently modifying shared memory due
> to inherited signal handlers.

It's a bit unclear here what grandchildren refers to - it's presumably in
reference to postmaster, but the preceding text doesn't even mention
postmaster. I'd probably just say "child processes of the current process.


> +
> +#ifdef PG_SIGNAL_COUNT            /* Windows */
> +#define PG_NSIG (PG_SIGNAL_COUNT)
> +#elif defined(NSIG)
> +#define PG_NSIG (NSIG)
> +#else
> +#define PG_NSIG (64)            /* XXX: wild guess */
> +#endif

Perhaps worth adding a static assert for at least a few common types of
signals being below that value? That way we'd see a potential issue without
needing to reach the code path.


> +/*
> + * Except when called with SIG_IGN or SIG_DFL, pqsignal() sets up this function
> + * as the handler for all signals.  This wrapper handler function checks that
> + * it is called within a process that the server knows about, and not a
> + * grandchild process forked by system(3), etc.

Similar comment to earlier - the grandchildren bit seems like a dangling
reference. And also too specific - I think we could encounter this in single
user mode as well?

Perhaps it could be reframed to "postgres processes, as determined by having
called InitProcessGlobals()"?


>This check ensures that such
> + * grandchildren processes do not modify shared memory, which could be
> + * detrimental.

"could be" seems a bit euphemistic :)


> From b77da9c54590a71eb9921d6f16bf4ffb0543e87e Mon Sep 17 00:00:00 2001
> From: Nathan Bossart <nathan@postgresql.org>
> Date: Fri, 17 Nov 2023 14:00:12 -0600
> Subject: [PATCH v2 2/3] Centralize logic for restoring errno in signal
>  handlers.
> 
> Presently, we rely on each individual signal handler to save the
> initial value of errno and then restore it before returning if
> needed.  This is easily forgotten and, if missed, often goes
> undetected for a long time.

It's also just verbose :)


> From 5734e0cf5f00bbd74504b45934f68e1c2c1290bd Mon Sep 17 00:00:00 2001
> From: Nathan Bossart <nathan@postgresql.org>
> Date: Fri, 17 Nov 2023 22:09:24 -0600
> Subject: [PATCH v2 3/3] Revert "Avoid calling proc_exit() in processes forked
>  by system()."
> 
> Thanks to commit XXXXXXXXXX, this check in the SIGTERM handler for
> the startup process is now obsolete and can be removed.  Instead of
> leaving around the elog(PANIC, ...) calls that are now unlikely to
> be triggered and the dead function write_stderr_signal_safe(), I've
> opted to just remove them for now.  Thanks to modern version
> control software, it will be trivial to dig those up if they are
> ever needed in the future.
> 
> This reverts commit 97550c0711972a9856b5db751539bbaf2f88884c.
> 
> Reviewed-by: ???
> Discussion: ???
> ---
>  src/backend/postmaster/startup.c | 17 +----------------
>  src/backend/storage/ipc/ipc.c    |  4 ----
>  src/backend/storage/lmgr/proc.c  |  8 --------
>  src/backend/utils/error/elog.c   | 28 ----------------------------
>  src/include/utils/elog.h         |  6 ------
>  5 files changed, 1 insertion(+), 62 deletions(-)
> 
> diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
> index 83dbed86b9..f40acd20ff 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -19,8 +19,6 @@
>   */
>  #include "postgres.h"
>  
> -#include <unistd.h>
> -
>  #include "access/xlog.h"
>  #include "access/xlogrecovery.h"
>  #include "access/xlogutils.h"
> @@ -113,20 +111,7 @@ static void
>  StartupProcShutdownHandler(SIGNAL_ARGS)
>  {
>      if (in_restore_command)
> -    {
> -        /*
> -         * If we are in a child process (e.g., forked by system() in
> -         * RestoreArchivedFile()), we don't want to call any exit callbacks.
> -         * The parent will take care of that.
> -         */
> -        if (MyProcPid == (int) getpid())
> -            proc_exit(1);
> -        else
> -        {
> -            write_stderr_signal_safe("StartupProcShutdownHandler() called in child process\n");
> -            _exit(1);
> -        }
> -    }
> +        proc_exit(1);
>      else
>          shutdown_requested = true;
>      WakeupRecovery();

Hm. I wonder if this indicates an issue.  Do the preceding changes perhaps
make it more likely that a child process of the startup process could hang
around for longer, because the signal we're delivering doesn't terminate child
processes, because we'd just reset the signal handler?  I think it's fine for
the startup process, because we ask the startup process to shut down with
SIGTERM, which defaults to exiting.

But we do have a few processes that we do ask to shut down with other
signals, wich do not trigger an exit by default, e.g. Checkpointer, archiver,
walsender are asked to shut down using SIGUSR2 IIRC.  The only one where that
could be an issue is archiver, I guess?


> diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
> index 6591b5d6a8..1904d21795 100644
> --- a/src/backend/storage/ipc/ipc.c
> +++ b/src/backend/storage/ipc/ipc.c
> @@ -103,10 +103,6 @@ static int    on_proc_exit_index,
>  void
>  proc_exit(int code)
>  {
> -    /* not safe if forked by system(), etc. */
> -    if (MyProcPid != (int) getpid())
> -        elog(PANIC, "proc_exit() called in child process");
> -
>      /* Clean up everything that must be cleaned up */
>      proc_exit_prepare(code);

> diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
> index e9e445bb21..5b663a2997 100644
> --- a/src/backend/storage/lmgr/proc.c
> +++ b/src/backend/storage/lmgr/proc.c
> @@ -806,10 +806,6 @@ ProcKill(int code, Datum arg)
>  
>      Assert(MyProc != NULL);
>  
> -    /* not safe if forked by system(), etc. */
> -    if (MyProc->pid != (int) getpid())
> -        elog(PANIC, "ProcKill() called in child process");
> -
>      /* Make sure we're out of the sync rep lists */
>      SyncRepCleanupAtProcExit();
>  
> @@ -930,10 +926,6 @@ AuxiliaryProcKill(int code, Datum arg)
>  
>      Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
>  
> -    /* not safe if forked by system(), etc. */
> -    if (MyProc->pid != (int) getpid())
> -        elog(PANIC, "AuxiliaryProcKill() called in child process");
> -
>      auxproc = &AuxiliaryProcs[proctype];
>  
>      Assert(MyProc == auxproc);

I think we should leave these checks. It's awful to debug situations where a
proc gets reused and the cost of the checks is irrelevant. The checks don't
just protect against child processes, they also protect e.g. against calling
those functions twice (we IIRC had cases of that).

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Changing references of password encryption to hashing
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Optionally using a better backtrace library?