Re: Refactoring backend fork+exec code

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Refactoring backend fork+exec code
Дата
Msg-id 20230710225043.svl7fqxecwshwc7a@awork3.anarazel.de
обсуждение исходный текст
Ответ на Refactoring backend fork+exec code  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
Re: Refactoring backend fork+exec code
Список pgsql-hackers
Hi,

On 2023-06-18 14:22:33 +0300, Heikki Linnakangas wrote:
> I started to look at the code in postmaster.c related to launching child
> processes. I tried to reduce the difference between EXEC_BACKEND and
> !EXEC_BACKEND code paths, and put the code that needs to differ behind a
> better abstraction. I started doing this to help with implementing
> multi-threading, but it doesn't introduce anything thread-related yet and I
> think this improves readability anyway.

Yes please!  This code is absolutely awful.


> From 0cb6f8d665980d30a5d2a29013000744f16bf813 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date: Sun, 18 Jun 2023 11:00:21 +0300
> Subject: [PATCH 3/9] Refactor CreateSharedMemoryAndSemaphores.
> 
> Moves InitProcess calls a little later in EXEC_BACKEND case.

What's the reason for this part? ISTM that we'd really want to get away from
plastering duplicated InitProcess() etc everywhere.

I think this might be easier to understand if you just changed did the
CreateSharedMemoryAndSemaphores() -> AttachSharedMemoryAndSemaphores() piece
in this commit, and the rest later.


> +void
> +AttachSharedMemoryAndSemaphores(void)
> +{
> +    /* InitProcess must've been called already */

Perhaps worth an assertion to make it easier to see that the order is wrong?


> From 1d89eec53c7fefa7a4a8c011c9f19e3df64dc436 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date: Mon, 12 Jun 2023 16:33:20 +0300
> Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets
> 
> We went through some effort to close them in the child process. Better to
> not hand them down to the child process in the first place.

I think Thomas has a larger version of this patch:
https://postgr.es/m/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com



> From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date: Sun, 18 Jun 2023 13:56:44 +0300
> Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs
> 
> - Move more of the work on a client socket to the child process.
> 
> - Reduce the amount of data that needs to be passed from postmaster to
>   child. (Used to pass a full Port struct, although most of the fields were
>   empty. Now we pass the much slimmer ClientSocket.)

I think there might be extensions accessing Port. Not sure if it's worth
worrying about, but ...


> --- a/src/backend/postmaster/autovacuum.c
> +++ b/src/backend/postmaster/autovacuum.c
> @@ -476,8 +476,8 @@ AutoVacLauncherMain(int argc, char *argv[])
>      pqsignal(SIGCHLD, SIG_DFL);
>  
>      /*
> -     * Create a per-backend PGPROC struct in shared memory. We must do
> -     * this before we can use LWLocks.
> +     * Create a per-backend PGPROC struct in shared memory. We must do this
> +     * before we can use LWLocks.
>       */
>      InitProcess();
>

Don't think this was intentional?


> From b33cfeb28a5419045acb659a01410b2b463bea3e Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
> Date: Sun, 18 Jun 2023 13:59:48 +0300
> Subject: [PATCH 9/9] Refactor postmaster child process launching
> 
> - Move code related to launching backend processes to new source file,
>   process_start.c

I think you might have renamed this to launch_backend.c?


> - Introduce new postmaster_child_launch() function that deals with the
>   differences between EXEC_BACKEND and fork mode.
> 
> - Refactor the mechanism of passing informaton from the parent to
>   child process. Instead of using different command-line arguments
>   when launching the child process in EXEC_BACKEND mode, pass a
>   variable-length blob of data along with all the global
>   variables. The contents of that blob depends on the kind of child
>   process being launched. In !EXEC_BACKEND mode, we use the same blob,
>   but it's simply inherited from the parent to child process.


> +const        PMChildEntry entry_kinds[] = {
> +    {"backend", BackendMain, true},
> +
> +    {"autovacuum launcher", AutoVacLauncherMain, true},
> +    {"autovacuum worker", AutoVacWorkerMain, true},
> +    {"bgworker", BackgroundWorkerMain, true},
> +    {"syslogger", SysLoggerMain, false},
> +
> +    {"startup", StartupProcessMain, true},
> +    {"bgwriter", BackgroundWriterMain, true},
> +    {"archiver", PgArchiverMain, true},
> +    {"checkpointer", CheckpointerMain, true},
> +    {"wal_writer", WalWriterMain, true},
> +    {"wal_receiver", WalReceiverMain, true},
> +};

I'd assign them with the PostmasterChildType as index, so there's no danger of
getting out of order.

const                PMChildEntry entry_kinds = {
  [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true},
  ...
}

or such should work.


I'd also use designated initializers for the fields, it's otherwise hard to
know what true means etc.

I think it might be good to put more into array. If we e.g. knew whether a
particular child type is a backend-like, and aux process or syslogger, we
could avoid the duplicated InitAuxiliaryProcess(),
MemoryContextDelete(PostmasterContext) etc calls everywhere.


> +/*
> + * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent
> + *            to what it would be if we'd simply forked on Unix, and then
> + *            dispatch to the appropriate place.
> + *
> + * The first two command line arguments are expected to be "--forkFOO"
> + * (where FOO indicates which postmaster child we are to become), and
> + * the name of a variables file that we can read to load data that would
> + * have been inherited by fork() on Unix.  Remaining arguments go to the
> + * subprocess FooMain() routine. XXX
> + */
> +void
> +SubPostmasterMain(int argc, char *argv[])
> +{
> +    PostmasterChildType child_type;
> +    char       *startup_data;
> +    size_t        startup_data_len;
> +
> +    /* In EXEC_BACKEND case we will not have inherited these settings */
> +    IsPostmasterEnvironment = true;
> +    whereToSendOutput = DestNone;
> +
> +    /* Setup essential subsystems (to ensure elog() behaves sanely) */
> +    InitializeGUCOptions();
> +
> +    /* Check we got appropriate args */
> +    if (argc < 3)
> +        elog(FATAL, "invalid subpostmaster invocation");
> +
> +    if (strncmp(argv[1], "--forkchild=", 12) == 0)
> +    {
> +        char       *entry_name = argv[1] + 12;
> +        bool        found = false;
> +
> +        for (int idx = 0; idx < lengthof(entry_kinds); idx++)
> +        {
> +            if (strcmp(entry_kinds[idx].name, entry_name) == 0)
> +            {
> +                child_type = idx;
> +                found = true;
> +                break;
> +            }
> +        }
> +        if (!found)
> +            elog(ERROR, "unknown child kind %s", entry_name);
> +    }


Hm, shouldn't we error out when called without --forkchild?


> +/* Save critical backend variables into the BackendParameters struct */
> +#ifndef WIN32
> +static bool
> +save_backend_variables(BackendParameters *param, ClientSocket *client_sock)
> +#else

There's so much of this kind of thing. Could we hide it in a struct or such
instead of needing ifdefs everywhere?



> --- a/src/backend/storage/ipc/shmem.c
> +++ b/src/backend/storage/ipc/shmem.c
> @@ -144,6 +144,8 @@ InitShmemAllocation(void)
>      /*
>       * Initialize ShmemVariableCache for transaction manager. (This doesn't
>       * really belong here, but not worth moving.)
> +     *
> +     * XXX: we really should move this
>       */
>      ShmemVariableCache = (VariableCache)
>          ShmemAlloc(sizeof(*ShmemVariableCache));

Heh. Indeed. And probably just rename it to something less insane.


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Generating code for query jumbling through gen_node_support.pl
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Autogenerate some wait events code and documentation