Re: Refactoring backend fork+exec code

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Refactoring backend fork+exec code
Дата
Msg-id 119551e6-60f6-49b2-ae04-a382a89f900c@iki.fi
обсуждение исходный текст
Ответ на Re: Refactoring backend fork+exec code  (Andres Freund <andres@anarazel.de>)
Ответы Re: Refactoring backend fork+exec code  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
I updated this patch set, addressing some of the straightforward 
comments from Tristan and Andres, and did some more cleanups, commenting 
etc. Works on Windows now.

Replies to some of the individual comments below:

On 11/07/2023 00:07, Tristan Partin wrote:
>> @@ -4498,15 +4510,19 @@ postmaster_forkexec(int argc, char *argv[])
>>   * returns the pid of the fork/exec'd process, or -1 on failure
>>   */
>>  static pid_t
>> -backend_forkexec(Port *port)
>> +backend_forkexec(Port *port, CAC_state cac)
>>  {
>> -       char       *av[4];
>> +       char       *av[5];
>>         int                     ac = 0;
>> +       char            cacbuf[10];
>>
>>         av[ac++] = "postgres";
>>         av[ac++] = "--forkbackend";
>>         av[ac++] = NULL;                        /* filled in by internal_forkexec */
>>
>> +       snprintf(cacbuf, sizeof(cacbuf), "%d", (int) cac);
>> +       av[ac++] = cacbuf;
> 
> Might be worth a sanity check that there wasn't any truncation into
> cacbuf, which is an impossibility as the code is written, but still
> useful for catching a future developer error.
> 
> Is it worth adding a command line option at all instead of having the
> naked positional argument? It would help anybody who might read the
> command line what the seemingly random integer stands for.

+1. This gets refactored away in the last patch though. In the last 
patch, I used a child process name instead of an integer precisely 
because it looks nicer in "ps".

I wonder if we should add more command line arguments, just for 
informational purposes. Autovacuum worker process could display the 
database name it's connected to, for example. I don't know how important 
the command line is on Windows, is it displayed by tools that people 
care about?

On 11/07/2023 01:50, Andres Freund wrote:
> On 2023-06-18 14:22:33 +0300, Heikki Linnakangas wrote:
>>  From 0cb6f8d665980d30a5d2a29013000744f16bf813 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
>> Date: Sun, 18 Jun 2023 11:00:21 +0300
>> Subject: [PATCH 3/9] Refactor CreateSharedMemoryAndSemaphores.
>>
>> Moves InitProcess calls a little later in EXEC_BACKEND case.
> 
> What's the reason for this part? 

The point is that with this commit, InitProcess() is called at same 
place in EXEC_BACKEND mode and !EXEC_BACKEND. It feels more consistent 
that way.

> ISTM that we'd really want to get away from plastering duplicated
> InitProcess() etc everywhere.

Sure, we could do more to reduce the duplication. I think this is a step 
in the right direction, though.

>>  From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
>> Date: Sun, 18 Jun 2023 13:56:44 +0300
>> Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs
>>
>> - Move more of the work on a client socket to the child process.
>>
>> - Reduce the amount of data that needs to be passed from postmaster to
>>    child. (Used to pass a full Port struct, although most of the fields were
>>    empty. Now we pass the much slimmer ClientSocket.)
> 
> I think there might be extensions accessing Port. Not sure if it's worth
> worrying about, but ...

That's OK. Port still exists, it's just created a little later. It will 
be initialized by the time extensions might look at it.

>> +const        PMChildEntry entry_kinds[] = {
>> +    {"backend", BackendMain, true},
>> +
>> +    {"autovacuum launcher", AutoVacLauncherMain, true},
>> +    {"autovacuum worker", AutoVacWorkerMain, true},
>> +    {"bgworker", BackgroundWorkerMain, true},
>> +    {"syslogger", SysLoggerMain, false},
>> +
>> +    {"startup", StartupProcessMain, true},
>> +    {"bgwriter", BackgroundWriterMain, true},
>> +    {"archiver", PgArchiverMain, true},
>> +    {"checkpointer", CheckpointerMain, true},
>> +    {"wal_writer", WalWriterMain, true},
>> +    {"wal_receiver", WalReceiverMain, true},
>> +};
> 
> I'd assign them with the PostmasterChildType as index, so there's no danger of
> getting out of order.
> 
> const                PMChildEntry entry_kinds = {
>    [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true},
>    ...
> }
> 
> or such should work.

Nice, I didn't know about that syntax! Changed it that way.

> I'd also use designated initializers for the fields, it's otherwise hard to
> know what true means etc.

I think with one boolean and the struct declaration nearby, it's fine. 
If this becomes more complex in the future, with more fields, I agree.

> I think it might be good to put more into array. If we e.g. knew whether a
> particular child type is a backend-like, and aux process or syslogger, we
> could avoid the duplicated InitAuxiliaryProcess(),
> MemoryContextDelete(PostmasterContext) etc calls everywhere.

I agree we could do more refactoring here. I don't agree with adding 
more to this struct though. I'm trying to limit the code in 
launch_backend.c to hiding the differences between EXEC_BACKEND and 
!EXEC_BACKEND. In EXEC_BACKEND mode, it restores the child process to 
the same state as it is after fork() in !EXEC_BACKEND mode. Any other 
initialization steps belong elsewhere.

Some of the steps between InitPostmasterChild() and the *Main() 
functions could probably be moved around and refactored. I didn't think 
hard about that. I think that can be done separately as follow-up patch.

>> +/* Save critical backend variables into the BackendParameters struct */
>> +#ifndef WIN32
>> +static bool
>> +save_backend_variables(BackendParameters *param, ClientSocket *client_sock)
>> +#else
> 
> There's so much of this kind of thing. Could we hide it in a struct or such
> instead of needing ifdefs everywhere?

A lot of #ifdefs you mean? I agree launch_backend.c has a lot of those. 
I haven't come up with any good ideas on reducing them, unfortunately.

-- 
Heikki Linnakangas
Neon (https://neon.tech)
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Is this a problem in GenericXLogFinish()?