Re: Let's make PostgreSQL multi-threaded

From: Greg Stark
Subject: Re: Let's make PostgreSQL multi-threaded
Msg-id: CAM-w4HPne2ab_ppKO6xSY+gyrczMu7CnFzggP+4mXqD1ctjh-A@mail.gmail.com
In response to: Let's make PostgreSQL multi-threaded  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses: Re: Let's make PostgreSQL multi-threaded  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Mon, 5 Jun 2023 at 10:52, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.
>
> The purpose of this email is to make that silent consensus explicit. If
> you have objections to switching from the current multi-process
> architecture to a single-process, multi-threaded architecture, please
> speak up.

I suppose I should reiterate the comments I gave at the time. I'm
not sure they qualify as "objections", but they are at least a general
concern.

I think of processes and threads as fundamentally the same thing,
just with a slightly different API -- namely that with processes memory
is unshared by default and needs to be explicitly shared, and with
threads it's shared by default and needs to be explicitly unshared.
There are obvious practical API differences too, like how signals are
handled, but those are just implementation details.
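
To make the difference in defaults concrete, here is a minimal sketch in
Python (not PostgreSQL code; the names counter, bump, run_in_thread and
run_in_child_process are made up for illustration, and os.fork limits it
to POSIX systems). The same function mutates the same global: run in a
thread, the change is visible to the caller; run in a forked child, it
is not.

```python
import os
import threading

# Ordinary global state; nothing here asks for it to be shared.
counter = {"value": 0}


def bump():
    """Increment the global counter in whatever context we run in."""
    counter["value"] += 1


def run_in_thread():
    """Run bump() in a thread and return the value the caller sees afterwards."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def run_in_child_process():
    """Run bump() in a forked child and return the value the parent sees afterwards."""
    pid = os.fork()  # POSIX only
    if pid == 0:
        bump()       # modifies the child's private copy of the memory
        os._exit(0)  # leave immediately, skipping the parent's cleanup code
    os.waitpid(pid, 0)
    return counter["value"]
```

Calling run_in_thread() changes the counter the caller sees, while
run_in_child_process() leaves it untouched, because the child only
modified its own copy.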

So the question is whether defaulting to shared memory or defaulting
to unshared memory is better -- and whether the implementation details
are significant enough to override that.

And my general concern was that, in my experience, shared-by-default
memory leads to hugely complex and chaotic shared data structures, often
with very loose rules for who owns the shared data and who is responsible
for making updates, handling errors, or releasing resources.

So, all else being equal, I feel that having good infrastructure for
explicitly allocating shared memory segments and managing them is
superior.
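
As a rough illustration of what "explicitly allocating" shared memory
looks like between processes, here is a small Python sketch (again not
PostgreSQL code; child_writes_parent_reads is a hypothetical helper, and
it assumes a Unix system, where an anonymous mmap is shared across
fork). Only the one region that was deliberately mapped is visible to
both processes; everything else stays private.

```python
import mmap
import os
import struct


def child_writes_parent_reads(value):
    """Have a forked child write an integer into an explicitly shared
    region, then read it back in the parent."""
    # Anonymous mapping; on Unix this is MAP_SHARED by default, so it
    # stays shared between parent and child after fork().
    region = mmap.mmap(-1, 8)
    try:
        pid = os.fork()  # POSIX only
        if pid == 0:
            struct.pack_into("q", region, 0, value)  # child writes
            os._exit(0)
        os.waitpid(pid, 0)
        return struct.unpack_from("q", region, 0)[0]  # parent reads
    finally:
        region.close()
```

The ownership here is easy to state: the parent creates and releases the
region, and the child is only handed those eight bytes to write into.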

However, all else is not equal. The discussion in the hallway turned to
whether we could just use pthread primitives like mutexes and
condition variables instead of our own locks -- and the point was
raised that those libraries assume these objects will be used by threads
of one process, not shared across completely different processes.

And that's probably not the only library we're stuck reimplementing
because of this. So the question is: are these things worth the risk
of having data structures shared implicitly, with unclear ownership
rules?

I was going to say that supporting both modes would relieve that fear,
since it would force that extra discipline and allow testing under the
more restrictive rule. However, I don't think that will actually work.
As long as we support both modes we lose all the advantages of threads.
We still wouldn't be able to use pthreads and would still need to
provide and maintain our homegrown replacement infrastructure.




-- 
greg


