Re: Let's make PostgreSQL multi-threaded

From Andres Freund
Subject Re: Let's make PostgreSQL multi-threaded
Date
Msg-id 20230607220919.pilzcbqnp2rwfbl4@awork3.anarazel.de
In response to Re: Let's make PostgreSQL multi-threaded  (Greg Stark <stark@mit.edu>)
Responses Re: Let's make PostgreSQL multi-threaded  (Hannu Krosing <hannuk@google.com>)
Re: Let's make PostgreSQL multi-threaded  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
Hi,

On 2023-06-06 16:14:41 -0400, Greg Stark wrote:
> I think of processes and threads as fundamentally the same things,
> just a slightly different API -- namely that in one memory is by
> default unshared and needs to be explicitly shared and in the other
> it's default shared and needs to be explicitly unshared.

In theory that's true, in practice it's entirely wrong.

For one, the amount of complexity you need to deal with to share state across
processes, post fork, is *substantial*.  You can share file descriptors across
processes, but it's extremely platform dependent, requires cooperation between
both processes etc.  You can share memory allocations made after the processes
forked, but you're typically not going to be able to guarantee they're at the
same pointer values. Etc.
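
To illustrate the pointer-value problem: a minimal sketch, assuming a POSIX
system, of the usual workaround of storing offsets relative to the mapping base
instead of raw pointers. Each process maps the region separately after fork()
and converts offsets to pointers against its own base address. The names
region_header, region_ptr and offset_sharing_demo are hypothetical and exist
only for this example; this is not how PostgreSQL's own shared memory code is
written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 4096

/* Hypothetical layout: data is referenced by offset from the region start,
 * because the region may be mapped at a different address in each process. */
typedef struct {
    size_t msg_off;
} region_header;

/* Turn a region-relative offset into a pointer valid in *this* mapping. */
static void *region_ptr(void *base, size_t off)
{
    assert(off < REGION_SIZE);
    return (char *) base + off;
}

/* Returns 0 if the parent can read what the child wrote, -1 otherwise. */
int offset_sharing_demo(void)
{
    char path[] = "/tmp/offset_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives on only through fd */
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: map after the fork; the kernel picks the address. */
        void *base = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (base == MAP_FAILED)
            _exit(1);
        region_header *hdr = base;
        hdr->msg_off = sizeof(region_header);
        strcpy(region_ptr(base, hdr->msg_off), "hello from child");
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* Parent: a separate mapping, possibly at a different address. */
    void *base = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return -1;
    }
    region_header *hdr = base;
    int ok = hdr->msg_off < REGION_SIZE &&
             strcmp(region_ptr(base, hdr->msg_off), "hello from child") == 0;

    munmap(base, REGION_SIZE);
    close(fd);
    return ok ? 0 : -1;
}
```

Every access to such a region has to go through a helper like region_ptr(),
and any data structure placed in it has to be written with offsets in mind.
With threads, a plain pointer would do.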

But more importantly, there are crucial performance differences between threads
and processes. Having the same memory mapping between threads allows the
hardware to share the TLB (on x86 via process context identifiers), which
isn't realistically possible with different processes.


> However all else is not equal. The discussion in the hallway turned to
> whether we could just use pthread primitives like mutexes and
> condition variables instead of our own locks -- and the point was
> raised that those libraries assume these objects will be in threads of
> one process not shared across completely different processes.

Independent of threads vs processes, I am -many on using pthread mutexes and
condition variables. From experiments, that *loses* performance, and we lose
a lot of control and increase cross-platform behavioural differences.  I also
don't see any benefit in going in that direction.


> And that's probably not the only library we're stuck reimplementing
> because of this. So the question is are these things worth taking the
> risk of having data structures shared implicitly and having unclear
> ownership rules?
> 
> I was going to say supporting both modes relieves that fear since it
> would force that extra discipline and allow testing under the more
> restrictive rule. However I don't think that will actually work. As
> long as we support both modes we lose all the advantages of threads.

I don't think that has to be true. We could e.g. eventually decide that we
don't support parallel query without threading support - which would allow us
to get rid of a very significant amount of code and runtime overhead.

Greetings,

Andres Freund


