Re: Parallel threads in query

From Darafei "Komяpa" Praliaskouski
Subject Re: Parallel threads in query
Date
Msg-id CAC8Q8tKRMRTBSDqaD5NEsm7HtAX2F7B0YJsZOQt1pFiF8nzOPg@mail.gmail.com
In reply to Re: Parallel threads in query  (Andres Freund <andres@anarazel.de>)
Responses Re: Parallel threads in query  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

Because you said "faster than reasonable IPC" - which to me implies that
you don't do full blown IPC. Which using threads in a bgworker is very
strongly implying. What you're proposing strongly implies multiple
context switches just to process a few results. Even before, but
especially after, spectre that's an expensive proposition.


To have some idea of what it could be:

a)
PostGIS has the ST_ClusterKMeans window function. It collects all geometries passed to it into memory, repacks them into a more compact buffer, and runs a loop over that buffer several (let's say 10..100) times. Then it emits the assigned cluster number for each of the input rows.

It's all great when you need to calculate KMeans of 200-50000 rows, but for a million input rows even a hundred passes on a single core are painful.
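To make the shape of that loop concrete, here is a minimal sketch of one assignment pass with an OpenMP parallel loop. It is not the PostGIS code: `Point2D` and `assign_clusters` are hypothetical names, and plain 2D points stand in for the repacked geometry buffer. Built without `-fopenmp` the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical stand-in for the compact buffer; not a PostGIS type. */
typedef struct { double x, y; } Point2D;

/*
 * One assignment pass: label every point with the index of its nearest
 * center (squared Euclidean distance). Returns how many labels changed,
 * so the caller can stop iterating once the result is 0.
 */
long assign_clusters(const Point2D *pts, long n,
                     const Point2D *centers, int k, int *labels)
{
    long changed = 0;

    assert(k > 0);

    /* Each iteration touches only labels[i], so iterations are independent;
     * the per-thread change counts are combined by the reduction. */
    #pragma omp parallel for reduction(+:changed) schedule(static)
    for (long i = 0; i < n; i++) {
        int best = 0;
        double best_d = DBL_MAX;

        for (int c = 0; c < k; c++) {
            double dx = pts[i].x - centers[c].x;
            double dy = pts[i].y - centers[c].y;
            double d = dx * dx + dy * dy;

            if (d < best_d) {
                best_d = d;
                best = c;
            }
        }
        if (labels[i] != best) {
            labels[i] = best;
            changed++;
        }
    }
    return changed;
}
```

The center update between passes is left out; only the per-row assignment, which dominates for a million rows, is shown.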

b) 
PostGIS has the ST_Subdivide function. It takes a single row of geometry (usually very large, like a continent or the whole of Russia) and splits it into many rows with simpler shapes by performing a horizontal or vertical split recursively. Since it's a tree traversal, it can be parallelized efficiently, with one task following the left part of the geometry and another following the right part.
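A rough sketch of that traversal with OpenMP tasks follows. It is an illustration under assumptions, not ST_Subdivide itself: a piece is reduced to its vertex count, every split is assumed to halve it, and `subdivide_count`, `subdivide_parallel` and the 4096 cutoff are made-up names and values. Without `-fopenmp` the pragmas are ignored and the recursion runs serially.

```c
#include <assert.h>

/*
 * Count the pieces produced by recursively halving a geometry until each
 * piece has at most max_vertices vertices. The geometry is modelled only
 * by its vertex count (hypothetical simplification).
 */
long subdivide_count(long nvertices, long max_vertices)
{
    long left = 0, right = 0;
    long half;

    assert(max_vertices > 0);

    if (nvertices <= max_vertices)
        return 1;               /* small enough: emit as one row */

    half = nvertices / 2;

    /* One task per half; the if() clause stops deferring tasks for small
     * pieces, where scheduling would cost more than the work itself. */
    #pragma omp task shared(left) if(nvertices > 4096)
    left = subdivide_count(half, max_vertices);

    #pragma omp task shared(right) if(nvertices > 4096)
    right = subdivide_count(nvertices - half, max_vertices);

    /* Both halves must finish before their results are combined. */
    #pragma omp taskwait
    return left + right;
}

/* Entry point: one thread starts the recursion, the team runs the tasks. */
long subdivide_parallel(long nvertices, long max_vertices)
{
    long pieces = 0;

    #pragma omp parallel
    #pragma omp single
    pieces = subdivide_count(nvertices, max_vertices);

    return pieces;
}
```

For example, `subdivide_parallel(1000, 256)` splits 1000 into 500 + 500 and then into four pieces of 250, so it returns 4.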

Both look like standard use cases for OpenMP, which has compiler support in GCC, Clang and MSVC. For an overview of how it works, have a look here:

So, do I understand correctly that, for each thread I launch, I need to start a parallel worker that does nothing, just so that the thread counts against the parallel worker limit?
--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
