Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From knizhnik
Subject Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
Date
Msg-id 52CEF60F.9070206@garret.ru
In reply to Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
On 01/09/2014 09:22 PM, Robert Haas wrote:
> On Wed, Jan 8, 2014 at 2:39 PM, knizhnik <knizhnik@garret.ru> wrote:
>> I wonder what the intended use case of dynamic shared memory is.
>> Is it primarily oriented toward PostgreSQL extensions, or will it also be
>> used in PostgreSQL core?
> My main motivation is that I want to use it to support parallel query.
>   There is unfortunately quite a bit of work left to be done before we
> can make that a reality, but that's the goal.

I do not want to waste your time, but this topic is very interesting to
me, and I would be very grateful if you could say a few words about how
DSM can help to implement parallel query processing.
It seems to me that the main complexity is in the optimizer: it needs to
split the query plan into several subplans which can be executed
concurrently, and then merge their partial results.
As far as I understand, it is not possible to use multithreading for
parallel query execution because most of the PostgreSQL code is
non-reentrant. So we need to execute these subplans in several processes.
And unlike threads, the only efficient way of exchanging data between
processes is shared memory. So it is clear why we need shared memory
for parallel query execution. But why does it have to be dynamic? Why can
it not be preallocated at start time, like most of the other resources
used by PostgreSQL?
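
To make the question concrete, here is a minimal sketch of the pattern described above: several processes each compute a partial result into shared memory, and the parent merges them. It is plain POSIX C, not PostgreSQL code; the function name parallel_sum and the use of an anonymous shared mapping created before fork() are assumptions made purely for illustration and say nothing about how DSM itself is implemented.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical illustration: sum data[0..n) using nworkers child processes.
 * Each child stands in for one subplan and writes its partial sum into a
 * shared array; the parent merges the partial sums.
 * Returns 0 on success, -1 on failure.
 */
int
parallel_sum(const int *data, int n, int nworkers, long *result)
{
    long   *partial;
    size_t  size;
    int     started = 0;
    int     ok = 1;

    if (nworkers <= 0 || n < 0)
        return -1;
    size = sizeof(long) * (size_t) nworkers;

    /* Shared mapping created before fork(), so every child inherits it. */
    partial = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (partial == MAP_FAILED)
        return -1;

    for (int w = 0; w < nworkers; w++)
    {
        pid_t   pid = fork();

        if (pid < 0)
        {
            ok = 0;
            break;
        }
        if (pid == 0)
        {
            /* Child: the input is visible through the copy made by fork(). */
            int     lo = (int) ((long) n * w / nworkers);
            int     hi = (int) ((long) n * (w + 1) / nworkers);
            long    sum = 0;

            for (int i = lo; i < hi; i++)
                sum += data[i];
            partial[w] = sum;   /* the only channel back to the parent */
            _exit(0);
        }
        started++;
    }

    /* Wait for every child that was started. */
    for (int w = 0; w < started; w++)
    {
        int     status;

        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    /* Merge step: combine the partial results. */
    if (ok)
    {
        long    total = 0;

        for (int w = 0; w < nworkers; w++)
            total += partial[w];
        *result = total;
    }

    munmap(partial, size);
    return ok ? 0 : -1;
}
```

Note that in this sketch the size of the shared area depends on nworkers, which is only known when the call is made.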

>
>> Maybe I am wrong, but I do not see any reason for the same extension to
>> create multiple DSM segments.
> Right.
>
>> And the total number of DSM segments is expected to be not very large (<10).
>> The same is true for the synchronization primitives (LWLocks, for example)
>> needed to synchronize access to these DSM segments. So I am not sure that the
>> ability to place locks in DSM is really so critical...
>> We could just reserve some space for LWLocks to be used by extensions,
>> so that LWLockAssign() can be used without RequestAddinLWLocks, or so that
>> RequestAddinLWLocks can be used not only from preloaded extensions.
> If you're doing all of this at postmaster startup time, that all works
> fine.  If you want to be able to load up an extension on the fly, then
> it doesn't.  You can only RequestAddinLWLocks() at postmaster start
> time, not afterwards, so currently any extension that wants to use
> lwlocks has to be loaded at postmaster startup time, or you're out of
> luck.
>
> Well.  Technically we reserve something like 3 extra lwlocks that
> could be assigned later.  But relying on those to be available is not
> very reliable, and also, 3 is not very many, considering that we have
> something north of 32k core lwlocks in the default configuration.

3 is definitely too small.
But you agreed with me that the number of DSM segments will not be very large.
And if we do not need fine-grained locking (and IMHO it is not needed for
most extensions), then we need just a few locks (most likely one) per DSM
segment.
It means that if instead of 3 we reserve, let's say, 30 LWLocks, then it
will be enough for most extensions. And there will be almost no extra
resource overhead because, as you wrote, PostgreSQL has 32k locks in
the default configuration.
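
A rough sketch of what such a reserved pool could look like, again in plain C rather than PostgreSQL code. The names LockPool, pool_init and pool_assign are hypothetical, and the sketch leaves out the serialization between backends that a real shared pool would need:

```c
#include <stddef.h>

#define NUM_RESERVED_LOCKS 30   /* the figure suggested above, not a real setting */

/* Hypothetical stand-in for one reserved lock slot. */
typedef struct ReservedLock
{
    int     id;
} ReservedLock;

/* Hypothetical fixed-size pool, sized once at startup. */
typedef struct LockPool
{
    int             next_free;
    ReservedLock    locks[NUM_RESERVED_LOCKS];
} LockPool;

void
pool_init(LockPool *pool)
{
    pool->next_free = 0;
    for (int i = 0; i < NUM_RESERVED_LOCKS; i++)
        pool->locks[i].id = i;
}

/*
 * Hand out the next unused slot, or NULL once the reserve is exhausted.
 * An extension loaded on the fly would call this instead of requesting
 * locks at startup.
 */
ReservedLock *
pool_assign(LockPool *pool)
{
    if (pool->next_free >= NUM_RESERVED_LOCKS)
        return NULL;
    return &pool->locks[pool->next_free++];
}
```

With one lock per extension, such a pool serves 30 extensions before pool_assign() starts returning NULL; with a reserve of 3 it would be exhausted after three.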

Certainly, if we need an independent lock for each page of DSM memory, then
there will be no other choice except placing the locks in the DSM segment
itself. But once again, I do not think that most extensions needing
shared memory will use such fine-grained locking.
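
For comparison, here is a minimal sketch of the fine-grained case, where every page carries its own lock inside the shared segment. It uses a C11 atomic_flag as a simple spinlock and an anonymous shared mapping; SharedPage, segment_create and hammer_pages are hypothetical names, and none of this reflects how LWLocks are actually implemented.

```c
#define _DEFAULT_SOURCE
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical page layout: the lock lives in the segment, next to the data. */
typedef struct SharedPage
{
    atomic_flag lock;
    long        counter;    /* stand-in for the page contents */
} SharedPage;

/* Create a shared segment of npages pages; returns NULL on failure. */
SharedPage *
segment_create(int npages)
{
    SharedPage *pages;

    if (npages <= 0)
        return NULL;
    pages = mmap(NULL, sizeof(SharedPage) * (size_t) npages,
                 PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (pages == MAP_FAILED)
        return NULL;
    for (int i = 0; i < npages; i++)
    {
        atomic_flag_clear(&pages[i].lock);
        pages[i].counter = 0;
    }
    return pages;
}

static void
page_lock(SharedPage *page)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&page->lock, memory_order_acquire))
        ;
}

static void
page_unlock(SharedPage *page)
{
    atomic_flag_clear_explicit(&page->lock, memory_order_release);
}

/*
 * Fork nworkers processes; each increments every page's counter iters
 * times, taking only that page's lock.  Returns 0 on success, -1 on failure.
 */
int
hammer_pages(SharedPage *pages, int npages, int nworkers, int iters)
{
    int     started = 0;
    int     ok = 1;

    for (int w = 0; w < nworkers; w++)
    {
        pid_t   pid = fork();

        if (pid < 0)
        {
            ok = 0;
            break;
        }
        if (pid == 0)
        {
            for (int i = 0; i < iters; i++)
                for (int p = 0; p < npages; p++)
                {
                    page_lock(&pages[p]);
                    pages[p].counter++;
                    page_unlock(&pages[p]);
                }
            _exit(0);
        }
        started++;
    }

    for (int w = 0; w < started; w++)
    {
        int     status;

        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }
    return ok ? 0 : -1;
}
```

Here the number of locks grows with the number of pages, so a fixed reserve of any size could not cover it.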





