Discussion: Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

David Fetter

Date:

04 January 2014, 02:21:26

On Thu, Jan 02, 2014 at 08:48:24PM +0400, knizhnik wrote:
> I want to announce implementation of In-Memory Columnar Store
> extension for PostgreSQL.
> Vertical representation of data is stored in PostgreSQL shared memory.

Thanks for the hard work!

I noticed a couple of things about this that probably need some
improvement.

1.  There are unexplained patches against other parts of PostgreSQL,
which means that they may break other parts of PostgreSQL in equally
inexplicable ways.  Please rearrange the patch so it doesn't require
this.  This leads to:

2.  The add-on is not formatted as an EXTENSION, which would allow
people to add it or remove it cleanly.

Would you be so kind as to fix these?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

04 January 2014, 10:46:41

Hi David,

Sorry, but I do not completely understand your suggestions:

1. IMCS contains just one patch file, sysv_shmem.patch.
Applying this patch is not mandatory for using IMCS: it only solves the
problem of supporting more than 256 GB of shared memory.
Right now PostgreSQL is not able to use more than 256 GB of shared buffers
on Linux with standard 4 kB pages.
I found a proposal for using the MAP_HUGETLB flag in the commitfest:

http://www.postgresql.org/message-id/20131125032920.GA23793@toroid.org

but unfortunately it was rejected. Huge pages are used intensively by
Oracle, and I think they would be useful for improving the performance of
PostgreSQL too. So it is not just IMCS that can benefit from this patch. My
patch is much simpler - I deliberately limited its scope to one file.
Certainly, switching huge TLB on/off should be done through the
postgresql.conf configuration file.

In any case, IMCS can be used without this patch: you just cannot
use more than 256 GB of memory, even if your system has more RAM.

2. I do not understand "The add-on is not formatted as an EXTENSION".
IMCS was created as a standard extension - I just looked at the examples of
other extensions included in the PostgreSQL distribution
(for example pg_stat_statements). It can be added using the "create
extension imcs" command and removed using "drop extension imcs".

If there are some violations of the PostgreSQL extension rules, please let
me know and I will fix them.
But I thought that I had done everything the proper way.
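
To put the page sizes from point 1 in perspective, here is a small arithmetic sketch. It only counts how many pages are needed to back a 256 GB region for two page sizes. The 2 MB huge page size is an assumption (a common x86-64 default) that is not stated in this thread, and the sketch makes no claim about where the 256 GB limit itself comes from.

```python
# Count the pages needed to back a shared memory region of a given size.
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Return the number of pages required, rounding up."""
    return -(-region_bytes // page_bytes)


region = 256 * GB
standard_pages = pages_needed(region, 4 * KB)  # standard 4 kB pages
huge_pages = pages_needed(region, 2 * MB)      # assumed 2 MB huge pages

print(standard_pages)  # 67108864
print(huge_pages)      # 131072
```

With the assumed 2 MB pages the same region is described by 512 times fewer page entries, which is the usual motivation for huge pages on very large mappings.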


Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

David Fetter

Date:

04 January 2014, 11:05:16

I'm sorry I misunderstood about the extension you wrote.

Is there some way not to use shared memory for it?

Cheers,
David.


Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

04 January 2014, 11:11:49

On 01/04/2014 12:05 PM, David Fetter wrote:
> I'm sorry I misunderstood about the extension you wrote.
>
> Is there some way not to use shared memory for it?

No, IMCS ("In-Memory Columnar Store") stores its data in shared memory.
Certainly I could allocate shared memory myself, but for reasons of
portability and ease of maintenance I decided to reuse the PostgreSQL
shared memory mechanism. The only requirement is that the IMCS extension
(like the pg_stat_statements extension) should be included in the
"shared_preload_libraries" list in postgresql.conf.

IMCS memory is not interleaved in any way with the shared memory used for
PostgreSQL shared buffers.
And the only limitation is this 256 GB limit on Linux, which can be resolved
using the patch included in the IMCS distribution.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Tom Lane

Date:

04 January 2014, 22:11:48

knizhnik <knizhnik@garret.ru> writes:
> On 01/04/2014 12:05 PM, David Fetter wrote:
>> Is there some way not to use shared memory for it?

> No, IMCS ("In-Memory Columnar Store") is storing data in shared memory.

It would probably be better if it made use of the dynamic shared memory
features that exist in HEAD.
        regards, tom lane

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

04 January 2014, 23:27:31

On 01/04/2014 11:11 PM, Tom Lane wrote:
> knizhnik <knizhnik@garret.ru> writes:
>> On 01/04/2014 12:05 PM, David Fetter wrote:
>>> Is there some way not to use shared memory for it?
>> No, IMCS ("In-Memory Columnar Store") is storing data in shared memory.
> It would probably be better if it made use of the dynamic shared memory
> features that exist in HEAD.
>
>             regards, tom lane

Thank you, I will try it.
But I have some concerns:

1. I want IMCS to work with PostgreSQL versions that do not support DSM
(dynamic shared memory), like 9.2, 9.3.1,...

2. IMCS uses the PostgreSQL hash table implementation (ShmemInitHash,
hash_search,...).
Maybe I missed something - I have only just noticed DSM and have not had a
chance to investigate it, but it looks like a hash table cannot be
allocated in DSM...

3. IMCS allocates memory using ShmemAlloc. With DSM I would have to
provide my own allocator (although creating a non-releasing memory
allocator should not be a big issue).

4. The current implementation of DSM still suffers from the 256 GB problem.
Certainly I can create multiple segments and so provide a workaround
without using huge pages, but that complicates the allocator.

5. I wonder: if I dynamically add a new DSM segment, will it be available
to other PostgreSQL processes? For example, I run a query which loads data
into IMCS, needs more space, and so allocates a new DSM segment. Then
another query is executed by another PostgreSQL process which tries to
access this data. This process was not forked from the process that created
the new DSM segment, so I do not understand how the segment will be
mapped into the address space of this process while preserving its address...
Certainly I can prohibit dynamic extension of IMCS storage (hoping that
in this case there will be no such problem with DSM). But then
we lose the main advantage of using DSM over the old scheme of
the plugin's private shared memory.

6. IMCS has some configuration parameters which have to be set through
postgresql.conf, so in any case the user has to edit the postgresql.conf file.
With DSM it will not be necessary to add IMCS to the
shared_preload_libraries list. But I do not think that this is such a
restrictive and critical requirement, is it?
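
Point 3 above mentions a non-releasing allocator. A minimal sketch of that idea follows, assuming a fixed-size region and returning offsets rather than pointers. The class name `BumpAllocator` and its methods are hypothetical; they are not part of IMCS or PostgreSQL.

```python
class BumpAllocator:
    """Non-releasing allocator: hands out offsets into a fixed-size region."""

    def __init__(self, size: int, alignment: int = 8) -> None:
        self.size = size
        self.alignment = alignment
        self.next_free = 0  # offset of the first unused byte

    def alloc(self, nbytes: int) -> int:
        """Return the offset of a new block, or raise MemoryError if full."""
        if nbytes <= 0:
            raise ValueError("allocation size must be positive")
        # Round the start offset up to the next alignment boundary.
        start = -(-self.next_free // self.alignment) * self.alignment
        end = start + nbytes
        if end > self.size:
            raise MemoryError("region exhausted")
        self.next_free = end
        return start


region = bytearray(64)                 # stands in for a shared memory region
allocator = BumpAllocator(len(region))
first = allocator.alloc(10)            # offset 0
second = allocator.alloc(4)            # offset 16 (10 rounded up to 8-byte alignment)
region[first:first + 10] = b"0123456789"
region[second:second + 4] = b"abcd"
```

Because nothing is ever freed, the allocator needs no free lists or per-block headers, which is why it is simple to write on top of any raw region.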

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

05 January 2014, 19:51:00

On Sat, Jan 4, 2014 at 3:27 PM, knizhnik <knizhnik@garret.ru> wrote:
> 1. I want IMCS to work with PostgreSQL versions not supporting DSM (dynamic
> shared memory), like 9.2, 9.3.1,...

Yeah.  If it's loaded at postmaster start time, then it can work with
any version.  On 9.4+, you could possibly make it work even if it's
loaded on the fly by using the dynamic shared memory facilities.
However, there are currently some limitations to those facilities that
make some things you might want to do tricky.  There are pending
patches to lift some of these limitations.

> 2. IMCS uses the PostgreSQL hash table implementation (ShmemInitHash,
> hash_search,...).
> Maybe I missed something - I have only just noticed DSM and have not had a
> chance to investigate it, but it looks like a hash table cannot be allocated
> in DSM...

It wouldn't be very difficult to write an analog of ShmemInitHash() on
top of the dsm_toc patch that is currently pending.  A problem,
though, is that it's not currently possible to put LWLocks in dynamic
shared memory, and even spinlocks will be problematic if
--disable-spinlocks is used.  I'm due to write a post about these
problems; perhaps I should go do that.

> 3. IMCS is allocating memory using ShmemAlloc. In case of using DSM I have
> to provide own allocator (although creation of non-releasing memory
> allocator should not be a big issue).

The dsm_toc infrastructure would solve this problem.

> 4. Current implementation of DSM still suffers from 256Gb problem. Certainly
> I can create multiple segments and so provide workaround without using huge
> pages, but it complicates allocator.

So it sounds like DSM should also support huge pages somehow.  I'm not
sure what that should look like.

> 5. I wonder: if I dynamically add a new DSM segment, will it be available to
> other PostgreSQL processes? For example, I run a query which loads data into
> IMCS, needs more space, and so allocates a new DSM segment. Then another query
> is executed by another PostgreSQL process which tries to access this data.
> This process was not forked from the process that created the new DSM segment,
> so I do not understand how the segment will be mapped into the address space
> of this process while preserving its address... Certainly I can prohibit
> dynamic extension of IMCS storage (hoping that in this case there will be no
> such problem with DSM). But then we lose the main advantage of using DSM over
> the old scheme of the plugin's private shared memory.

You can definitely dynamically add a new DSM segment; that's the point
of making it *dynamic* shared memory.  What's a bit tricky as things
stand today is making sure that it sticks around.  The current model
is that the DSM segment is destroyed when the last process unmaps it.
It would be easy enough to lift that limitation on systems other than
Windows; we could just add a dsm_keep_until_shutdown() API or
something similar.  But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there.  The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.
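
The lifetime rule described above (a segment exists only while at least one process has it mapped) can be modelled as simple reference counting. In the sketch below, the `Segment` class and the "keeper" attachment are hypothetical. They only illustrate why a long-lived process holding a mapping keeps a segment alive, and they do not model any real PostgreSQL or Windows API.

```python
class Segment:
    """Toy model: a segment is destroyed when its last mapping goes away."""

    def __init__(self) -> None:
        self.mappers = set()
        self.destroyed = False

    def attach(self, process: str) -> None:
        if self.destroyed:
            raise RuntimeError("segment no longer exists")
        self.mappers.add(process)

    def detach(self, process: str) -> None:
        self.mappers.discard(process)
        if not self.mappers:
            self.destroyed = True


# Without a long-lived holder, the segment vanishes between two queries.
seg = Segment()
seg.attach("backend-1")
seg.detach("backend-1")
gone = seg.destroyed            # True: the last mapper has detached

# With a long-lived holder attached, the segment survives.
kept = Segment()
kept.attach("keeper")
kept.attach("backend-1")
kept.detach("backend-1")
survives = not kept.destroyed   # True: "keeper" still has it mapped
```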

> 6. IMCS has some configuration parameters which have to be set through
> postgresql.conf, so in any case the user has to edit the postgresql.conf file.
> With DSM it will not be necessary to add IMCS to the
> shared_preload_libraries list. But I do not think that this is such a
> restrictive and critical requirement, is it?

I don't really see a problem here.  One of the purposes of dynamic
shared memory (and dynamic background workers) is precisely that you
don't *necessarily* need to put extensions that use shared memory in
shared_preload_libraries - or in other words, you can add the
extension to a running server without restarting it.  If you know in
advance that you will want it, you probably still *want* to put it in
shared_preload_libraries, but part of the idea is that we can get away
from requiring that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

james

Date:

05 January 2014, 20:34:32

On 05/01/2014 16:50, Robert Haas wrote:
> But on Windows, segments are *automatically*
> destroyed *by the operating system* when the last process unmaps them,
> so it's not quite so clear to me how we can allow it there.  The main
> shared memory segment is no problem because the postmaster always has
> it mapped, even if no one else does, but that doesn't help for dynamic
> shared memory segments.

Surely you just need to DuplicateHandle into the parent process?  If you
want to (tidily) dispose of it at some time, then you'll need to tell the
postmaster that you have done so and what the handle is in its process,
but if you just want it to stick around, then you can just pass it up.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

05 January 2014, 21:02:53

On Sun, Jan 5, 2014 at 12:34 PM, james <james@mansionfamily.plus.com> wrote:
> On 05/01/2014 16:50, Robert Haas wrote:
>
>  But on Windows, segments are *automatically*
> destroyed *by the operating system* when the last process unmaps them,
> so it's not quite so clear to me how we can allow it there.  The main
> shared memory segment is no problem because the postmaster always has
> it mapped, even if no one else does, but that doesn't help for dynamic
> shared memory segments.
>
> Surely you just need to DuplicateHandle into the parent process?  If you
> want to (tidily) dispose of it at some time, then you'll need to tell the
> postmaster that you have done so and what the handle is in its process,
> but if you just want it to stick around, then you can just pass it up.

Uh, I don't know, maybe?  Does the postmaster have to do something to
receive the duplicated handle, or can the child just throw it over the
wall to the parent and let it rot until the postmaster finally exits?
The latter would be nicer for our purposes, perhaps, as running more
code from within the postmaster is risky for us.  If a regular backend
process dies, the postmaster will restart everything and the database
will come back on line, but if the postmaster itself dies, we're hard
down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

05 January 2014, 21:28:37

From my point of view it is not a big problem that it is not possible
to place LWLocks in DSM.
I can allocate LWLocks in the standard way - using RequestAddinLWLocks - and
use them for synchronization.

Concerning support of huge pages - actually I do not think that it
should involve anything more than just setting the MAP_HUGETLB flag.
Allocation of the corresponding number of huge pages should be done by
the system administrator.

And what I still do not completely understand is how DSM ensures that a
segment created by one PostgreSQL process will be mapped to the same
virtual memory address in all other PostgreSQL processes.
As far as I understand, right now (with standard PostgreSQL shared memory
segments) this is ensured by fork().
Shared memory segments are allocated in one process, and all other
processes are forked from this process, inheriting these memory segments.

But if a new DSM segment is allocated during execution of some query,
then we have to add it to the virtual address space of all PostgreSQL
processes. Even if we somehow notify them all about the presence of the new
segment, there is absolutely no guarantee that all of them can map this
segment at the specified memory address (it may for some reason already be
used by some other shared object).
Or maybe DSM doesn't guarantee that a DSM segment is mapped to the same
address in all processes?
In that case it significantly complicates DSM usage: it will not be
possible to use direct pointers.

Can you please clarify how dynamically allocated DSM segments will be
shared by all PostgreSQL processes?



Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

james

Date:

05 January 2014, 21:44:45

On 05/01/2014 18:02, Robert Haas wrote:
> On Sun, Jan 5, 2014 at 12:34 PM, james <james@mansionfamily.plus.com> wrote:
>> On 05/01/2014 16:50, Robert Haas wrote:
>>
>>  But on Windows, segments are *automatically*
>> destroyed *by the operating system* when the last process unmaps them,
>> so it's not quite so clear to me how we can allow it there.  The main
>> shared memory segment is no problem because the postmaster always has
>> it mapped, even if no one else does, but that doesn't help for dynamic
>> shared memory segments.
>>
>> Surely you just need to DuplicateHandle into the parent process?  If you
>> want to (tidily) dispose of it at some time, then you'll need to tell the
>> postmaster that you have done so and what the handle is in its process,
>> but if you just want it to stick around, then you can just pass it up.
> Uh, I don't know, maybe?  Does the postmaster have to do something to
> receive the duplicated handle

In principle, no, so long as the child has a handle to the parent process that
has the appropriate permissions.  Given that these processes have a parent/child
relationship that shouldn't be too hard to arrange.

> , or can the child just throw it over the
> wall to the parent and let it rot until the postmaster finally exits?

Yes.  Though it might be a good idea to record the handle somewhere (perhaps
in a table) so that any potential issues from an insane system spamming the
postmaster with handles are apparent.

I'm intrigued - how are the handles shared between children that are peers
in the current scheme?  Some handle transfer must already be in place.

Could you share the handles to an immortal worker if you want to reduce any
potential impact on the postmaster?

> The latter would be nicer for our purposes, perhaps, as running more
> code from within the postmaster is risky for us.  If a regular backend
> process dies, the postmaster will restart everything and the database
> will come back on line, but if the postmaster itself dies, we're hard
> down.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

06 January 2014, 06:11:37

On Sun, Jan 5, 2014 at 1:28 PM, knizhnik <knizhnik@garret.ru> wrote:
> From my point of view it is not a big problem that it is not possible to
> place LWLock in DSM.
> I can allocate LWLocks in standard way - using RequestAddinLWLocks and use
> them for synchronization.

Sure, well, that works fine if you're being loaded from
shared_preload_libraries.  If you want to be able to load the
extension after startup time, though, it's no good.

> And what I still do not completely understand is how DSM ensures that a
> segment created by one PostgreSQL process will be mapped to the same
> virtual memory address in all other PostgreSQL processes.

It doesn't.  One process calls dsm_create() to create a shared memory
segment.  Other processes call dsm_attach() to attach it.  There's no
guarantee that they'll map it at the same address; they'll just map it
somewhere.

> Or maybe DSM doesn't guarantee that a DSM segment is mapped to the same
> address in all processes?
> In that case it significantly complicates DSM usage: it will not be possible
> to use direct pointers.

Yeah, that's where we're at.
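
Because attached segments may land at different addresses, data structures inside a segment are commonly linked by offsets from the segment start instead of raw pointers. A minimal sketch of that idea follows, using two independent mappings of one temporary file to stand in for two processes. The record layout and helper names are assumptions made for illustration, not PostgreSQL code.

```python
import mmap
import struct
import tempfile

SEGMENT_SIZE = 4096
NODE = struct.Struct("<qq")  # (value, offset of next node; 0 means end of list)


def write_node(seg, offset, value, next_offset):
    """Store one list node at the given offset inside the segment."""
    NODE.pack_into(seg, offset, value, next_offset)


def read_list(seg, head_offset):
    """Follow next-offsets from head_offset and collect the values."""
    values = []
    offset = head_offset
    while offset != 0:
        value, offset = NODE.unpack_from(seg, offset)
        values.append(value)
    return values


with tempfile.TemporaryFile() as backing:
    backing.truncate(SEGMENT_SIZE)
    # Two separate mappings of the same bytes, like two attached processes.
    writer = mmap.mmap(backing.fileno(), SEGMENT_SIZE)
    reader = mmap.mmap(backing.fileno(), SEGMENT_SIZE)

    # Offset 0 is reserved as the "null" link, so nodes start at offset 16.
    write_node(writer, 16, 10, 32)
    write_node(writer, 32, 20, 48)
    write_node(writer, 48, 30, 0)

    result = read_list(reader, 16)
    writer.close()
    reader.close()

print(result)  # [10, 20, 30]
```

The reader never learns where the writer's mapping lives; every link is resolved relative to its own mapping, which is the price of giving up direct pointers.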

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

06 January 2014, 06:14:53

On Sun, Jan 5, 2014 at 1:44 PM, james <james@mansionfamily.plus.com> wrote:
> I'm intrigued - how are the handles shared between children that are peers
> in the current scheme?  Some handle transfer must already be in place.

That's up to the application.  After calling dsm_create(), you call
dsm_segment_handle() to get the 32-bit integer handle for that
segment.  Then you have to get that to the other process(es) somehow.
If you're trying to share a handle with a background worker, you can
stuff it in bgw_main_arg.  Otherwise, you'll probably need to store it
in the main shared memory segment, or a file, or whatever.

> Could you share the handles to an immortal worker if you want to reduce any
> potential impact on the postmaster?

You could, but this seems like thin justification for spawning another
process, and how immortal is that worker really?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
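
The handle-passing scheme described above (create a segment, obtain an integer handle, hand that integer to the peer, and let the peer reopen the same object) can be imitated with plain POSIX calls. This is a rough sketch under stated assumptions and not the dsm.c implementation: the /tmp file naming, and the functions seg_create, seg_attach, seg_destroy and demo_handle_roundtrip, are hypothetical stand-ins for dsm_create(), dsm_attach() and friends.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Derive the backing-file name from a handle (hypothetical naming scheme). */
static void
seg_path(uint32_t handle, char *buf, size_t len)
{
    snprintf(buf, len, "/tmp/dsm_sketch.%u", (unsigned) handle);
}

/* Creator side: create the object for this handle and map it. */
static void *
seg_create(uint32_t handle)
{
    char path[64];

    seg_path(handle, path, sizeof path);
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return NULL;            /* handle already in use, or I/O error */
    if (ftruncate(fd, SEG_SIZE) != 0)
    {
        close(fd);
        unlink(path);
        return NULL;
    }
    void *p = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}

/* Attacher side: knows nothing but the integer handle. */
static void *
seg_attach(uint32_t handle)
{
    char path[64];

    seg_path(handle, path, sizeof path);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* Remove the name; existing mappings stay usable until unmapped. */
static void
seg_destroy(uint32_t handle)
{
    char path[64];

    seg_path(handle, path, sizeof path);
    unlink(path);
}

/* Returns 1 if the attacher sees what the creator wrote, 0 or -1 otherwise. */
static int
demo_handle_roundtrip(void)
{
    const char *msg = "this message is from session-1";
    uint32_t handle = (uint32_t) getpid();
    void *creator = NULL;

    /* Retry with a different handle if the name is already taken. */
    for (int tries = 0; tries < 16 && creator == NULL; tries++)
        creator = seg_create(++handle);
    if (creator == NULL)
        return -1;
    strcpy((char *) creator, msg);

    /* In real use the handle would travel via shared memory, a file, etc. */
    void *peer = seg_attach(handle);
    int ok = peer != NULL && strcmp((char *) peer, msg) == 0;

    if (peer != NULL)
        munmap(peer, SEG_SIZE);
    munmap(creator, SEG_SIZE);
    seg_destroy(handle);
    return ok;
}
```

Only the integer crosses the process boundary here; each side opens the named object itself, which is why no descriptor or address needs to be transferred.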

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Amit Kapila

Date:

06 January 2014, 07:20:45

On Sun, Jan 5, 2014 at 11:04 PM, james <james@mansionfamily.plus.com> wrote:
> On 05/01/2014 16:50, Robert Haas wrote:
>
>  But on Windows, segments are *automatically*
> destroyed *by the operating system* when the last process unmaps them,
> so it's not quite so clear to me how we can allow it there.  The main
> shared memory segment is no problem because the postmaster always has
> it mapped, even if no one else does, but that doesn't help for dynamic
> shared memory segments.
>
> Surely you just need to DuplicateHandle into the parent process?
Ideally DuplicateHandle should work, but while going through the Windows
internals of shared memory functions at the link below, I observed that
they mention it will work for child processes.
http://msdn.microsoft.com/en-us/library/ms810613.aspx
Refer to the section "Inheriting and duplicating memory-mapped file
object handles".
 

>  If you
> want to (tidily) dispose of it at some time, then you'll need to tell the
> postmaster that you have done so and what the handle is in its process,
> but if you just want it to stick around, then you can just pass it up.

DuplicateHandle should work, but we need to communicate the handle
to the other process using IPC.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

james

Date:

06 January 2014, 23:59:28

On 06/01/2014 03:14, Robert Haas wrote:
> That's up to the application.  After calling dsm_create(), you call
> dsm_segment_handle() to get the 32-bit integer handle for that
> segment.  Then you have to get that to the other process(es) somehow.
> If you're trying to share a handle with a background worker, you can
> stuff it in bgw_main_arg.  Otherwise, you'll probably need to store it
> in the main shared memory segment, or a file, or whatever.
Well, that works for sysv shm, sure.  But I was interested (possibly
from Konstantin) how the handle transfer takes place at the moment,
particularly if it is possible to create additional segments dynamically.
I haven't looked at the extension at all.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

james

Date:

07 January 2014, 00:04:22

On 06/01/2014 04:20, Amit Kapila wrote:
> Duplicate handle should work, but we need to communicate the handle
> to other process using IPC.
Only if the other process needs to use it.  The IPC is not to transfer
the handle to the other process, just to tell it which slot in its handle
table contains the handle.  If you just want to ensure that its use-count
never goes to zero, the receiver does not need to know what the handle is.

However ...

The point remains that you need to duplicate it into every process that
might want to use it subsequently, so it makes sense to DuplicateHandle
into the parent, and then to advertise that handle value publicly so that
other child processes can DuplicateHandle it back into their own process.

The handle value can change so you also need to refer to the handle in the
parent and map it in each child to the local equivalent.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

07 January 2014, 00:16:44

On Mon, Jan 6, 2014 at 4:04 PM, james <james@mansionfamily.plus.com> wrote:
> The point remains that you need to duplicate it into every process that
> might
> want to use it subsequently, so it makes sense to DuplicateHandle into the
> parent, and then to advertise that  handle value publicly so that other
> child
> processes can DuplicateHandle it back into their own process.

Well, right now we just reopen the same object from all of the
processes, which seems to work fine and doesn't require any of this
complexity.  The only problem I don't know how to solve is how to make
a segment stick around for the whole postmaster lifetime.  If
duplicating the handle into the postmaster without its knowledge gets
us there, it may be worth considering, but that doesn't seem like a
good reason to rework the rest of the existing mechanism.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Amit Kapila

Date:

08 January 2014, 06:21:02

On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 6, 2014 at 4:04 PM, james <james@mansionfamily.plus.com> wrote:
>> The point remains that you need to duplicate it into every process that
>> might
>> want to use it subsequently, so it makes sense to DuplicateHandle into the
>> parent, and then to advertise that  handle value publicly so that other
>> child
>> processes can DuplicateHandle it back into their own process.
>
> Well, right now we just reopen the same object from all of the
> processes, which seems to work fine and doesn't require any of this
> complexity.  The only problem I don't know how to solve is how to make
> a segment stick around for the whole postmaster lifetime.  If
> duplicating the handle into the postmaster without its knowledge gets
> us there, it may be worth considering, but that doesn't seem like a
> good reason to rework the rest of the existing mechanism.

I think one has to try this to see if it works as per the need. If it's not
urgent, I can try this early next week?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

08 January 2014, 21:51:33

On Tue, Jan 7, 2014 at 10:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Jan 6, 2014 at 4:04 PM, james <james@mansionfamily.plus.com> wrote:
>>> The point remains that you need to duplicate it into every process that
>>> might
>>> want to use it subsequently, so it makes sense to DuplicateHandle into the
>>> parent, and then to advertise that  handle value publicly so that other
>>> child
>>> processes can DuplicateHandle it back into their own process.
>>
>> Well, right now we just reopen the same object from all of the
>> processes, which seems to work fine and doesn't require any of this
>> complexity.  The only problem I don't know how to solve is how to make
>> a segment stick around for the whole postmaster lifetime.  If
>> duplicating the handle into the postmaster without its knowledge gets
>> us there, it may be worth considering, but that doesn't seem like a
>> good reason to rework the rest of the existing mechanism.
>
> I think one has to try this to see if it works as per the need. If it's not
> urgent, I can try this early next week?

Anything we want to get into 9.4 has to be submitted by next Tuesday,
but I don't know that we're going to get this into 9.4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

08 January 2014, 22:39:19

On 01/08/2014 10:51 PM, Robert Haas wrote:
> On Tue, Jan 7, 2014 at 10:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Mon, Jan 6, 2014 at 4:04 PM, james <james@mansionfamily.plus.com> wrote:
>>>> The point remains that you need to duplicate it into every process that
>>>> might
>>>> want to use it subsequently, so it makes sense to DuplicateHandle into the
>>>> parent, and then to advertise that  handle value publicly so that other
>>>> child
>>>> processes can DuplicateHandle it back into their own process.
>>> Well, right now we just reopen the same object from all of the
>>> processes, which seems to work fine and doesn't require any of this
>>> complexity.  The only problem I don't know how to solve is how to make
>>> a segment stick around for the whole postmaster lifetime.  If
>>> duplicating the handle into the postmaster without its knowledge gets
>>> us there, it may be worth considering, but that doesn't seem like a
>>> good reason to rework the rest of the existing mechanism.
>> I think one has to try this to see if it works as per the need. If it's not
>> urgent, I can try this early next week?
> Anything we want to get into 9.4 has to be submitted by next Tuesday,
> but I don't know that we're going to get this into 9.4.
>
I wonder what the intended use case of dynamic shared memory is.
Is it primarily oriented toward PostgreSQL extensions, or will it also be
used in PostgreSQL core?
In the case of extensions, shared memory may be needed to store some
collected/calculated information which will be used by extension functions.

The main advantage of DSM (from my point of view) compared with the existing
mechanism of preloaded extensions is that it is not necessary to restart the
server to add a new extension requiring shared memory.
A DSM segment can be attached or created by the _PG_init function of the
loaded module.
But there is not much sense in this mechanism if the segment is deleted
when there are no more processes attached to it.
So to make DSM really useful for extensions, it needs some mechanism to
pin a segment in memory for the whole server/extension lifetime.

Maybe I am wrong, but I do not see any reason for the same extension to
create multiple DSM segments.
And the total number of DSM segments is expected to be not very large (<10).
The same is true for the synchronization primitives (LWLocks, for example)
needed to synchronize access to these DSM segments. So I am not sure if the
possibility to place locks in DSM is really so critical...
We could just reserve some space for LWLocks which can be used by
extensions, so that LWLockAssign() can be used without
RequestAddinLWLocks, or RequestAddinLWLocks can be used not only from
preloaded extensions.

IMHO the main trouble with DSM is the lack of a guarantee that a segment is
always mapped to the same virtual address.
Without such a guarantee it is not possible to use direct (normal)
pointers inside DSM.
But there seems to be no reasonable solution.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

09 January 2014, 20:22:43

On Wed, Jan 8, 2014 at 2:39 PM, knizhnik <knizhnik@garret.ru> wrote:
> I wonder what is the intended use case of dynamic shared memory?
> Is it primarily oriented on PostgreSQL extensions or it will be used also in
> PostgreSQL core?

My main motivation is that I want to use it to support parallel query.
There is unfortunately quite a bit of work left to be done before we
can make that a reality, but that's the goal.

> May be I am wrong, but I do not see some reasons for creating multiple DSM
> segments by the same extension.

Right.

> And total number of DSM segments is expected to be not very large (<10). The
> same is true for synchronization primitives (LWLocks for example) needed to
> synchronize access to this DSM segments. So I am not sure if possibility to
> place locks in DSM is really so critical...
> We can just reserved some space for LWLocks which can be used by extension,
> so that LWLockAssign() can be used without RequestAddinLWLocks or
> RequestAddinLWLocks can be used not only from preloaded extension.

If you're doing all of this at postmaster startup time, that all works
fine.  If you want to be able to load up an extension on the fly, then
it doesn't.  You can only RequestAddinLWLocks() at postmaster start
time, not afterwards, so currently any extension that wants to use
lwlocks has to be loaded at postmaster startup time, or you're out of
luck.

Well.  Technically we reserve something like 3 extra lwlocks that
could be assigned later.  But relying on those to be available is not
very reliable, and also, 3 is not very many, considering that we have
something north of 32k core lwlocks in the default configuration.

> IMHO the main trouble with DSM is lack of guarantee that segment is always
> mapped to the same virtual address.
> Without such guarantee it is not possible to use direct (normal) pointers
> inside DSM.
> But there seems to be no reasonable solution.

Yeah, that basically sucks.  But it's very hard to do any better.  At
least on a 64-bit platform, there's an awful lot of address space
available, and in theory it ought to be possible to find a portion of
that address space that isn't in use by any Postgres process and have
all of the backends map the shared memory segment there.  But there's
no portable way to do that, and it seems like it would require an
awful lot of IPC to achieve consensus on where to put a new mapping.

On non-Windows platforms, Noah had the idea that we could reserve a large
chunk of address space mapped as PROT_NONE and then overwrite it with
mappings later as needed.  However, I'm not sure how portable that is
or whether it'll cause performance consequences (like page table
bloat) if the space doesn't end up getting used (or if it does).  And
unless you have an awful lot of space available, it's hard to be sure
that new mappings are going to fit.  And then there's Windows.

It would be nice to have better operating system support for this.
For example, IIUC, 64-bit Linux has 128TB of address space available
for user processes.  When you clone(), it can either share the entire
address space (i.e. it's a thread) or none of it (i.e. it's a
process).  There's no option to, say, share 64TB and not the other
64TB, which would be ideal for us.  We could then map dynamic shared
memory segments into the shared portion of the address space and do
backend-private allocations in the unshared part.  Of course, even if
we had that, it wouldn't be portable, so who knows how much good it
would do.  But it would be awfully nice to have the option.

I haven't given up hope that we'll some day find a way to make
same-address mappings work, at least on some platforms.  But I don't
expect it to happen soon.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
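
The PROT_NONE idea mentioned above can be tried out on a POSIX system roughly as follows. This is only a sketch under assumptions, not proposed PostgreSQL code: it relies on MAP_ANONYMOUS, MAP_NORESERVE and MAP_FIXED (available on Linux, not guaranteed elsewhere), it says nothing about the page-table cost raised above, and the names reserve_address_space, map_into_reservation and demo_reservation are hypothetical. The parent reserves a range before fork(), so the child inherits the same hole; both then place a shared mapping at the same address inside it, and a raw pointer written by the child is valid in the parent.

```c
#define _DEFAULT_SOURCE         /* MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RESERVE_SIZE ((size_t) 64 * 1024 * 1024)

/* Claim a range of addresses without making it usable or committing memory. */
static void *
reserve_address_space(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Overwrite part of the reservation with a shared mapping of fd. */
static void *
map_into_reservation(void *reserved, size_t offset, int fd, size_t len)
{
    void *want = (char *) reserved + offset;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fd, 0);

    return got == MAP_FAILED ? NULL : got;
}

/* Returns 1 if a raw pointer stored by the child is usable in the parent. */
static int
demo_reservation(void)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    char *reserved = reserve_address_space(RESERVE_SIZE);
    FILE *f = tmpfile();

    if (reserved == NULL || f == NULL ||
        ftruncate(fileno(f), (off_t) page) != 0)
        return -1;

    pid_t pid = fork();         /* the child inherits the reservation */
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        char *seg = map_into_reservation(reserved, page, fileno(f), page);

        if (seg == NULL)
            _exit(1);
        char *text = seg + 64;
        strcpy(text, "hello");
        *(char **) seg = text;  /* raw pointer stored inside the segment */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Same offset in the same reservation gives the same address. */
    char *seg = map_into_reservation(reserved, page, fileno(f), page);
    if (seg == NULL)
        return -1;
    int ok = strcmp(*(char **) seg, "hello") == 0;

    munmap(reserved, RESERVE_SIZE);
    fclose(f);
    return ok;
}
```

The sketch sidesteps the hard parts discussed above: it picks the offset by hand instead of agreeing on it between processes, and it has no answer for Windows.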

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Claudio Freire

Date:

09 January 2014, 20:46:09

On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> It would be nice to have better operating system support for this.
> For example, IIUC, 64-bit Linux has 128TB of address space available
> for user processes.  When you clone(), it can either share the entire
> address space (i.e. it's a thread) or none of it (i.e. it's a
> process).  There's no option to, say, share 64TB and not the other
> 64TB, which would be ideal for us.  We could then map dynamic shared
> memory segments into the shared portion of the address space and do
> backend-private allocations in the unshared part.  Of course, even if
> we had that, it wouldn't be portable, so who knows how much good it
> would do.  But it would be awfully nice to have the option.

You can map a segment at fork time, and unmap it after forking. That
doesn't really use RAM, since it's supposed to be lazily allocated (it
can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
but I don't think that's portable).

That guarantees it's free.

Next, you can map shared memory at explicit addresses (Linux's mmap
has support for that, and I seem to recall Windows did too).

All you have to do, is some book-keeping in shared memory (so all
processes can coordinate new mappings).

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Amit Kapila

Date:

09 January 2014, 22:09:35

On Thu, Jan 9, 2014 at 12:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 7, 2014 at 10:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> Well, right now we just reopen the same object from all of the
>>> processes, which seems to work fine and doesn't require any of this
>>> complexity.  The only problem I don't know how to solve is how to make
>>> a segment stick around for the whole postmaster lifetime.  If
>>> duplicating the handle into the postmaster without its knowledge gets
>>> us there, it may be worth considering, but that doesn't seem like a
>>> good reason to rework the rest of the existing mechanism.
>>
>> I think one has to try this to see if it works as per the need. If it's not
>> urgent, I can try this early next week?
>
> Anything we want to get into 9.4 has to be submitted by next Tuesday,
> but I don't know that we're going to get this into 9.4.

Using DuplicateHandle(), we can make segment stick for Postmaster
lifetime. I have used below test (used dsm_demo module) to verify:
Session - 1
select dsm_demo_create('this message is from session-1');
 dsm_demo_create
-----------------
       827121111

Session - 2
-----------------
select dsm_demo_read(827121111);
       dsm_demo_read
----------------------------
 this message is from session-1
(1 row)

Session-1
\q

-- till here it will work without DuplicateHandle as well

Session -2
select dsm_demo_read(827121111);
       dsm_demo_read
----------------------------
 this message is from session-1
(1 row)

Session -2
\q

Session -3
select dsm_demo_read(827121111);
       dsm_demo_read
----------------------------
 this message is from session-1
(1 row)

-- above shows that handle stays around.

Note -
Currently I have to bypass the code below in dsm_attach(), as it assumes
the segment will not stay if it's removed from the control file.

/*
 * If we didn't find the handle we're looking for in the control
 * segment, it probably means that everyone else who had it mapped,
 * including the original creator, died before we got to this point.
 * It's up to the caller to decide what to do about that.
 */
if (seg->control_slot == INVALID_CONTROL_SLOT)
{
    dsm_detach(seg);
    return NULL;
}


Could you let me know what exactly you are expecting in patch,
just a call to DuplicateHandle() after CreateFileMapping() or something
else as well?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

09 January 2014, 22:18:56

On 01/09/2014 09:22 PM, Robert Haas wrote:
> On Wed, Jan 8, 2014 at 2:39 PM, knizhnik <knizhnik@garret.ru> wrote:
>> I wonder what is the intended use case of dynamic shared memory?
>> Is it primarily oriented on PostgreSQL extensions or it will be used also in
>> PostgreSQL core?
> My main motivation is that I want to use it to support parallel query.
>   There is unfortunately quite a bit of work left to be done before we
> can make that a reality, but that's the goal.

I do not want to waste your time, but this topic is very interesting to
me, and I would be very pleased if you could say a few words about how DSM
can help to implement parallel query processing.
It seems to me that the main complexity is in the optimizer - it needs to
split the query plan into several subplans which can be executed
concurrently and then merge their partial results.
As far as I understand, it is not possible to use multithreading for
parallel query execution because most of the PostgreSQL code is
non-reentrant. So we need to execute these subplans in several processes.
And unlike threads, the only efficient way of exchanging data between
processes is shared memory. So it is clear why we need shared memory
for parallel query execution. But why does it have to be dynamic? Why can
it not be preallocated at start time like most other resources used by
PostgreSQL?

>
>> May be I am wrong, but I do not see some reasons for creating multiple DSM
>> segments by the same extension.
> Right.
>
>> And total number of DSM segments is expected to be not very large (<10). The
>> same is true for synchronization primitives (LWLocks for example) needed to
>> synchronize access to this DSM segments. So I am not sure if possibility to
>> place locks in DSM is really so critical...
>> We can just reserved some space for LWLocks which can be used by extension,
>> so that LWLockAssign() can be used without RequestAddinLWLocks or
>> RequestAddinLWLocks can be used not only from preloaded extension.
> If you're doing all of this at postmaster startup time, that all works
> fine.  If you want to be able to load up an extension on the fly, then
> it doesn't.  You can only RequestAddinLWLocks() at postmaster start
> time, not afterwards, so currently any extension that wants to use
> lwlocks has to be loaded at postmaster startup time, or you're out of
> luck.
>
> Well.  Technically we reserve something like 3 extra lwlocks that
> could be assigned later.  But relying on those to be available is not
> very reliable, and also, 3 is not very many, considering that we have
> something north of 32k core lwlocks in the default configuration.

3 is definitely too small.
But you agreed with me that the number of DSM segments will not be very large.
And if we do not need fine-grained locking (and IMHO it is not needed for
most extensions), then we need just a few locks (most likely one) per DSM
segment.
It means that if instead of 3 we reserve, let's say, 30 LWLocks, then it
will be enough for most extensions. And there will be almost no extra
resource overhead, because as you wrote PostgreSQL has 32k locks in the
default configuration.

Certainly, if we need an independent lock for each page of DSM memory, then
there will be no other choice except placing the locks in the DSM segment
itself. But once again - I do not think that most extensions needing
shared memory will use such fine-grained locking.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

09 January 2014, 22:25:17

On 01/09/2014 09:46 PM, Claudio Freire wrote:
> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> It would be nice to have better operating system support for this.
>> For example, IIUC, 64-bit Linux has 128TB of address space available
>> for user processes.  When you clone(), it can either share the entire
>> address space (i.e. it's a thread) or none of it (i.e. it's a
>> process).  There's no option to, say, share 64TB and not the other
>> 64TB, which would be ideal for us.  We could then map dynamic shared
>> memory segments into the shared portion of the address space and do
>> backend-private allocations in the unshared part.  Of course, even if
>> we had that, it wouldn't be portable, so who knows how much good it
>> would do.  But it would be awfully nice to have the option.
> You can map a segment at fork time, and unmap it after forking. That
> doesn't really use RAM, since it's supposed to be lazily allocated (it
> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
> but I don't think that's portable).
>
> That guarantees it's free.
>
> Next, you can map shared memory at explicit addresses (linux's mmap
> has support for that, and I seem to recall Windows did too).
>
> All you have to do, is some book-keeping in shared memory (so all
> processes can coordinate new mappings).
As far as I understand, the main advantage of DSM is that a segment can be
allocated at any time - not only at fork time.
And it is not because of memory consumption: even without unmap,
allocating some memory region doesn't cause a loss of physical memory.
And there is usually no problem with exhaustion of virtual space on
64-bit architectures. Moreover, using some combination of flags (such as
MAP_NORESERVE), it is usually possible to completely eliminate the overhead
of reserving an address range in virtual space. But mapping a
dynamically created segment (not at fork time) to the same address
really seems to be a big challenge.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Claudio Freire

Date:

09 January 2014, 22:30:45

On Thu, Jan 9, 2014 at 4:24 PM, knizhnik <knizhnik@garret.ru> wrote:
> On 01/09/2014 09:46 PM, Claudio Freire wrote:
>>
>> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> It would be nice to have better operating system support for this.
>>> For example, IIUC, 64-bit Linux has 128TB of address space available
>>> for user processes.  When you clone(), it can either share the entire
>>> address space (i.e. it's a thread) or none of it (i.e. it's a
>>> process).  There's no option to, say, share 64TB and not the other
>>> 64TB, which would be ideal for us.  We could then map dynamic shared
>>> memory segments into the shared portion of the address space and do
>>> backend-private allocations in the unshared part.  Of course, even if
>>> we had that, it wouldn't be portable, so who knows how much good it
>>> would do.  But it would be awfully nice to have the option.
>>
>> You can map a segment at fork time, and unmap it after forking. That
>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>> but I don't think that's portable).
>>
>> That guarantees it's free.
>>
>> Next, you can map shared memory at explicit addresses (linux's mmap
>> has support for that, and I seem to recall Windows did too).
>>
>> All you have to do, is some book-keeping in shared memory (so all
>> processes can coordinate new mappings).
>
> As far as I understand the main advantage of DSM is that segment can be
> allocated at any time - not only at fork time.
> And it is not because of memory consumption: even without unmap, allocation
> of some memory region doesn't cause loss of physical memory. And there are
> usually no problem with exhaustion of virtual space at 64-bit architecture.
> But using some combination of flags (as MAP_NORESERVE), it is usually
> possible to completely eliminate overhead of reserving some address range in
> virtual space. But mapping dynamically created segment (not at fork time) to
> the same address really seems to be a big challenge.

At fork time I only wrote about reserving the address space. After
reserving it, all you have to do is implement an allocator that works
in shared memory (protected by a lwlock of course).

In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
shared memory to coordinate returning an already mapped region (same
address which is guaranteed to work since we reserved that region), or
allocate one (within the reserved address space).
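
The bookkeeping behind such a hypothetical pg_dsm_alloc(region_name) could look something like the table below: a small control structure, assumed to live in regular shared memory, that hands out offsets inside the reserved range and returns the existing offset when a name is already known. Everything here (RegionTable, region_table_init, region_alloc, the fixed limits) is invented for illustration, and the lwlock that would have to protect the table is left to the caller.

```c
#include <stddef.h>
#include <string.h>

#define MAX_REGIONS 16
#define NAME_LEN    32

typedef struct RegionEntry
{
    char   name[NAME_LEN];
    size_t offset;              /* start, relative to the reserved range */
    size_t size;                /* rounded up to whole pages */
} RegionEntry;

/* Assumed to sit in the main shared memory segment, guarded by a lock. */
typedef struct RegionTable
{
    size_t      reserved_size;  /* size of the reserved address range */
    size_t      next_free;      /* first offset not yet handed out */
    int         nregions;
    RegionEntry regions[MAX_REGIONS];
} RegionTable;

static void
region_table_init(RegionTable *t, size_t reserved_size)
{
    memset(t, 0, sizeof *t);
    t->reserved_size = reserved_size;
}

/*
 * Return the offset of the named region, allocating it on first use.
 * Returns (size_t) -1 if the name is too long, the table is full, or the
 * reservation is exhausted.  The caller must hold the lock protecting *t.
 */
static size_t
region_alloc(RegionTable *t, const char *name, size_t size, size_t page)
{
    if (strlen(name) >= NAME_LEN)
        return (size_t) -1;

    /* Already allocated by some other process: same offset for everyone. */
    for (int i = 0; i < t->nregions; i++)
        if (strcmp(t->regions[i].name, name) == 0)
            return t->regions[i].offset;

    size_t rounded = (size + page - 1) / page * page;

    if (t->nregions == MAX_REGIONS ||
        rounded > t->reserved_size - t->next_free)
        return (size_t) -1;

    RegionEntry *e = &t->regions[t->nregions++];

    strcpy(e->name, name);
    e->offset = t->next_free;
    e->size = rounded;
    t->next_free += rounded;
    return e->offset;
}
```

Each process would then map the region at reserved base plus the returned offset, so every process ends up with the same address. This bump allocator never frees or reuses space, which a real design would need to handle.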

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

09 January 2014, 22:31:28

On 01/09/2014 11:09 PM, Amit Kapila wrote:
> On Thu, Jan 9, 2014 at 12:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jan 7, 2014 at 10:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> Well, right now we just reopen the same object from all of the
>>>> processes, which seems to work fine and doesn't require any of this
>>>> complexity.  The only problem I don't know how to solve is how to make
>>>> a segment stick around for the whole postmaster lifetime.  If
>>>> duplicating the handle into the postmaster without its knowledge gets
>>>> us there, it may be worth considering, but that doesn't seem like a
>>>> good reason to rework the rest of the existing mechanism.
>>> I think one has to try this to see if it works as per the need. If it's not
>>> urgent, I can try this early next week?
>> Anything we want to get into 9.4 has to be submitted by next Tuesday,
>> but I don't know that we're going to get this into 9.4.
> Using DuplicateHandle(), we can make segment stick for Postmaster
> lifetime. I have used below test (used dsm_demo module) to verify:
> Session - 1
> select dsm_demo_create('this message is from session-1');
>   dsm_demo_create
> -----------------
>         827121111
>
> Session - 2
> -----------------
> select dsm_demo_read(827121111);
>         dsm_demo_read
> ----------------------------
>   this message is from session-1
> (1 row)
>
> Session-1
> \q
>
> -- till here it will work without DuplicateHandle as well
>
> Session -2
> select dsm_demo_read(827121111);
>         dsm_demo_read
> ----------------------------
>   this message is from session-1
> (1 row)
>
> Session -2
> \q
>
> Session -3
> select dsm_demo_read(827121111);
>         dsm_demo_read
> ----------------------------
>   this message is from session-1
> (1 row)
>
> -- above shows that handle stays around.
>
> Note -
> Currently I have to bypass below code in dsm_attach(), as it assumes
> segment will not stay if it's removed from control file.
>
> /*
> * If we didn't find the handle we're looking for in the control
> * segment, it probably means that everyone else who had it mapped,
> * including the original creator, died before we got to this point.
> * It's up to the caller to decide what to do about that.
> */
> if (seg->control_slot == INVALID_CONTROL_SLOT)
> {
> dsm_detach(seg);
> return NULL;
> }
>
>
> Could you let me know what exactly you are expecting in patch,
> just a call to DuplicateHandle() after CreateFileMapping() or something
> else as well?

As far as I understand, DuplicateHandle() should really do the trick:
protect the segment from deallocation.
But should the postmaster be somehow notified about this handle?
For example, if we really want to delete this segment (drop extension),
we should somehow make the postmaster close this handle.
How can it be done?

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Amit Kapila

Date:

09 January 2014, 22:36:53

On Fri, Jan 10, 2014 at 1:00 AM, knizhnik <knizhnik@garret.ru> wrote:
> On 01/09/2014 11:09 PM, Amit Kapila wrote:
>>
>>
>> Using DuplicateHandle(), we can make segment stick for Postmaster
>> lifetime. I have used below test (used dsm_demo module) to verify:
>
> As far as I understand DuplicateHandle() should really do the trick: protect
> segment from deallocation.
> But should postmaster be somehow notified about this handle?
> For example, if we really wants to delete this segment (drop extension), we
> should somehow make Postmaster to close this handle.
> How it can be done?

I think we need to use some form of IPC to communicate it to Postmaster.
I could not think of any other way atm.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

09 January 2014, 22:39:54

On 01/09/2014 11:30 PM, Claudio Freire wrote:
> On Thu, Jan 9, 2014 at 4:24 PM, knizhnik <knizhnik@garret.ru> wrote:
>> On 01/09/2014 09:46 PM, Claudio Freire wrote:
>>> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> It would be nice to have better operating system support for this.
>>>> For example, IIUC, 64-bit Linux has 128TB of address space available
>>>> for user processes.  When you clone(), it can either share the entire
>>>> address space (i.e. it's a thread) or none of it (i.e. it's a
>>>> process).  There's no option to, say, share 64TB and not the other
>>>> 64TB, which would be ideal for us.  We could then map dynamic shared
>>>> memory segments into the shared portion of the address space and do
>>>> backend-private allocations in the unshared part.  Of course, even if
>>>> we had that, it wouldn't be portable, so who knows how much good it
>>>> would do.  But it would be awfully nice to have the option.
>>> You can map a segment at fork time, and unmap it after forking. That
>>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>>> but I don't think that's portable).
>>>
>>> That guarantees it's free.
>>>
>>> Next, you can map shared memory at explicit addresses (linux's mmap
>>> has support for that, and I seem to recall Windows did too).
>>>
>>> All you have to do, is some book-keeping in shared memory (so all
>>> processes can coordinate new mappings).
>> As far as I undersand the main advantage of DSM is that segment can be
>> allocated at any time - not only at fork time.
>> And it is not because of memory consumption: even without unmap, allocation
>> of some memory region doesn't cause loose pg physical memory. And there are
>> usually no problem with exhaustion of virtual space at 64-bit architecture.
>> But using some combination of flags (as MAP_NORESERVE), it is usually
>> possible to completely eliminate overhead of reserving some address range in
>> virtual space. But mapping dynamically created segment (not at fork time) to
>> the same address really seems to be a big challenge.
> At fork time I only wrote about reserving the address space. After
> reserving it, all you have to do is implement an allocator that works
> in shared memory (protected by a lwlock of course).
>
> In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
> shared memory to coordinate returning an already mapped region (same
> address which is guaranteed to work since we reserved that region), or
> allocate one (within the reserved address space).
Why do we need named segments? There is the ShmemAlloc function in the
PostgreSQL API.
If RequestAddinShmemSpace could be used without requiring the module to
be in the preload list, wouldn't that be enough for most extensions?
And ShmemInitHash can be used to maintain named regions if needed...

So if we have some reserved address space, do we actually need a
special allocator to allocate new segments in it?
Why is the existing shared memory API not enough?

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Claudio Freire

Date:

09 January 2014, 22:48:43

On Thu, Jan 9, 2014 at 4:39 PM, knizhnik <knizhnik@garret.ru> wrote:
>> At fork time I only wrote about reserving the address space. After
>> reserving it, all you have to do is implement an allocator that works
>> in shared memory (protected by a lwlock of course).
>>
>> In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
>> shared memory to coordinate returning an already mapped region (same
>> address which is guaranteed to work since we reserved that region), or
>> allocate one (within the reserved address space).
>
> Why do we need named segments? There is ShmemAlloc function in PostgreSQL
> API.
> If RequestAddinShmemSpace can be used without requirement to place module in
> preloaded list, then isn't it enough for most extensions?
> And ShmemInitHash can be used to maintain named regions if it is needed...

If you want to dynamically create the segments, you need some way to
identify them. That is, the name. Otherwise, RequestWhateverShmemSpace
won't know when to return an already-mapped region or not.

Mind you, the name can be a number. No need to make it a string.

> So if we have some reserved address space, do we actually need some special
> allocator for this space to allocate new segments in it?
> Why existed API to shared memory is not enough?

I don't know this existing API you mention. But I think this is quite
a specific case very unlikely to be serviced from existing APIs. You
need a data structure that can map names to regions, any hash map will
do, or even an array since one wouldn't expect it to be too big, or
require it to be too fast, and then you need to unmap the "reserve"
mapping and put a shared region there instead, before returning the
pointer to this shared region.

So, the special thing is, the book-keeping region sits in regular
shared memory, whereas the allocated regions sit in newly-created
segments. And segments are referenced by pointers (since the address
space is fixed and shared). Is there something like that already?
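
A minimal sketch of the book-keeping described above, assuming numeric
names and a simple array as suggested. All identifiers (RegionTable,
region_lookup_or_create, MAX_REGIONS) are hypothetical and not part of
PostgreSQL. The table would live in regular shared memory and be
protected by a lock, which is omitted here; the sketch only hands out
offsets inside the reserved range and does no mapping itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 16          /* hypothetical small, fixed capacity */

/* One book-keeping entry (hypothetical layout). */
typedef struct RegionEntry
{
    uint32_t name;              /* numeric name; 0 marks an unused slot */
    size_t   offset;            /* offset of the region in the reserved range */
    size_t   size;              /* region size in bytes */
} RegionEntry;

typedef struct RegionTable
{
    size_t      reserved_size;  /* total size of the reserved address range */
    size_t      next_free;      /* bump pointer: first unassigned offset */
    RegionEntry entries[MAX_REGIONS];
} RegionTable;

void
region_table_init(RegionTable *t, size_t reserved_size)
{
    t->reserved_size = reserved_size;
    t->next_free = 0;
    for (int i = 0; i < MAX_REGIONS; i++)
        t->entries[i].name = 0;
}

/*
 * Return the entry for `name`, creating it if it does not exist yet.
 * *found is set to 1 when the region was already registered, so the
 * caller knows whether a new segment still has to be created and mapped.
 * Returns NULL for name 0, a full table, or an exhausted range.
 */
RegionEntry *
region_lookup_or_create(RegionTable *t, uint32_t name, size_t size, int *found)
{
    RegionEntry *free_slot = NULL;

    *found = 0;
    if (name == 0)
        return NULL;

    for (int i = 0; i < MAX_REGIONS; i++)
    {
        RegionEntry *e = &t->entries[i];

        if (e->name == name)
        {
            *found = 1;         /* same offset for every process */
            return e;
        }
        if (e->name == 0 && free_slot == NULL)
            free_slot = e;
    }

    /* Not registered yet: carve a new region out of the reserved range. */
    if (free_slot == NULL || size > t->reserved_size - t->next_free)
        return NULL;

    free_slot->name = name;
    free_slot->offset = t->next_free;
    free_slot->size = size;
    t->next_free += size;
    return free_slot;
}
```

The bump pointer never reuses freed space, so this toy version says
nothing about fragmentation; sizes are assumed to be multiples of the
page size so that each offset can later be used as a mapping address.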

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Claudio Freire

Date:

09 January 2014, 22:50:40

On Thu, Jan 9, 2014 at 4:48 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Jan 9, 2014 at 4:39 PM, knizhnik <knizhnik@garret.ru> wrote:
>>> At fork time I only wrote about reserving the address space. After
>>> reserving it, all you have to do is implement an allocator that works
>>> in shared memory (protected by a lwlock of course).
>>>
>>> In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
>>> shared memory to coordinate returning an already mapped region (same
>>> address which is guaranteed to work since we reserved that region), or
>>> allocate one (within the reserved address space).
>>
>> Why do we need named segments? There is ShmemAlloc function in PostgreSQL
>> API.
>> If RequestAddinShmemSpace can be used without requirement to place module in
>> preloaded list, then isn't it enough for most extensions?
>> And ShmemInitHash can be used to maintain named regions if it is needed...
>
> If you want to dynamically create the segments, you need some way to
> identify them. That is, the name. Otherwise, RequestWhateverShmemSpace
> won't know when to return an already-mapped region or not.
>
> Mind you, the name can be a number. No need to make it a string.
>
>> So if we have some reserved address space, do we actually need some special
>> allocator for this space to allocate new segments in it?
>> Why existed API to shared memory is not enough?


Oh, I see where the confusion comes from now.

The "reserve" mapping I was proposing was one made with MAP_NORESERVE
and PROT_NONE.

I.e., access is forbidden, which guarantees the OS won't try to
allocate physical RAM for it.

You'd have to re-map it before using it, so it's not like a regular
shared memory region where you can simply allocate pointers and
intersperse bookkeeping data in-place.
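
A minimal Linux-only sketch of the reserve-then-remap idea, using
mmap() with the flags named above. The helper names (page_size,
reserve_range, remap_shared) are hypothetical. As a simplifying
assumption the usable mapping is anonymous, so it would only be shared
with processes forked after the remap; a real scheme would pass a file
descriptor for a named shared memory object instead.

```c
#define _GNU_SOURCE             /* exposes MAP_ANONYMOUS and MAP_NORESERVE */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* System page size; offsets and sizes below must be multiples of it. */
size_t
page_size(void)
{
    return (size_t) sysconf(_SC_PAGESIZE);
}

/*
 * Reserve `size` bytes of address space with no access rights and no
 * swap reservation. Returns NULL on failure.
 */
void *
reserve_range(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Replace part of the reservation with a readable and writable shared
 * mapping. MAP_FIXED makes mmap() discard whatever was mapped in the
 * target range, so no separate munmap() call is needed first.
 */
void *
remap_shared(void *base, size_t offset, size_t size)
{
    void *want = (char *) base + offset;
    void *p = mmap(want, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

Until remap_shared() is called on a range, any access to it faults,
which is why bookkeeping data cannot simply be placed inside the
reservation.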

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

knizhnik

Date:

09 January 2014, 23:19:17

On 01/09/2014 11:48 PM, Claudio Freire wrote:
> On Thu, Jan 9, 2014 at 4:39 PM, knizhnik <knizhnik@garret.ru> wrote:
>>> At fork time I only wrote about reserving the address space. After
>>> reserving it, all you have to do is implement an allocator that works
>>> in shared memory (protected by a lwlock of course).
>>>
>>> In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
>>> shared memory to coordinate returning an already mapped region (same
>>> address which is guaranteed to work since we reserved that region), or
>>> allocate one (within the reserved address space).
>> Why do we need named segments? There is ShmemAlloc function in PostgreSQL
>> API.
>> If RequestAddinShmemSpace can be used without requirement to place module in
>> preloaded list, then isn't it enough for most extensions?
>> And ShmemInitHash can be used to maintain named regions if it is needed...
> If you want to dynamically create the segments, you need some way to
> identify them. That is, the name. Otherwise, RequestWhateverShmemSpace
> won't know when to return an already-mapped region or not.
>
> Mind you, the name can be a number. No need to make it a string.
>
>> So if we have some reserved address space, do we actually need some special
>> allocator for this space to allocate new segments in it?
>> Why existed API to shared memory is not enough?
> I don't know this existing API you mention. But I think this is quite
> a specific case very unlikely to be serviced from existing APIs. You
> need a data structure that can map names to regions, any hash map will
> do, or even an array since one wouldn't expect it to be too big, or
> require it to be too fast, and then you need to unmap the "reserve"
> mapping and put a shared region there instead, before returning the
> pointer to this shared region.
>
> So, the special thing is, the book-keeping region sits in regular
> shared memory, whereas the allocated regions sit in newly-created
> segments. And segments are referenced by pointers (since the address
> space is fixed and shared). Is there something like that already?
By the existing API I mostly mean these six functions:

RequestAddinShmemSpace()
RequestAddinLWLocks()
ShmemInitStruct()
LWLockAssign()
ShmemAlloc()
ShmemInitHash()

If it becomes possible to use these functions without requiring the
module to be listed in "shared_preload_libraries", then do we
really need DSM?
And that could be achieved by
1. Reserving address space (as you suggested)
2. Reserving some fixed number of free LWLocks (not very many, < 100).

I have nothing against creating a dedicated allocator of named
shared memory segments within the reserved address space.
I am just not sure it is actually needed. In some sense
RequestAddinShmemSpace() can be such an allocator.

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Jim Nasby

Date:

10 January 2014, 00:05:00

On 1/9/14, 1:18 PM, knizhnik wrote:
> So it is clear why do we need shared memory for parallel query
> execution. But why it has to be dynamic? Why it can not be
> preallocated at start time as most of other resources used by
> PostgreSQL?

That would limit us to doing something like allocating a fixed maximum
of parallel processes (which might be workable) and only allocating a
very small amount of memory for IPC. Small as in can only handle a
small number of tuples. That sounds like a really inefficient way to
shuffle data to and from parallel processes, especially because one or
both sides would probably have to actually copy the data if we're
doing it that way.

With DSM if you want to do something like a parallel sort each process
can put their results into memory that the parent process can directly
access.

Of course the other enormous win for DSM is it's the foundation for
finally being able to resize things without a restart. For large
dollar sites that ability would be hugely beneficial.
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

10 January 2014, 21:23:25

On Thu, Jan 9, 2014 at 12:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> It would be nice to have better operating system support for this.
>> For example, IIUC, 64-bit Linux has 128TB of address space available
>> for user processes.  When you clone(), it can either share the entire
>> address space (i.e. it's a thread) or none of it (i.e. it's a
>> process).  There's no option to, say, share 64TB and not the other
>> 64TB, which would be ideal for us.  We could then map dynamic shared
>> memory segments into the shared portion of the address space and do
>> backend-private allocations in the unshared part.  Of course, even if
>> we had that, it wouldn't be portable, so who knows how much good it
>> would do.  But it would be awfully nice to have the option.
>
> You can map a segment at fork time, and unmap it after forking. That
> doesn't really use RAM, since it's supposed to be lazily allocated (it
> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
> but I don't think that's portable).
>
> That guarantees it's free.

It guarantees that it is free as of the moment you unmap it, but it
doesn't guarantee that future memory allocations or shared library
loads couldn't stomp on the space.

Also, that not-portable thing is a bit of a problem.  I've got no
problem with the idea that third-party code may be platform-specific,
but I think the stuff we ship in core has got to work on more or less
all reasonably modern systems.

> Next, you can map shared memory at explicit addresses (linux's mmap
> has support for that, and I seem to recall Windows did too).
>
> All you have to do, is some book-keeping in shared memory (so all
> processes can coordinate new mappings).

I did something like this back in 1998 or 1999 at the operating system
level, and it turned out not to work very well.  I was working on an
experimental research operating system kernel, and we wanted to add
support for mmap(), so we set aside a portion of the virtual address
space for file mappings.  That region was shared across all processes
in the system.  One problem is that there's no guarantee the space is
big enough for whatever you want to map; and the other problem is that
it can easily get fragmented.  Now, 64-bit address spaces go some way
to ameliorating these concerns so maybe it can be made to work, but I
would be a teeny bit cautious about using the word "just" to describe
the complexity involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

10 January 2014, 21:25:30

On Thu, Jan 9, 2014 at 2:09 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Jan 9, 2014 at 12:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jan 7, 2014 at 10:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Tue, Jan 7, 2014 at 2:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>>
>>>> Well, right now we just reopen the same object from all of the
>>>> processes, which seems to work fine and doesn't require any of this
>>>> complexity.  The only problem I don't know how to solve is how to make
>>>> a segment stick around for the whole postmaster lifetime.  If
>>>> duplicating the handle into the postmaster without its knowledge gets
>>>> us there, it may be worth considering, but that doesn't seem like a
>>>> good reason to rework the rest of the existing mechanism.
>>>
>>> I think one has to try this to see if it works as per the need. If it's not
>>> urgent, I can try this early next week?
>>
>> Anything we want to get into 9.4 has to be submitted by next Tuesday,
>> but I don't know that we're going to get this into 9.4.
>
> Using DuplicateHandle(), we can make segment stick for Postmaster
> lifetime. I have used below test (used dsm_demo module) to verify:
> Session - 1
> select dsm_demo_create('this message is from session-1');
>  dsm_demo_create
> -----------------
>        827121111
>
> Session - 2
> select dsm_demo_read(827121111);
>        dsm_demo_read
> ----------------------------
>  this message is from session-1
> (1 row)
>
> Session-1
> \q
>
> -- till here it will work without DuplicateHandle as well
>
> Session -2
> select dsm_demo_read(827121111);
>        dsm_demo_read
> ----------------------------
>  this message is from session-1
> (1 row)
>
> Session -2
> \q
>
> Session -3
> select dsm_demo_read(827121111);
>        dsm_demo_read
> ----------------------------
>  this message is from session-1
> (1 row)
>
> -- above shows that handle stays around.
>
> Note -
> Currently I have to bypass below code in dsm_attach(), as it assumes
> segment will not stay if it's removed from control file.
>
> /*
>  * If we didn't find the handle we're looking for in the control
>  * segment, it probably means that everyone else who had it mapped,
>  * including the original creator, died before we got to this point.
>  * It's up to the caller to decide what to do about that.
>  */
> if (seg->control_slot == INVALID_CONTROL_SLOT)
> {
>     dsm_detach(seg);
>     return NULL;
> }
>
>
> Could you let me know what exactly you are expecting in patch,
> just a call to DuplicateHandle() after CreateFileMapping() or something
> else as well?

Well, I guess what I was thinking is that we could have a call
dsm_keep_segment() which would be invoked on an already-created
dsm_segment *.  On Linux, that would just bump the reference count in
the control segment up by one so that it doesn't get destroyed until
postmaster shutdown.  On Windows it may as well still do that for
consistency, but will also need to do this DuplicateHandle() trick.
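
A toy model of the reference-count part of this proposal, to make the
mechanism concrete. It is not PostgreSQL code: the structures and the
toy_* functions are hypothetical stand-ins for the control segment, and
the Windows DuplicateHandle() step is not modeled. The only point shown
is that one extra reference with no matching detach keeps the slot
alive after every backend has detached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SEGMENTS 8      /* hypothetical capacity */

typedef struct ToyControlSlot
{
    uint32_t handle;            /* segment identifier; 0 marks an unused slot */
    uint32_t refcnt;            /* references keeping the segment alive */
} ToyControlSlot;

/* Stand-in for the control segment; must start zero-initialized. */
typedef struct ToyControl
{
    ToyControlSlot slots[TOY_MAX_SEGMENTS];
} ToyControl;

/* Find the slot for `handle`, or NULL if the segment does not exist. */
ToyControlSlot *
toy_find(ToyControl *c, uint32_t handle)
{
    if (handle == 0)
        return NULL;
    for (int i = 0; i < TOY_MAX_SEGMENTS; i++)
        if (c->slots[i].handle == handle)
            return &c->slots[i];
    return NULL;
}

/* Create a segment; the creator holds the first reference. */
int
toy_create(ToyControl *c, uint32_t handle)
{
    if (handle == 0 || toy_find(c, handle) != NULL)
        return -1;
    for (int i = 0; i < TOY_MAX_SEGMENTS; i++)
    {
        if (c->slots[i].handle == 0)
        {
            c->slots[i].handle = handle;
            c->slots[i].refcnt = 1;
            return 0;
        }
    }
    return -1;                  /* no free slot */
}

/* Another backend attaches: take one reference. */
int
toy_attach(ToyControl *c, uint32_t handle)
{
    ToyControlSlot *s = toy_find(c, handle);

    if (s == NULL)
        return -1;
    s->refcnt++;
    return 0;
}

/* Drop one reference; the last detach destroys the segment. */
int
toy_detach(ToyControl *c, uint32_t handle)
{
    ToyControlSlot *s = toy_find(c, handle);

    if (s == NULL)
        return -1;
    if (--s->refcnt == 0)
        s->handle = 0;
    return 0;
}

/* The "keep" operation: one extra reference that nobody ever drops. */
int
toy_keep_segment(ToyControl *c, uint32_t handle)
{
    return toy_attach(c, handle);
}
```

In this model the kept segment disappears only when the whole
ToyControl does, which corresponds to postmaster shutdown in the
proposal.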

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Claudio Freire

Date:

10 January 2014, 21:35:39

On Fri, Jan 10, 2014 at 3:23 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jan 9, 2014 at 12:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> It would be nice to have better operating system support for this.
>>> For example, IIUC, 64-bit Linux has 128TB of address space available
>>> for user processes.  When you clone(), it can either share the entire
>>> address space (i.e. it's a thread) or none of it (i.e. it's a
>>> process).  There's no option to, say, share 64TB and not the other
>>> 64TB, which would be ideal for us.  We could then map dynamic shared
>>> memory segments into the shared portion of the address space and do
>>> backend-private allocations in the unshared part.  Of course, even if
>>> we had that, it wouldn't be portable, so who knows how much good it
>>> would do.  But it would be awfully nice to have the option.
>>
>> You can map a segment at fork time, and unmap it after forking. That
>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>> but I don't think that's portable).
>>
>> That guarantees it's free.
>
> It guarantees that it is free as of the moment you unmap it, but it
> doesn't guarantee that future memory allocations or shared library
> loads couldn't stomp on the space.

You would only unmap prior to remapping, only the to-be-mapped
portion, so I don't see a problem.

> Also, that not-portable thing is a bit of a problem.  I've got no
> problem with the idea that third-party code may be platform-specific,
> but I think the stuff we ship in core has got to work on more or less
> all reasonably modern systems.
>
>> Next, you can map shared memory at explicit addresses (linux's mmap
>> has support for that, and I seem to recall Windows did too).
>>
>> All you have to do, is some book-keeping in shared memory (so all
>> processes can coordinate new mappings).
>
> I did something like this back in 1998 or 1999 at the operating system
> level, and it turned out not to work very well.  I was working on an
> experimental research operating system kernel, and we wanted to add
> support for mmap(), so we set aside a portion of the virtual address
> space for file mappings.  That region was shared across all processes
> in the system.  One problem is that there's no guarantee the space is
> big enough for whatever you want to map; and the other problem is that
> it can easily get fragmented.  Now, 64-bit address spaces go some way
> to ameliorating these concerns so maybe it can be made to work, but I
> would be a teeny bit cautious about using the word "just" to describe
> the complexity involved.

Ok, yes, fragmentation could be an issue if the address range is not
"humongous enough".

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Robert Haas

Date:

10 January 2014, 21:51:12

On Fri, Jan 10, 2014 at 1:35 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>> You can map a segment at fork time, and unmap it after forking. That
>>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>>> but I don't think that's portable).
>>>
>>> That guarantees it's free.
>>
>> It guarantees that it is free as of the moment you unmap it, but it
>> doesn't guarantee that future memory allocations or shared library
>> loads couldn't stomp on the space.
>
> You would only unmap prior to remapping, only the to-be-mapped
> portion, so I don't see a problem.

OK, yeah, that way works.  That's more or less what Noah proposed
before.  But I was skeptical it would work well everywhere.  I suppose
we won't know until somebody tries it.  (I didn't.)

> Ok, yes, fragmentation could be an issue if the address range is not
> "humongous enough".

I've often thought that 64-bit machines are so capable that there's no
reason to go any higher.  But lately I've started to wonder.  There
are already machines out there with >2^40 bytes of physical memory,
and the number just keeps creeping up.  When you reserve a couple of
bits to indicate user or kernel space, and then consider that virtual
address space can be many times larger than physical memory, it starts
not to seem like that much.

But I'm not that excited about the amount of additional memory we'll
eat when somebody decides to make a pointer 16 bytes.  Ugh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From

Tom Lane

Date:

10 January 2014, 22:02:22

Robert Haas <robertmhaas@gmail.com> writes:
> I've often thought that 64-bit machines are so capable that there's no
> reason to go any higher.  But lately I've started to wonder.  There
> are already machines out there with >2^40 bytes of physical memory,
> and the number just keeps creeping up.  When you reserve a couple of
> bits to indicate user or kernel space, and then consider that virtual
> address space can be many times larger than physical memory, it starts
> not to seem like that much.

> But I'm not that excited about the amount of additional memory we'll
> eat when somebody decides to make a pointer 16 bytes.  Ugh.

Once you really need that, you're not going to care about doubling
the size of pointers.  At worst, you're giving up 1 bit of address
space to gain 64 more.

(Still, I rather doubt it'll happen in my lifetime.)

        regards, tom lane
