Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
From | Robert Haas |
---|---|
Subject | Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL |
Date | |
Msg-id | CA+TgmoYPec_Awn+NM-ETnzOwyiYMmH-JaH1-LDOvFDqsFojsTw@mail.gmail.com |
In reply to | Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL (knizhnik <knizhnik@garret.ru>) |
Responses |
Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
(james <james@mansionfamily.plus.com>)
Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL (knizhnik <knizhnik@garret.ru>) |
List | pgsql-hackers |
On Sat, Jan 4, 2014 at 3:27 PM, knizhnik <knizhnik@garret.ru> wrote:
> 1. I want IMCS to work with PostgreSQL versions not supporting DSM (dynamic
> shared memory), like 9.2, 9.3.1,...

Yeah. If it's loaded at postmaster start time, then it can work with
any version. On 9.4+, you could possibly make it work even if it's
loaded on the fly by using the dynamic shared memory facilities.
However, there are currently some limitations to those facilities that
make some things you might want to do tricky. There are pending
patches to lift some of these limitations.

> 2. IMCS is using PostgreSQL hash table implementation (ShmemInitHash,
> hash_search,...)
> May be I missed something - I just noticed DSM and have no chance to
> investigate it, but looks like hash table can not be allocated in DSM...

It wouldn't be very difficult to write an analog of ShmemInitHash() on
top of the dsm_toc patch that is currently pending. A problem,
though, is that it's not currently possible to put LWLocks in dynamic
shared memory, and even spinlocks will be problematic if
--disable-spinlocks is used. I'm due to write a post about these
problems; perhaps I should go do that.

> 3. IMCS is allocating memory using ShmemAlloc. In case of using DSM I have
> to provide own allocator (although creation of non-releasing memory
> allocator should not be a big issue).

The dsm_toc infrastructure would solve this problem.

> 4. Current implementation of DSM still suffers from 256Gb problem. Certainly
> I can create multiple segments and so provide workaround without using huge
> pages, but it complicates allocator.

So it sounds like DSM should also support huge pages somehow. I'm not
sure what that should look like.

> 5. I wonder if I dynamically add new DSM segment - will it be available for
> other PostgreSQL processes? For example I run query which loads data in IMCS
> and so needs more space and allocates new DSM segment.
> Then another query is executed by other PostgreSQL process which tries to
> access this data. This process is not forked from the process created this
> new DSM segment, so I do not understand how this segment will be mapped to
> the address space of this process, preserving address... Certainly I can
> prohibit dynamic extension of IMCS storage (hoping that in this case there
> will be no such problem with DSM). But in this case we will loose the main
> advantage of using DSM instead of old schema of plugin's private shared
> memory.

You can definitely dynamically add a new DSM segment; that's the point
of making it *dynamic* shared memory. What's a bit tricky as things
stand today is making sure that it sticks around. The current model
is that the DSM segment is destroyed when the last process unmaps it.
It would be easy enough to lift that limitation on systems other than
Windows; we could just add a dsm_keep_until_shutdown() API or
something similar. But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there. The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.

> 6. IMCS has some configuration parameters which has to be set through
> postgresql.conf. So in any case user has to edit postgresql.conf file.
> In case of using DSM it will be not necessary to add IMCS to
> shared_preload_libraries list. But I do not think that it is so restrictive
> and critical requirement, is it?

I don't really see a problem here. One of the purposes of dynamic
shared memory (and dynamic background workers) is precisely that you
don't *necessarily* need to put extensions that use shared memory in
shared_preload_libraries - or in other words, you can add the
extension to a running server without restarting it.
If you know in advance that you will want it, you probably still *want* to
put it in shared_preload_libraries, but part of the idea is that we can
get away from requiring that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
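
The visibility and lifetime behaviour described in the answer to point 5 can be illustrated outside PostgreSQL. What follows is a minimal sketch and not PostgreSQL's DSM API: it uses Python's standard multiprocessing.shared_memory module as a stand-in for OS-level named shared memory, and the helper names (create_segment, attach_and_read, segment_exists, demo) are hypothetical. It shows that a segment created on demand is reached through a handle (here, its name) rather than through fork inheritance, and that it no longer exists once it has been unmapped and removed.

```python
from multiprocessing import shared_memory


def create_segment(size):
    """Create a new named segment on demand (hypothetical helper)."""
    return shared_memory.SharedMemory(create=True, size=size)


def attach_and_read(handle, length):
    """Attach by handle, as an unrelated process would, copy bytes out, detach."""
    seg = shared_memory.SharedMemory(name=handle)
    try:
        return bytes(seg.buf[:length])
    finally:
        seg.close()


def segment_exists(handle):
    """Return True if a segment with this handle can still be attached."""
    try:
        seg = shared_memory.SharedMemory(name=handle)
    except FileNotFoundError:
        return False
    seg.close()
    return True


def demo():
    owner = create_segment(64)
    handle = owner.name              # all another process needs to know
    owner.buf[:5] = b"hello"

    seen = attach_and_read(handle, 5)
    alive_while_mapped = segment_exists(handle)

    owner.close()                    # unmap the last long-lived mapping
    owner.unlink()                   # POSIX: explicit removal of the name
    alive_after = segment_exists(handle)
    return seen, alive_while_mapped, alive_after
```

On POSIX systems the segment in this sketch persists until unlink() is called. The Python documentation notes that unlink() has no effect on Windows, where a block goes away once all handles to it are closed, which is the same operating-system behaviour the message above identifies as the obstacle to keeping a segment alive there.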