Re: dynamic shared memory
From: Noah Misch
Subject: Re: dynamic shared memory
Date:
Msg-id: 20130903033150.GA119849@tornado.leadboat.com
In reply to: Re: dynamic shared memory (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
On Tue, Sep 03, 2013 at 12:52:22AM +0200, Andres Freund wrote:
> On 2013-09-01 12:07:04 -0400, Noah Misch wrote:
> > On Sun, Sep 01, 2013 at 05:08:38PM +0200, Andres Freund wrote:
> > > On 2013-09-01 09:24:00 -0400, Noah Misch wrote:
> > > > The difficulty depends on whether processes other than the segment's creator
> > > > will attach anytime or only as they start.  Attachment at startup is enough
> > > > for parallel query, but it's not enough for something like lock table
> > > > expansion.  I'll focus on the attach-anytime case since it's more general.
> > >
> > > Even on startup it might get more complicated than one immediately
> > > imagines on EXEC_BACKEND type platforms because their memory layout
> > > doesn't need to be the same. The more shared memory you need, the harder
> > > that will be. Afair
> >
> > Non-Windows EXEC_BACKEND is already facing a dead end that way.

> Not sure whether you mean non-windows EXEC_BACKEND isn't going to be
> supported for much longer or that it already has problems.

It already has problems: ASLR measures sometimes prevent reattachment of the
main shared memory segment.  Multiplying the combined size of our
fixed-address mappings does not push us over some threshold where this becomes
a problem, because it is already a problem.

> > > Note that allocating a large mapping, even without using it, has
> > > noticeable cost, at least under linux. The kernel has to create & copy
> > > data to track each page's state (without copying the memory contents
> > > themselves due to COW) for every fork afterwards.

> So, after reading up on the issue a bit more and reading some more
> kernel code, a large mmap(PROT_NONE, MAP_PRIVATE) won't cause much
> problems except counting in ulimit -v. It will *not* cause overcommit
> violations. mmap(PROT_NONE, MAP_SHARED) will though, even if not yet
> faulted.
> Which means that to be reliable and not violate overcommit we'd
> need to munmap() a chunk of PROT_NONE, MAP_PRIVATE memory, and
> immediately (without interceding mallocs, using mmap itself) map it again.
>
> It only gets really expensive in the sense of making fork expensive if
> you set protections on many regions in that mapping individually. Each
> mprotect() call will split the VMA into distinct pieces and they won't
> get merged even if there are neighbors with the same settings.

Thanks for researching that.

> > > > I don't foresee fundamental differences on 32-bit.  All the allocation
> > > > maximums scale down, but that's the usual story for 32-bit.
> > >
> > > If you actually want to allocate memory after starting up, without
> > > carving a section out for that from the beginning, memory
> > > fragmentation will make it very hard to find memory addresses that are
> > > the same across processes.
> >
> > True.  I wouldn't feel bad if total dynamic shared memory usage above, say,
> > 256 MiB were unreliable on 32-bit.  If you're still running 32-bit in 2015,
> > you probably have a low-memory platform.

> Not sure. I think that will partially depend on whether x32 will have
> any success which I still find hard to judge.

I won't hold my breath for x32 becoming a common platform for high-memory
database servers, regardless of other successes it might find.  Not
impossible, but I recommend placing trivial priority on maximizing performance
for that scenario.

> > I think the take-away is that we have a lot of knobs available, not a bright
> > line between possible and impossible.  Robert opted to omit provision for
> > reliable fixed addresses, and the upsides of that decision are the absence of
> > a DBA-unfriendly space-reservation GUC, trivial overhead when the APIs are not
> > used, and a clearer portability outlook.
> I guess my point is that if we want to develop stuff that requires
> reliable addresses, we should build support for that from a low level
> up. Not rely on a hack^Wlayer on top of the actual dynamic shared memory
> API.
>
> That is, it should be a flag to dsm_create() that we require a fixed
> address and dsm_attach() will then automatically use that or die
> trying. Requiring implementations to take care about passing addresses
> around and fiddling with mmap/windows api to make sure those mappings
> are possible doesn't strike me to be a good idea.

I agree.

> In the end, you're going to be the primary/first user as far as I
> understand things, so you'll have to argue whether we need fixed
> addresses or not. I don't think it's a good idea to forgo this decision
> on this layer and bolt on another on top if we decide it's necessary.

We don't need fixed addresses.  Parallel internal sort will probably include
the equivalent of a SortTuple array in its shared memory segment, and that
implies relative pointers to the tuples also stored in shared memory.  I
expect that wart to be fairly isolated within the code, so little harm done.
I don't think we will have at all painted ourselves into a corner, should we
wish to lift the limitation later.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com