Re: [HACKERS] PATCH: two slab-like memory allocators

From: Andres Freund
Subject: Re: [HACKERS] PATCH: two slab-like memory allocators
Date: 2017-02-14 02:22:21
Msg-id: 20170214022221.obhdeyfn34h6hkwj@alap3.anarazel.de
In reply to: Re: [HACKERS] PATCH: two slab-like memory allocators  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: [HACKERS] PATCH: two slab-like memory allocators  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
Hi,

On 2017-02-11 14:40:18 +0100, Tomas Vondra wrote:
> On 02/11/2017 02:33 AM, Andres Freund wrote:
> > > I have a hard time believing that the cache efficiency of linked lists
> > > (which may or may not be real in this case) outweighs this, but if you
> > > want to try, be my guest.
> > 
> > I'm not following - why would there be overhead in anything for
> > allocations bigger than 4 (or maybe 8) bytes? You can store the list
> > (via chunk ids, not pointers) inside the chunks themselves, where the
> > data would otherwise be.  And I don't see why you'd need a doubly
> > linked list, as the only operations needed are to push to the front
> > of the list and to pop from the front - and both are simple to do
> > with a singly linked list?
> > 
> 
> Oh! I had not considered storing the chunk indexes (for the linked lists)
> in the chunks themselves, which obviously eliminates the overhead concerns,
> and you're right that a singly-linked list should be enough.
>
> There will be some minimum-chunk-size requirement, depending on the block
> size/chunk size. I wonder whether it makes sense to try to be smart and make
> it dynamic, so that we only require 1B or 2B for cases when only that many
> chunks fit into a block, or just say that it's 4B and be done with it.

I doubt it's worth it - it seems likely that the added branch is more
noticeable overall than the possible savings of 3 bytes. Also, won't the
space be lost due to alignment *anyway*?
+    /* chunk, including SLAB header (both addresses nicely aligned) */
+    fullChunkSize = MAXALIGN(sizeof(SlabChunkData) + MAXALIGN(chunkSize));
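(To make that concrete - with MAXIMUM_ALIGNOF of 8 and assuming, purely
for illustration, sizeof(SlabChunkData) == 16: chunkSize = 4 gives
fullChunkSize = MAXALIGN(16 + MAXALIGN(4)) = 24, i.e. the payload is
padded out to 8 bytes whether or not we store anything in it, so a
pointer-sized link costs nothing.)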

In that case I'd just Assert(MAXIMUM_ALIGNOF >= sizeof(slist_head)) and
use a plain slist - no point in being more careful than that.
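For illustration, a minimal sketch of that scheme (hypothetical names, not
the actual patch - just the existing slist API from lib/ilist.h, with the
list threaded through the free chunks themselves):

#include "postgres.h"
#include "lib/ilist.h"

/* hypothetical per-block bookkeeping; slist_init() is assumed to have
 * run when the block was created */
typedef struct SlabBlockSketch
{
    slist_head  freelist;   /* singly-linked list threaded through free chunks */
    int         nfree;      /* number of free chunks in this block */
} SlabBlockSketch;

static void
sketch_free_chunk(SlabBlockSketch *block, void *chunk)
{
    /* the freed chunk's own payload holds the list link */
    slist_push_head(&block->freelist, (slist_node *) chunk);
    block->nfree++;
}

static void *
sketch_alloc_chunk(SlabBlockSketch *block)
{
    slist_node *node;

    if (slist_is_empty(&block->freelist))
        return NULL;        /* caller has to pick another block */

    node = slist_pop_head_node(&block->freelist);
    block->nfree--;
    return (void *) node;   /* the link's memory becomes user data again */
}

Both operations are O(1) head pushes/pops, which is why a singly linked
list suffices, and the Assert above is what makes the casts safe: every
MAXALIGN'd payload is at least large enough for the one-pointer link.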


> I mean 2^32 chunks ought to be enough for anyone, right?

Yea, that seems enough; but given the alignment thing pointed out above,
I think we can just use plain pointers - and that definitely should be
enough :P


> I'm still not buying the cache efficiency argument, though. One of the
> reasons is that the implementation prefers blocks with fewer free chunks
> when handling palloc(), so pfree() is making the block less likely to be
> chosen by the next palloc().

That'll possibly de-optimize L1, but for L2 usage the higher density
seems like it'll be a win.  All this memory is only accessed by a single
backend, so packing as densely as possible is good.


> > If so, if you have e.g. 8 byte allocations and 64kb sized blocks, you
> > end up with an array of 1024 doubly linked lists, which'll take up 64kb
> > on its own.  And there are certainly scenarios where even bigger block
> > sizes could make sense.   That's both memory overhead, and runtime
> > overhead, because at reset-time we'll have to check the whole array
> > (which'll presumably largely be empty lists).  Resetting is a pretty
> > common path...
> > 
> 
> True, but it's not entirely clear if resetting is common for the paths where
> we use those new allocators.

That's fair enough.  There's still the memory overhead, but I guess we
can also live with that.


> Also, if we accept that it might be a problem, what other solution do you
> propose? I don't think just merging everything into a single list is a good
> idea, for the reasons I explained before (it might make the resets somewhat
> less expensive, but it'll make pfree() more expensive).

Now that I think about it, a binary heap, as suggested elsewhere, isn't
entirely trivial to use for this - it's more or less trivial to "fix"
the heap after changing an element's value, but it's harder to find that
element first.

But a two-level list approach seems like it could work quite well -
basically a simplified skip-list.  A top-level list in which every
element has a distinct #free, with a sub-list hanging off each element
for all the other blocks with the same number of free chunks.

I think that'd be a better implementation, but I can understand if you
don't immediately want to go there.
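
To sketch what I mean (made-up names and layout, using the dlist API from
lib/ilist.h; a real version would need more care):

#include "postgres.h"
#include "lib/ilist.h"

/* one top-level entry per distinct free-chunk count */
typedef struct FreeCountBucket
{
    dlist_node  bucket_node;    /* link in the context's top-level list,
                                 * kept sorted by nfree ascending */
    int         nfree;          /* the distinct free-chunk count */
    dlist_head  blocks;         /* all blocks with exactly nfree free chunks */
} FreeCountBucket;

typedef struct BlockEntry
{
    dlist_node  block_node;     /* link in its bucket's block list */
    FreeCountBucket *bucket;    /* back-pointer to the current bucket */
} BlockEntry;

/*
 * Move a block to the (nfree + 1) bucket after a pfree().  A single
 * pfree() changes the count by exactly one, so only the neighbouring
 * top-level entry ever needs to be inspected - that's what keeps this
 * O(1), unlike a flat list-of-lists.
 */
static void
bucket_promote(dlist_head *toplevel, BlockEntry *block)
{
    FreeCountBucket *cur = block->bucket;
    FreeCountBucket *next = NULL;

    if (dlist_has_next(toplevel, &cur->bucket_node))
    {
        FreeCountBucket *cand =
            dlist_container(FreeCountBucket, bucket_node,
                            dlist_next_node(toplevel, &cur->bucket_node));

        if (cand->nfree == cur->nfree + 1)
            next = cand;
    }

    if (next == NULL)
    {
        /* no neighbour with the right count yet, create one */
        next = (FreeCountBucket *) palloc(sizeof(FreeCountBucket));
        next->nfree = cur->nfree + 1;
        dlist_init(&next->blocks);
        dlist_insert_after(&cur->bucket_node, &next->bucket_node);
    }

    dlist_delete(&block->block_node);
    dlist_push_head(&next->blocks, &block->block_node);
    block->bucket = next;

    /* drop the old bucket once its last block has moved away */
    if (dlist_is_empty(&cur->blocks))
    {
        dlist_delete(&cur->bucket_node);
        pfree(cur);
    }
}

The symmetric case for palloc() looks at the previous entry instead; the
point is just that a block's freecount only ever moves by one, so both
directions stay O(1).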


> What might work is replacing the array with a list, though. So we'd have a
> list of lists, which would eliminate the array overhead.

That seems likely to be significantly worse, because a) iteration is
more expensive, and b) finding the relevant list when moving a block
between two different "freecount" lists would be O(n).

- Andres


