Re: sb_alloc: a new memory allocator for PostgreSQL

From: Robert Haas
Subject: Re: sb_alloc: a new memory allocator for PostgreSQL
Date:
Msg-id: CA+TgmobCZsVsoUtSLZ_xL5u-WwmjywvQtZWRULtBETey8dJPNQ@mail.gmail.com
In response to: Re: sb_alloc: a new memory allocator for PostgreSQL  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
On Tue, May 6, 2014 at 9:25 AM, Greg Stark <stark@mit.edu> wrote:
> On Tue, May 6, 2014 at 2:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>  I also didn't find anything that looked like our
>> memory context paradigm, and in particular the ability to cheaply
>> reset a context, in any other allocator.
>
> You probably knew this but just in case the term for this strategy is
> called a "ripcord allocator". I believe GCC had one, not sure if it
> still uses it. I doubt any existing ones are especially helpful though
> compared to looking at regular allocators and seeing how to integrate
> them.

Actually, I hadn't heard that term.

> I assume you're also aware of lock-free data structures and the like.
> A memory allocator seems like something that needs to be super aware
> of causing concurrency bottlenecks since it has no idea where it'll be
> used.

Lock-free data structures are cool, but you tend to pay for contention
avoidance with larger constant overheads.  Earlier versions of this
included more provisions for reducing locking bottlenecks than the
current version; that stuff ended up being surprisingly expensive and
I had to rip it out.  There are a few remaining vestiges that need to
be cleaned up yet.  What I'm thinking is probably cheap enough to live
with - and what the patch roughly contemplates right now - is one
lwlock per size class; so shared allocations can proceed in parallel
if they're for objects of different sizes, but not otherwise.  If we
need an allocator that is better-optimized for concurrency than that,
then I think that's going to need to be a separate effort, and the
result will be something that nobody will want to use for
backend-private allocations, because the overhead in the non-contended
case will be too high.
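For illustration only (this is not code from the patch): a minimal standalone
C sketch of the "one lock per size class" scheme described above, with pthread
mutexes standing in for LWLocks and every name invented for the example.  The
point is just that two concurrent shared allocations contend only when they map
to the same size class.

    /* Illustrative sketch: one lock per size class, so concurrent shared
     * allocations contend only when they request the same class.  Names and
     * structure are hypothetical, not taken from the sb_alloc patch; a
     * pthread mutex stands in for a PostgreSQL LWLock. */
    #include <pthread.h>
    #include <stddef.h>

    #define NUM_SIZE_CLASSES 32

    typedef struct SizeClass
    {
        pthread_mutex_t lock;       /* protects this class's free list */
        void           *freelist;   /* head of a singly linked free list */
        size_t          chunk_size; /* allocation size served by this class */
    } SizeClass;

    static SizeClass size_classes[NUM_SIZE_CLASSES];

    /* One-time setup: initialize each class's lock and chunk size. */
    void
    shared_alloc_init(void)
    {
        for (int i = 0; i < NUM_SIZE_CLASSES; i++)
        {
            pthread_mutex_init(&size_classes[i].lock, NULL);
            size_classes[i].freelist = NULL;
            size_classes[i].chunk_size = (size_t) (i + 1) * 8;
        }
    }

    /* Map a request size to a size class index (simplistic: 8-byte steps). */
    static int
    size_class_index(size_t size)
    {
        int idx = (int) ((size + 7) / 8) - 1;
        return idx < NUM_SIZE_CLASSES ? idx : -1;
    }

    void *
    shared_alloc(size_t size)
    {
        int         idx = size_class_index(size);
        SizeClass  *sc;
        void       *result = NULL;

        if (idx < 0)
            return NULL;            /* oversized requests handled elsewhere */

        sc = &size_classes[idx];
        pthread_mutex_lock(&sc->lock);      /* LWLockAcquire() in the backend */
        if (sc->freelist != NULL)
        {
            result = sc->freelist;
            sc->freelist = *(void **) result;   /* pop the free-list head */
        }
        pthread_mutex_unlock(&sc->lock);    /* LWLockRelease() */

        /* On an empty free list a real allocator would refill from a
         * superblock; omitted here. */
        return result;
    }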

I don't expect concurrency to be a big problem for the problems I
intend to solve in the medium term.  For parallel internal sort, I
actually only need one backend at a time to be able to allocate from
the shared memory segment; the other backends just move data around,
without actually allocating anything.  So the locking is a non-issue.
Parallel external sort might be different; e.g. you can imagine the
user backend pushing tuples into the DSM and worker backends pulling
them out and writing them to tuplestores, or something like that.
Similarly, you can imagine something like a parallel hash table build
with multiple inserters working at once.  But I don't want to spend too
much time worrying about concurrency for future projects at the
expense of solving problems nearer at hand; one could argue that I've
gone too far in that direction already.
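To make the external-sort shape above concrete, here is a hedged, standalone C
sketch of a single producer handing tuples to multiple consumers through a
lock-protected ring buffer; it is only an illustration of the handoff pattern,
not the DSM machinery itself, and all names are hypothetical.

    /* Illustrative sketch: one user backend pushes tuples, worker backends
     * pull them out.  A mutex-protected ring stands in for a shared-memory
     * queue; all names are invented for the example. */
    #include <pthread.h>
    #include <stdbool.h>

    #define QUEUE_SIZE 1024

    typedef struct TupleQueue
    {
        pthread_mutex_t lock;
        pthread_cond_t  not_empty;
        void           *items[QUEUE_SIZE];
        int             head;
        int             tail;
        bool            done;       /* producer has no more tuples */
    } TupleQueue;

    /* Called by the single producer (the user backend in the example). */
    void
    queue_push(TupleQueue *q, void *tuple)
    {
        pthread_mutex_lock(&q->lock);
        /* A real queue would block or spill to disk when full; omitted here. */
        q->items[q->tail] = tuple;
        q->tail = (q->tail + 1) % QUEUE_SIZE;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    /* Producer calls this once all tuples have been pushed. */
    void
    queue_finish(TupleQueue *q)
    {
        pthread_mutex_lock(&q->lock);
        q->done = true;
        pthread_cond_broadcast(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    /* Called by each consumer (worker backend); returns NULL when drained. */
    void *
    queue_pop(TupleQueue *q)
    {
        void *tuple = NULL;

        pthread_mutex_lock(&q->lock);
        while (q->head == q->tail && !q->done)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->head != q->tail)
        {
            tuple = q->items[q->head];
            q->head = (q->head + 1) % QUEUE_SIZE;
        }
        pthread_mutex_unlock(&q->lock);
        return tuple;
    }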

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


