Re: rethinking dense_alloc (HashJoin) as a memory context

From	Peter Geoghegan
Subject	Re: rethinking dense_alloc (HashJoin) as a memory context
Date	July 18, 2016, 21:33:20
Msg-id	CAM3SWZSX=pHU+L1pVxVFC_FYk+uKCDBNvha1d8dGhtbb72CAvQ@mail.gmail.com
In response to	Re: rethinking dense_alloc (HashJoin) as a memory context (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Mon, Jul 18, 2016 at 7:56 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> The test case I used previously was an external sort, which does lots
> of retail pfrees.  Now that we've mostly abandoned replacement
> selection, there will be many fewer pfrees while building runs, I
> think, but still quite a few while merging runs.

Surely you mean that in 9.6 there are just as many palloc() + pfree()
calls as before when building runs, but many fewer when merging
(provided you limit your consideration to final on-the-fly merges,
which are the large majority of merges in practice)? Nothing changed
about how tuplesort caller tuples are originally formed in 9.6, so
work remains even there.

I think we should be using batch memory for caller tuples (e.g.,
MinimalTuples) past the first run as an optimization, but that isn't
something that I plan to do soon. Separately, I've already written a
patch to make final merges that are not on-the-fly (i.e., the final
merge of a randomAccess caller, where a single materialized output to
one tape is formed) use batch memory, mostly to suit parallel sort
workers. Parallel sort could increase the prevalence of non-on-the-fly
merges by quite a bit, so that is on my agenda for the next release.

> Now it might be the
> case that if the allocating is fast enough and we save a bunch of
> memory, spending a few additional cycles freeing things is no big
> deal.  It might also be the case that this is problematic in a few
> cases but that we can eliminate those cases.  It's likely to take some
> work, though.

Perhaps I simply lack imagination here, but I still suspect that
ad-hoc approaches will tend to work best, because most of the benefit
can be derived from specialized, precise memory management (what I've
called batch memory) for just a few modules, and what remains isn't
several broad swathes that can be delineated easily. I can see a
"palloc a lot and don't worry too much about pfrees" allocator having
some value, but I suspect that that isn't going to move the needle in
the same way.
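
For illustration only, a "palloc a lot and don't worry too much about
pfrees" allocator could look roughly like the sketch below. It is a
plain C bump allocator, not PostgreSQL's MemoryContext API and not the
actual dense_alloc code; every name in it (Arena, arena_alloc,
arena_free, arena_reset, ARENA_BLOCK_SIZE) is hypothetical, and the
8-byte alignment and 32kB block size are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names and constants here are hypothetical, chosen for the sketch. */
#define ARENA_ALIGN       8             /* assumed maximum alignment */
#define ARENA_BLOCK_SIZE  (32 * 1024)   /* assumed default block payload */
#define ALIGN_UP(n) \
    (((size_t) (n) + ARENA_ALIGN - 1) & ~((size_t) ARENA_ALIGN - 1))

typedef struct ArenaBlock
{
    struct ArenaBlock *next;    /* older block, or NULL */
    size_t      used;           /* payload bytes handed out so far */
    size_t      capacity;       /* payload bytes available in this block */
} ArenaBlock;

typedef struct Arena
{
    ArenaBlock *head;           /* block currently being filled */
    size_t      nblocks;        /* number of malloc'd blocks */
} Arena;

#define BLOCK_HDR  ALIGN_UP(sizeof(ArenaBlock))

/*
 * Hand out 'size' bytes by bumping an offset.  There is no per-chunk
 * header, which is where the memory saving comes from.  Space left over
 * in the previous block is simply wasted when a new block is started.
 */
void *
arena_alloc(Arena *arena, size_t size)
{
    ArenaBlock *block;
    void       *ptr;

    assert(arena != NULL);
    size = ALIGN_UP(size);
    block = arena->head;
    if (block == NULL || block->capacity - block->used < size)
    {
        size_t      capacity = size > ARENA_BLOCK_SIZE ? size : ARENA_BLOCK_SIZE;

        block = malloc(BLOCK_HDR + capacity);
        if (block == NULL)
            return NULL;
        block->next = arena->head;
        block->used = 0;
        block->capacity = capacity;
        arena->head = block;
        arena->nblocks++;
    }
    ptr = (char *) block + BLOCK_HDR + block->used;
    block->used += size;
    return ptr;
}

/* "Don't worry about pfrees": freeing a single chunk is a no-op. */
void
arena_free(Arena *arena, void *ptr)
{
    (void) arena;
    (void) ptr;
}

/* Memory is only reclaimed wholesale, by releasing every block. */
void
arena_reset(Arena *arena)
{
    while (arena->head != NULL)
    {
        ArenaBlock *next = arena->head->next;

        free(arena->head);
        arena->head = next;
    }
    arena->nblocks = 0;
}
```

The trade-off is visible in arena_free(): allocation is cheap and
chunks carry no header, but a caller that does many retail frees (such
as the external sort case quoted above) gets no memory back until
arena_reset().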

-- 
Peter Geoghegan
