Re: [HACKERS] Poor memory context performance in large hash joins

From: Jeff Janes
Subject: Re: [HACKERS] Poor memory context performance in large hash joins
Date:
Message-ID: CAMkU=1xdSg0g8JuwUq=zro=PG508_0qKaKFLQbEcGciT4hAYLw@mail.gmail.com
In reply to: Re: [HACKERS] Poor memory context performance in large hash joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [HACKERS] Poor memory context performance in large hash joins  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Fri, Feb 24, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Thu, Feb 23, 2017 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Maybe it's time to convert that to a doubly-linked list.

> I don't think that would help.  You would need some heuristic to guess
> whether the chunk you are looking for is near the front, or near the end.

Uh, what?  In a doubly-linked list, you can remove an element in O(1)
time, you don't need any searching.  It basically becomes
      item->prev->next = item->next;
      item->next->prev = item->prev;
modulo possible special cases for the head and tail elements.
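
For concreteness, a minimal sketch of what that O(1) unlink could look like in aset.c, assuming the block header grew a hypothetical "prev" field (AllocBlockData today has only "next"; set->blocks is the head of the context's block list):

      /* unlink 'block' in O(1); 'prev' is a hypothetical new field */
      if (block->prev)
          block->prev->next = block->next;
      else
          set->blocks = block->next;       /* block was the list head */
      if (block->next)
          block->next->prev = block->prev;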

Currently it walks the chain to identify which block holds the chunk in the first place, not just to get the pointer to the previous block.  But I see that this could be fixed by pointer arithmetic once there is a reason to fix it; there is no such reason as long as you need to walk the chain to get the prev pointer anyway.  Like this?

      targetblock = (AllocBlock) (((char *) chunk) - ALLOC_BLOCKHDRSZ);
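
(Illustrative sketch only: combining that pointer arithmetic with the hypothetical prev field from above would give AllocSetFree a walk-free path for an oversized chunk that has its own block; ALLOC_BLOCKHDRSZ is the existing aset.c constant.)

      /* find the block directly from the chunk, no list walk needed */
      AllocBlock block = (AllocBlock) (((char *) chunk) - ALLOC_BLOCKHDRSZ);

      if (block->prev)                     /* hypothetical field */
          block->prev->next = block->next;
      else
          set->blocks = block->next;
      if (block->next)
          block->next->prev = block->prev;
      free(block);                         /* as aset.c does for oversized blocks */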
 
>> Although if the
>> hash code is producing a whole lot of requests that are only a bit bigger
>> than the separate-block threshold, I'd say It's Doing It Wrong.  It should
>> learn to aggregate them into larger requests.

> Right now it is using compiled-in 32KB chunks.  Should it use something
> like max(32KB, work_mem/128) instead?

I'd say it should double the size of the request each time.  That's what
we do in most places.

I thought the design goal there was that the space in old chunks could be re-used for new chunks in a reasonably fine-grained way.  If the last chunk contains just over half of all the data, that couldn't happen.
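
To spell out the arithmetic behind that concern: with pure doubling the chunk sizes form a geometric series, and the newest chunk always exceeds all earlier ones combined:

      sizes so far:  s, 2s, 4s, ..., 2^n * s
      earlier sum:   (2^n - 1) * s
      last chunk:    2^n * s  >  (2^n - 1) * s

so the last chunk holds just over half of everything allocated, leaving little room for fine-grained re-use of freed space from older chunks.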

Cheers,

Jeff
