Re: bad estimation together with large work_mem generates terrible slow hash joins

From: Heikki Linnakangas
Subject: Re: bad estimation together with large work_mem generates terrible slow hash joins
Date:
Msg-id: 5410979C.6000509@vmware.com
In reply to: Re: bad estimation together with large work_mem generates terrible slow hash joins  (Tomas Vondra <tv@fuzzy.cz>)
Responses: Re: bad estimation together with large work_mem generates terrible slow hash joins  (Robert Haas <robertmhaas@gmail.com>)
           Re: bad estimation together with large work_mem generates terrible slow hash joins  (Tomas Vondra <tv@fuzzy.cz>)
           Re: bad estimation together with large work_mem generates terrible slow hash joins  (Tomas Vondra <tv@fuzzy.cz>)
List: pgsql-hackers
On 09/10/2014 01:49 AM, Tomas Vondra wrote:
> On 9.9.2014 16:09, Robert Haas wrote:
>> On Mon, Sep 8, 2014 at 5:53 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>> So I only posted the separate patch for those who want to do a review,
>>> and then a "complete patch" with both parts combined. But it sure may be
>>> a bit confusing.
>>
>> Let's do this: post each new version of the patches only on this
>> thread, as a series of patches that can be applied one after another.
>
> OK, attached. Apply in this order
>
> 1) dense-alloc-v5.patch
> 2) hashjoin-nbuckets-v12.patch

The dense-alloc-v5.patch looks good to me. I have committed that with 
minor cleanup (more comments below). I have not looked at the second patch.

> I also did a few 'minor' changes to the dense allocation patch, most
> notably:
>
> * renamed HashChunk/HashChunkData to MemoryChunk/MemoryChunkData
>    The original naming seemed a bit awkward.

That's too easy to confuse with regular MemoryContext and AllocChunk 
stuff. I renamed it to HashMemoryChunk.
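
For the archives, the renamed chunk header ends up looking roughly like this 
(just a sketch; the definition committed to src/include/executor/hashjoin.h is 
the authority on the exact field layout):

typedef struct HashMemoryChunkData
{
    int         ntuples;    /* number of tuples stored in this chunk */
    size_t      maxlen;     /* size of the buffer holding the tuples */
    size_t      used;       /* number of buffer bytes already used */

    /* pointer to the next chunk (linked list of chunks) */
    struct HashMemoryChunkData *next;

    /* buffer allocated at the end of the struct */
    char        data[1];
} HashMemoryChunkData;

typedef struct HashMemoryChunkData *HashMemoryChunk;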

> * the chunk size is 32kB (instead of 16kB), and we're using a 1/4
>    threshold for 'oversized' items
>
>    We need the threshold to be >=8kB, to trigger the special case
>    within AllocSet. The 1/4 rule is consistent with ALLOC_CHUNK_FRACTION.

Should we care about the fact that if there are only a few tuples, we 
will nevertheless waste 32kB of memory for the chunk? I guess not, but I 
thought I'd mention it. The smallest allowed value for work_mem is 64kB.
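
To put those numbers in context, the allocation path with the committed 
constants goes roughly like this. This is only a simplified sketch of 
dense_alloc() in nodeHash.c, assuming the usual PostgreSQL declarations 
(postgres.h, executor/hashjoin.h); the committed code is the authority on 
the details:

#define HASH_CHUNK_SIZE       (32 * 1024L)           /* 32kB per chunk */
#define HASH_CHUNK_THRESHOLD  (HASH_CHUNK_SIZE / 4)  /* 8kB "oversized" cutoff */

/*
 * Simplified sketch of the dense allocator: hand out tuple-sized slices of
 * the current 32kB chunk, and give "oversized" tuples a chunk of their own.
 */
static void *
dense_alloc(HashJoinTable hashtable, Size size)
{
    HashMemoryChunk newChunk;
    char       *ptr;

    /* keep everything maxaligned, as palloc would */
    size = MAXALIGN(size);

    if (size > HASH_CHUNK_THRESHOLD)
    {
        /*
         * Oversized tuple: allocate a dedicated chunk so it does not eat
         * into the shared 32kB chunks.  Keeping the threshold at 8kB or
         * more means this request goes through AllocSet's separate
         * large-chunk path instead of the power-of-two freelists.
         */
        newChunk = (HashMemoryChunk)
            MemoryContextAlloc(hashtable->batchCxt,
                               offsetof(HashMemoryChunkData, data) + size);
        newChunk->maxlen = size;
        newChunk->used = size;
        newChunk->ntuples = 1;

        /* keep the current (partly filled) chunk at the head of the list */
        if (hashtable->chunks != NULL)
        {
            newChunk->next = hashtable->chunks->next;
            hashtable->chunks->next = newChunk;
        }
        else
        {
            newChunk->next = NULL;
            hashtable->chunks = newChunk;
        }
        return newChunk->data;
    }

    /* start a new 32kB chunk if there is none, or the current one is full */
    if (hashtable->chunks == NULL ||
        hashtable->chunks->maxlen - hashtable->chunks->used < size)
    {
        newChunk = (HashMemoryChunk)
            MemoryContextAlloc(hashtable->batchCxt,
                               offsetof(HashMemoryChunkData, data) + HASH_CHUNK_SIZE);
        newChunk->maxlen = HASH_CHUNK_SIZE;
        newChunk->used = 0;
        newChunk->ntuples = 0;
        newChunk->next = hashtable->chunks;
        hashtable->chunks = newChunk;
    }

    /* hand out the next slice of the current chunk */
    ptr = hashtable->chunks->data + hashtable->chunks->used;
    hashtable->chunks->used += size;
    hashtable->chunks->ntuples += 1;

    return ptr;
}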

> I also considered Heikki's suggestion that while rebatching, we can
> reuse chunks with a single large tuple, instead of copying it
> needlessly. My attempt however made the code much uglier (I almost used
> a GOTO, but my hands refused to type it ...). I'll look into that.

I tried constructing a test case where the rehashing time would be 
significant enough for this to matter, but I couldn't find one. Even if 
the original estimate on the number of batches is way too small, we ramp 
up quickly to a reasonable size and the rehashing time is completely 
insignificant compared to all the other work. So I think we can drop 
that idea.

- Heikki



