Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

From	Tomas Vondra
Subject	Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
Date	14 January 2015, 20:03:26
Message-ID	54B6CB73.8020906@2ndquadrant.com
In reply to	Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
List	pgsql-hackers

On 11.12.2014 23:46, Tomas Vondra wrote:
> On 11.12.2014 22:16, Robert Haas wrote:
>> On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>>> The idea was that if we could increase the load a bit (e.g. using 2
>>> tuples per bucket instead of 1), we will still use a single batch in
>>> some cases (when we miss the work_mem threshold by just a bit). The
>>> lookups will be slower, but we'll save the I/O.
>>
>> Yeah.  That seems like a valid theory, but your test results so far
>> seem to indicate that it's not working out like that - which I find
>> quite surprising, but, I mean, it is what it is, right?
> 
> Not exactly. My tests show that as long as the outer table batches fit
> into page cache, increasing the load factor results in worse performance
> than batching.
> 
> When the outer table is "sufficiently small", the batching is faster.
> 
> Regarding the "sufficiently small" - considering today's hardware, we're
> probably talking about gigabytes. On machines with significant memory
> pressure (forcing the temporary files to disk), it might be much lower.
> It also depends on kernel settings (e.g.
> dirty_bytes/dirty_background_bytes).
> 
> If we could identify those cases (at least the "temp files > RAM") then
> maybe we could do this. Otherwise we're going to penalize all the other
> queries ...
> 
> Maybe the best solution for now is "increase the work_mem a bit"
> recommendation.
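
To make the "miss the work_mem threshold by just a bit" case quoted above concrete, here is a rough back-of-the-envelope model. It is only a sketch, not the PostgreSQL code: the function names, the 60-byte tuple size, the 8-byte bucket pointer, the power-of-two bucket count and the 64 MiB limit are all assumptions chosen for illustration.

```python
def next_pow2(n: int) -> int:
    """Smallest power of two >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def hash_table_bytes(ntuples: int, tuple_bytes: int,
                     ntup_per_bucket: int, pointer_bytes: int = 8) -> int:
    """Simplified memory model: tuple storage plus one pointer per bucket.

    Assumption: the bucket count is rounded up to a power of two.
    """
    wanted_buckets = -(-ntuples // ntup_per_bucket)  # ceiling division
    nbuckets = next_pow2(max(1, wanted_buckets))
    return ntuples * tuple_bytes + nbuckets * pointer_bytes


def fits_in_single_batch(ntuples: int, tuple_bytes: int,
                         ntup_per_bucket: int, work_mem_bytes: int) -> bool:
    """True if the whole hash table fits under the memory limit."""
    return hash_table_bytes(ntuples, tuple_bytes, ntup_per_bucket) <= work_mem_bytes


if __name__ == "__main__":
    work_mem = 64 * 1024 * 1024          # hypothetical 64 MiB limit
    for load in (1, 2):
        size = hash_table_bytes(1_000_000, 60, load)
        ok = fits_in_single_batch(1_000_000, 60, load, work_mem)
        print(f"NTUP_PER_BUCKET={load}: {size} bytes, single batch: {ok}")
```

In this made-up case the table misses the limit with one tuple per bucket and fits with two, because the bucket array halves. The model says nothing about lookup cost or about how cheap batching is when the temporary files stay in page cache, which is exactly what the measurements were about.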

I think it's time to mark this patch as rejected (or maybe returned with
feedback). The patch was meant as an attempt to implement Robert's idea
from the hashjoin patch, but apparently we have no clear idea how to do
it without hurting performance for many existing users.

Maybe we can try again later, but there's no point in keeping this in the
current CF.

Any objections?

regards
Tomas
