Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller

From Tom Lane
Subject Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller
Msg-id 29607.1402410454@sss.pgh.pa.us
In reply to Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 9, 2014 at 11:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm quite prepared to believe that we should change NTUP_PER_BUCKET ...
>> but appealing to standard advice isn't a good basis for arguing that.
>> Actual performance measurements (in both batched and unbatched cases)
>> would be a suitable basis for proposing a change.

> Well, it's all in what scenario you test, right?  If you test the case
> where something overflows work_mem as a result of the increased size
> of the bucket array, it's always going to suck.  And if you test the
> case where that doesn't happen, it's likely to win.  I think Stephen
> Frost has already done quite a bit of testing in this area, on
> previous threads.  But there's no one-size-fits-all solution.

I don't really recall any hard numbers being provided.  I think if we
looked at some results that said "here's the average gain, and here's
the worst-case loss, and here's an estimate of how often you'd hit
the worst case", then we could make a decision.

However, I notice that it's already the case that we make a
to-batch-or-not-to-batch decision on the strength of some very crude
numbers during ExecChooseHashTableSize, and we explicitly don't consider
palloc overhead there.  It would certainly be easy enough to use two
different NTUP_PER_BUCKET target load factors depending on which path
is being taken in ExecChooseHashTableSize.  So maybe part of the answer is
to not require those numbers to be the same.
        regards, tom lane
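
The idea of using two different load factors depending on which path is taken could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual ExecChooseHashTableSize code: the names NTUP_PER_BUCKET_UNBATCHED, NTUP_PER_BUCKET_BATCHED, HashSizeChoice and choose_hash_table_size are hypothetical, the value 1 for the unbatched case is an arbitrary placeholder, and the memory estimate is deliberately crude (it ignores palloc overhead, as the message above notes the real code does too).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical load factors for illustration; not PostgreSQL's constants. */
#define NTUP_PER_BUCKET_UNBATCHED 1   /* dense buckets when everything fits */
#define NTUP_PER_BUCKET_BATCHED   10  /* sparse buckets when batching */

typedef struct HashSizeChoice
{
    uint64_t nbuckets;  /* number of hash buckets per batch */
    uint64_t nbatch;    /* number of batches (1 means unbatched) */
} HashSizeChoice;

/* Integer division rounding up; b must be nonzero. */
static uint64_t
ceil_div(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/* Smallest power of two that is >= n (returns 1 for n <= 1). */
static uint64_t
next_pow2(uint64_t n)
{
    uint64_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Crude sizing sketch: try the dense load factor first and keep it only if
 * tuples plus the bucket array fit in work_mem_bytes in a single batch.
 * Otherwise fall back to the sparse load factor, so that a larger bucket
 * array is never the reason a join gets pushed into batching.
 */
static HashSizeChoice
choose_hash_table_size(uint64_t ntuples, uint64_t tuple_bytes,
                       uint64_t work_mem_bytes)
{
    HashSizeChoice c;
    uint64_t inner_bytes = ntuples * tuple_bytes;
    uint64_t dense_buckets = ceil_div(ntuples, NTUP_PER_BUCKET_UNBATCHED);
    uint64_t tuples_per_batch;

    if (work_mem_bytes == 0)
        work_mem_bytes = 1;     /* avoid division by zero in this sketch */
    if (dense_buckets == 0)
        dense_buckets = 1;

    /* Unbatched path: the dense bucket array is affordable. */
    if (inner_bytes + dense_buckets * sizeof(void *) <= work_mem_bytes)
    {
        c.nbatch = 1;
        c.nbuckets = dense_buckets;
        return c;
    }

    /* Other path: size batches from tuple bytes alone, use sparse buckets. */
    c.nbatch = next_pow2(ceil_div(inner_bytes, work_mem_bytes));
    tuples_per_batch = ceil_div(ntuples, c.nbatch);
    c.nbuckets = ceil_div(tuples_per_batch, NTUP_PER_BUCKET_BATCHED);
    if (c.nbuckets == 0)
        c.nbuckets = 1;
    return c;
}
```

For example, 1000 tuples of 100 bytes with 1 MB of memory stay in one batch with 1000 buckets, while 1,000,000 such tuples with 4 MB are split into 32 batches of 3125 buckets each. Whether any such split is a net win would still have to be settled by the kind of measurements asked for above.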


