Re: Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller

From: Robert Haas
Subject: Re: Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller
Date:
Msg-id: CA+TgmoYYXP34wwA7sQWLeTCaTfB5+NBAbZ6QoSpAu7W+46Xqfw@mail.gmail.com
In reply to: Re: Re: why postgresql define NTUP_PER_BUCKET as 10, not other numbers smaller  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
On Tue, Jun 10, 2014 at 1:43 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Jun 10, 2014 at 5:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> The problem case is when you have 1 batch and the increased memory
>> consumption causes you to switch to 2 batches.  That's expensive.  It
>> seems clear based on previous testing that *on the average*
>> NTUP_PER_BUCKET = 1 will be better, but in the case where it causes an
>> increase in the number of batches it will be much worse - particularly
>> because the only way we ever increase the number of batches is to
>> double it, which is almost always going to be a huge loss.
>
> Is there a reason we don't do hybrid hashing, where if 80% fits in memory
> then we write out only the 20% that doesn't? And then when probing the table
> with the other input, the 80% that land in in-memory buckets get handled
> immediately, and only the 20% that land in the on-disk buckets get written
> for the next step?

We have an optimization that is a little bit like that.  The "skew"
hash join stuff tries to (essentially) ensure that the MCVs are in the
first batch.
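
To make the idea concrete, the probe side of the scheme Jeff describes
could be sketched roughly like this (every name below is invented for
illustration; this is not how nodeHashjoin.c is actually written):

/*
 * Toy sketch of the probe side of a hybrid hash join.  Probe tuples
 * that hash to a bucket still resident in memory are joined
 * immediately; probe tuples that hash to a spilled bucket are written
 * out for a later pass.
 */
#include <stdio.h>

#define NBUCKETS 256

static int bucket_is_resident[NBUCKETS];    /* 1 = bucket still in memory */

/* stand-in for probing the in-memory bucket and emitting join results */
static void
probe_resident_bucket(int bucket, int key)
{
    printf("key %d: probed bucket %d in memory\n", key, bucket);
}

/* stand-in for appending the probe tuple to that bucket's temp file */
static void
defer_probe_tuple(int bucket, int key)
{
    printf("key %d: bucket %d is on disk, deferring\n", key, bucket);
}

static void
probe(int key)
{
    int         bucket = key % NBUCKETS;    /* placeholder hash function */

    if (bucket_is_resident[bucket])
        probe_resident_bucket(bucket, key); /* the ~80% case: no extra I/O */
    else
        defer_probe_tuple(bucket, key);     /* the ~20% case: handled later */
}

int
main(void)
{
    int         b,
                key;

    /* pretend roughly 80% of the buckets fit within work_mem */
    for (b = 0; b < NBUCKETS; b++)
        bucket_is_resident[b] = (b % 5 != 0);

    for (key = 0; key < 10; key++)
        probe(key);

    return 0;
}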

But more could probably be done.  For example, suppose we have 256
buckets.  If the hash table overflows work_mem, we could write the
contents of *one bucket* out to disk, rather than (as we currently do)
half of the table.  If we overflow again, we write another bucket.
When the number of buckets written reaches half the total, we split
all of the remaining buckets so that all 256 slots are once again
active.  Repeat as needed.
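
The bookkeeping for that might look, very roughly, like the sketch
below.  All of the names are invented and the actual tuple movement is
waved away, so treat it as an illustration of the idea rather than
anything resembling nodeHash.c:

/*
 * Rough sketch of one-bucket-at-a-time eviction.  When the table
 * exceeds work_mem we push a single bucket out to disk instead of
 * doubling the number of batches; once half the buckets live on disk
 * we would split the resident ones so all NBUCKETS slots are active
 * again, and repeat as needed.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

#define NBUCKETS 256

typedef struct
{
    size_t      bytes;          /* memory used by this bucket's tuples */
    int         spilled;        /* has this bucket been written to disk? */
} Bucket;

static Bucket buckets[NBUCKETS];
static size_t total_bytes = 0;
static size_t work_mem_bytes = 4 * 1024 * 1024;     /* pretend work_mem = 4MB */
static int    nspilled = 0;

/* stand-in: write one bucket's tuples to a temp file and free them */
static void
spill_bucket(int b)
{
    printf("spilling bucket %d (%zu bytes)\n", b, buckets[b].bytes);
    total_bytes -= buckets[b].bytes;
    buckets[b].bytes = 0;
    buckets[b].spilled = 1;
    nspilled++;
}

/* pick a victim: here, simply the largest still-resident bucket */
static int
choose_victim(void)
{
    int         best = -1;
    int         b;

    for (b = 0; b < NBUCKETS; b++)
        if (!buckets[b].spilled &&
            (best < 0 || buckets[b].bytes > buckets[best].bytes))
            best = b;
    return best;
}

/* called after inserting a tuple of 'size' bytes into bucket 'b' */
static void
note_insertion(int b, size_t size)
{
    buckets[b].bytes += size;
    total_bytes += size;

    /* overflow: evict one bucket at a time, not half the table */
    while (total_bytes > work_mem_bytes)
    {
        int         victim = choose_victim();

        if (victim < 0)
            break;              /* everything is already on disk */
        spill_bucket(victim);
    }

    if (nspilled >= NBUCKETS / 2)
    {
        int         j;

        /*
         * Split the resident buckets so all NBUCKETS slots are active
         * again; the redistribution of tuples is omitted here, we just
         * reset the bookkeeping.
         */
        printf("half the buckets are on disk: splitting the rest\n");
        for (j = 0; j < NBUCKETS; j++)
            buckets[j].spilled = 0;
        nspilled = 0;
    }
}

int
main(void)
{
    int         i;

    /* simulate loading 100,000 tuples of ~100 bytes each */
    for (i = 0; i < 100000; i++)
        note_insertion(rand() % NBUCKETS, 100);

    return 0;
}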

If something like that worked out, it would drastically reduce the
penalty for slightly overrunning work_mem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


