Re: 9.5: Memory-bounded HashAgg

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: 9.5: Memory-bounded HashAgg
Дата
Msg-id 1408607390.2335.247.camel@jeff-desktop
обсуждение исходный текст
Ответ на Re: 9.5: Memory-bounded HashAgg  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: 9.5: Memory-bounded HashAgg
Список pgsql-hackers
On Wed, 2014-08-20 at 14:32 -0400, Robert Haas wrote:
> Well, I think you're certainly right that a hash table lookup is more
> expensive than modulo on a 32-bit integer; so much is obvious.  But if
> the load factor is not too large, I think that it's not a *lot* more
> expensive, so it could be worth it if it gives us other advantages.
> As I see it, the advantage of Jeff's approach is that it doesn't
> really matter whether our estimates are accurate or not.  We don't
> have to decide at the beginning how many batches to do, and then
> possibly end up using too much or too little memory per batch if we're
> wrong; we can let the amount of memory actually used during execution
> determine the number of batches.  That seems good.  Of course, a hash
> join can increase the number of batches on the fly, but only by
> doubling it, so you might go from 4 batches to 8 when 5 would really
> have been enough.  And a hash join also can't *reduce* the number of
> batches on the fly, which might matter a lot.  Getting the number of
> batches right avoids I/O, which is a lot more expensive than CPU.

My approach uses partition counts that are powers-of-two also, so I
don't think that's a big differentiator. In principle my algorithm could
adapt to other partition counts, but I'm not sure how big of an
advantage there is.

I think the big benefit of my approach is that it doesn't needlessly
evict groups from the hashtable. Consider input like 0, 1, 0, 2, ..., 0,
N. For large N, if you evict group 0, you have to write out about N
tuples; but if you leave it in, you only have to write out about N/2
tuples. The hashjoin approach doesn't give you any control over
eviction, so you only have about 1/P chance of saving the skew group
(where P is the ultimate number of partitions). With my approach, we'd
always keep the skew group in memory (unless we're very unlucky, and the
hash table fills up before we even see the skew value).

Regards,Jeff Davis





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: [TODO] Process pg_hba.conf keywords as case-insensitive
Следующее
От: Julien Rouhaud
Дата:
Сообщение: [TODO] Track number of files ready to be archived in pg_stat_archiver