Re: 9.5: Memory-bounded HashAgg
| От | Jeff Davis | 
|---|---|
| Тема | Re: 9.5: Memory-bounded HashAgg | 
| Дата | |
| Msg-id | 1408607390.2335.247.camel@jeff-desktop обсуждение исходный текст | 
| Ответ на | Re: 9.5: Memory-bounded HashAgg (Robert Haas <robertmhaas@gmail.com>) | 
| Ответы | Re: 9.5: Memory-bounded HashAgg | 
| Список | pgsql-hackers | 
On Wed, 2014-08-20 at 14:32 -0400, Robert Haas wrote: > Well, I think you're certainly right that a hash table lookup is more > expensive than modulo on a 32-bit integer; so much is obvious. But if > the load factor is not too large, I think that it's not a *lot* more > expensive, so it could be worth it if it gives us other advantages. > As I see it, the advantage of Jeff's approach is that it doesn't > really matter whether our estimates are accurate or not. We don't > have to decide at the beginning how many batches to do, and then > possibly end up using too much or too little memory per batch if we're > wrong; we can let the amount of memory actually used during execution > determine the number of batches. That seems good. Of course, a hash > join can increase the number of batches on the fly, but only by > doubling it, so you might go from 4 batches to 8 when 5 would really > have been enough. And a hash join also can't *reduce* the number of > batches on the fly, which might matter a lot. Getting the number of > batches right avoids I/O, which is a lot more expensive than CPU. My approach uses partition counts that are powers-of-two also, so I don't think that's a big differentiator. In principle my algorithm could adapt to other partition counts, but I'm not sure how big of an advantage there is. I think the big benefit of my approach is that it doesn't needlessly evict groups from the hashtable. Consider input like 0, 1, 0, 2, ..., 0, N. For large N, if you evict group 0, you have to write out about N tuples; but if you leave it in, you only have to write out about N/2 tuples. The hashjoin approach doesn't give you any control over eviction, so you only have about 1/P chance of saving the skew group (where P is the ultimate number of partitions). With my approach, we'd always keep the skew group in memory (unless we're very unlucky, and the hash table fills up before we even see the skew value). Regards,Jeff Davis
В списке pgsql-hackers по дате отправления: