Re: A better way than tweaking NTUP_PER_BUCKET

From: Stephen Frost
Subject: Re: A better way than tweaking NTUP_PER_BUCKET
Date:
Msg-id: CAOuzzgqwT3jwjSE8=npDQ5+why2KDMpDZKND4GsG+vjfgrCCHg@mail.gmail.com
In reply to: Re: A better way than tweaking NTUP_PER_BUCKET  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers
On Saturday, June 22, 2013, Heikki Linnakangas wrote:
> On 22.06.2013 19:19, Simon Riggs wrote:
>> So I think that (2) is the best route: Given that we know with much
>> better certainty the number of rows in the scanned-relation, we should
>> be able to examine our hash table after it has been built and decide
>> whether it would be cheaper to rebuild the hash table with the right
>> number of buckets, or continue processing with what we have now. Which
>> is roughly what Heikki proposed already, in January.
>
> Back in January, I wrote a quick patch to experiment with rehashing when the hash table becomes too full. It was too late to make it into 9.3 so I didn't pursue it further back then, but IIRC it worked. If we have the capability to rehash, the accuracy of the initial guess becomes much less important.
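
For readers following along, the idea of rehashing once the table gets too full can be sketched as below. This is a toy Python illustration, not the patch mentioned above: the class name, the doubling policy, and the small `ntup_per_bucket` default are all assumptions chosen to keep the example short.

```python
class ChainedHashTable:
    """Toy chained hash table that doubles its bucket count when too full.

    Hypothetical illustration only; names and policy are assumptions.
    """

    def __init__(self, nbuckets=4, ntup_per_bucket=2):
        self.nbuckets = nbuckets
        self.ntup_per_bucket = ntup_per_bucket  # target average chain length
        self.buckets = [[] for _ in range(nbuckets)]
        self.ntuples = 0
        self.rehashes = 0

    def insert(self, key, value):
        self.buckets[hash(key) % self.nbuckets].append((key, value))
        self.ntuples += 1
        # "Too full" here means the average chain length exceeds the target.
        if self.ntuples > self.nbuckets * self.ntup_per_bucket:
            self._rehash(self.nbuckets * 2)

    def _rehash(self, new_nbuckets):
        # Redistribute every stored tuple into a larger bucket array.
        new_buckets = [[] for _ in range(new_nbuckets)]
        for chain in self.buckets:
            for key, value in chain:
                new_buckets[hash(key) % new_nbuckets].append((key, value))
        self.buckets = new_buckets
        self.nbuckets = new_nbuckets
        self.rehashes += 1

    def lookup(self, key):
        chain = self.buckets[hash(key) % self.nbuckets]
        return [v for k, v in chain if k == key]


table = ChainedHashTable()
for i in range(100):
    table.insert(i, i * 10)
```

With a poor initial guess of 4 buckets, the table still ends up reasonably sized, at the cost of moving every tuple on each rehash.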

What we're hashing isn't going to change mid-way through or be updated after we've started doing lookups against it. 

Why not simply scan and queue the data, and then build the hash table right the first time? Also, this patch doesn't appear to address duplicates, and therefore would rehash unnecessarily: there's no point rehashing into more buckets if the buckets are only deep because of lots of duplicate keys. Figuring out how many distinct values there are, in order to build the best hash table, is actually pretty expensive compared to how quickly we can build the table today. Lastly, this still encourages collisions due to too few buckets. If we simply started with more buckets outright, we'd reduce the need to rehash.
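
To make the duplicates point concrete, here is a rough Python sketch (not PostgreSQL code; the helper names `size_for`, `build`, and `longest_chain` are made up for illustration). It queues the tuples first, sizes the table from the number of distinct keys, and shows that adding buckets does nothing for chains that are long only because of duplicates. Note that the distinct count is exactly the step described above as expensive.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (and >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def size_for(tuples, ntup_per_bucket=1):
    """Pick a bucket count from the number of DISTINCT keys.

    Counting distinct keys needs a full pass plus a set, which is the
    costly part in a real executor.
    """
    ndistinct = len({key for key, _ in tuples})
    wanted = -(-ndistinct // ntup_per_bucket)  # ceiling division
    return next_power_of_two(max(1, wanted))


def build(tuples, nbuckets):
    """Build a chained hash table in one go, with a fixed bucket count."""
    buckets = [[] for _ in range(nbuckets)]
    for key, value in tuples:
        buckets[hash(key) % nbuckets].append((key, value))
    return buckets


def longest_chain(buckets):
    return max(len(chain) for chain in buckets)


# 300 tuples but only 3 distinct keys: chains are deep because of duplicates.
dup_heavy = [(i % 3, i) for i in range(300)]
# 300 tuples with 300 distinct keys.
unique = [(i, i) for i in range(300)]

small = build(dup_heavy, size_for(dup_heavy))
large = build(dup_heavy, 1024)
```

In the duplicate-heavy case the longest chain is the same with 4 buckets as with 1024, so a rehash triggered purely by tuple count would be wasted work there.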

Thanks,

Stephen
