Re: A better way than tweaking NTUP_PER_BUCKET

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: A better way than tweaking NTUP_PER_BUCKET
Дата
Msg-id CAOuzzgq0H2-CcMUy-xZwY0D-6od5JMdH8nvfR=mO_KJJveotKA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: A better way than tweaking NTUP_PER_BUCKET  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
On Sunday, June 23, 2013, Simon Riggs wrote:
On 23 June 2013 03:16, Stephen Frost <sfrost@snowman.net> wrote:

> Will think on it more.

Some other thoughts related to this...

* Why are we building a special kind of hash table? Why don't we just
use the hash table code that we in every other place in the backend.
If that code is so bad why do we use it everywhere else? That is
extensible, so we could try just using that. (Has anyone actually
tried?)

I've not looked at the hash table in the rest of the backend. 
 
* We're not thinking about cache locality and set correspondence
either. If the join is expected to hardly ever match, then we should
be using a bitmap as a bloom filter rather than assuming that a very
large hash table is easily accessible.

That's what I was suggesting earlier, though I don't think it's technically a bloom filter- doesn't that require multiple hash functions?I don't think we want to require every data type to provide multiple hash functions.   
 
* The skew hash table will be hit frequently and would show good L2
cache usage. I think I'll try adding the skew table always to see if
that improves the speed of the hash join.

The skew tables is just for common values though...   To be honest, I have some doubts about that structure really being a terribly good approach for anything which is completely in memory. 

Thanks,

Stephen 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: A better way than tweaking NTUP_PER_BUCKET
Следующее
От: Dean Rasheed
Дата:
Сообщение: Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]