Re: Hash Join cost estimates

From: Tom Lane
Subject: Re: Hash Join cost estimates
Msg-id: 10502.1365107399@sss.pgh.pa.us
In reply to: Re: Hash Join cost estimates  (Stephen Frost <sfrost@snowman.net>)
Responses: Re: Hash Join cost estimates  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers

Stephen Frost <sfrost@snowman.net> writes:
> Looking with opannotate, there are two main hotspots in
> ExecScanHashBucket:

>  12846 17.4001 :        hashTuple = hashtable->buckets[hjstate->hj_CurBucketNo];
> and
>  22100 29.9348 :        hashTuple = hashTuple->next;

Those are, of course, pretty trivial statements; so the way I read this
is that the fundamental slowdown comes from the hash table being large
compared to the CPU's cache, so that you're incurring lots of cache
misses at these particular fetches.  (You might be able to confirm that
if you can set oprofile to count cache misses rather than wall clock
time.)
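
For context, those two statements bracket a linked-list walk; schematically,
the probe loop looks something like this (a simplified paraphrase, not the
exact nodeHash.c source; tuple_matches() is a hypothetical stand-in for the
real join-qual check):

    hashTuple = hashtable->buckets[hjstate->hj_CurBucketNo];    /* fetch #1 */
    while (hashTuple != NULL)
    {
        if (hashTuple->hashvalue == hashvalue &&
            tuple_matches(hashTuple))        /* hypothetical qual check */
            return hashTuple;
        hashTuple = hashTuple->next;         /* fetch #2: dependent load */
    }

Each iteration dereferences a pointer to a tuple at an effectively random
address, so once the table greatly exceeds the cache, both marked loads
stall on main memory.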

> I'm certainly curious about those, but I'm also very interested in the
> possibility of making NTUP_PER_BUCKET much smaller, or perhaps variable
> depending on the work_mem setting.

Not sure about that.  That would make the hash-bucket-header array
larger without changing the size of the rest of the hash table, thus
probably making the CPU cache situation worse not better (which would
manifest as more time at the first of these two statements relative to
the second).
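
For scale, the bucket-header array is just nbuckets pointers, with nbuckets
derived from the estimated build-side row count divided by NTUP_PER_BUCKET
(currently 10) and rounded up to a power of two.  A standalone sketch of
that arithmetic (simplified from ExecChooseHashTableSize, which additionally
clamps things to fit within work_mem):

    #define NTUP_PER_BUCKET 10      /* value in nodeHash.c at present */

    static int
    choose_nbuckets(double ntuples)
    {
        int     target = (int) (ntuples / NTUP_PER_BUCKET);
        int     nbuckets = 1;

        while (nbuckets < target)   /* round up to the next power of 2 */
            nbuckets <<= 1;
        return nbuckets;
    }

Dropping NTUP_PER_BUCKET from 10 to 1 would make that pointer array an order
of magnitude larger, while the tuples it points to occupy the same space as
before.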

Can you add some instrumentation code to get us information about the
average/max length of the bucket chains?  And maybe try to figure out
how many distinct hash values there are per bucket, which would give us
a clue as to whether your two-level-list idea is worth anything.
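
A minimal sketch of what such instrumentation might look like, written
against the structs in src/include/executor/hashjoin.h and meant to run once
the build side is loaded (nothing like this is in the tree; it would only be
for the debugging patch):

    #include "postgres.h"
    #include "executor/hashjoin.h"

    /* Walk every bucket chain in the current batch and log the
     * average/max chain length plus the average number of distinct
     * hash values per chain. */
    static void
    log_bucket_stats(HashJoinTable hashtable)
    {
        int     i;
        int     maxlen = 0;
        long    tottuples = 0;
        long    totdistinct = 0;

        for (i = 0; i < hashtable->nbuckets; i++)
        {
            HashJoinTuple t;
            int           len = 0;

            for (t = hashtable->buckets[i]; t != NULL; t = t->next)
            {
                HashJoinTuple p;
                bool          seen = false;

                len++;
                /* O(len^2) distinct-hashvalue count; chains are short */
                for (p = hashtable->buckets[i]; p != t; p = p->next)
                {
                    if (p->hashvalue == t->hashvalue)
                    {
                        seen = true;
                        break;
                    }
                }
                if (!seen)
                    totdistinct++;
            }
            tottuples += len;
            if (len > maxlen)
                maxlen = len;
        }

        elog(LOG, "hash join: %d buckets, avg chain %.2f, max chain %d, "
             "avg distinct hash values per chain %.2f",
             hashtable->nbuckets,
             tottuples / (double) hashtable->nbuckets,
             maxlen,
             totdistinct / (double) hashtable->nbuckets);
    }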
        regards, tom lane


