Re: Horribly slow hash join

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Horribly slow hash join
Дата
Msg-id 10603.1082218095@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Horribly slow hash join  ("Jim C. Nasby" <jim@nasby.net>)
Ответы Re: Horribly slow hash join  (Greg Stark <gsstark@mit.edu>)
Re: Horribly slow hash join  (Dennis Bjorklund <db@zigo.dhs.org>)
Список pgsql-performance
[ resending because I fat-fingered the cc: to the list ]

I see the problem: all the entries in your work_units column have the
low 32 bits equal to zero.

regression=# select distinct work_units % (2^32)::bigint from Trank_work_overall;
 ?column?
----------
        0
(1 row)

The hash function for int8 only takes the low word into account, so all
of the entries end up on the same hash chain, resulting in worst-case
behavior.  This applies to both your hash join and hash aggregate cases.

We could change the hash function, perhaps, but then we'd just have
different cases where there's a problem ... hashing will always fail on
*some* set of inputs.  (Also, I have been harboring some notions of
supporting cross-type hash joins for integer types, which will not work
unless small int8 values hash the same as int4 etc.)

I guess the real issue is why are you encoding work_units like that?

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jeff
Дата:
Сообщение: Re: [ SOLVED ] select count(*) very slow on an already
Следующее
От: Marcos Martínez(R)
Дата:
Сообщение: Re: Horribly slow hash join