Re: costing of hash join

From: Tom Lane
Subject: Re: costing of hash join
Msg-id: 27396.1388789455@sss.pgh.pa.us
In response to: costing of hash join (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers

Jeff Janes <jeff.janes@gmail.com> writes:
> I'm trying to figure out why hash joins seem to be systematically underused
> in my hands.  In the case I am immediately looking at it prefers a merge
> join with both inputs getting seq scanned and sorted, despite the hash join
> being actually 2 to 3 times faster, where inputs and intermediate working
> sets are all in memory.  I normally wouldn't worry about a factor of 3
> error, but I see this a lot in many different situations.  The row
> estimates are very close to actual, the error is only in the cpu estimates.

Can you produce a test case for other people to look at?

What datatype(s) are the join keys?

> A hash join is charged cpu_tuple_cost for each inner tuple for inserting it
> into the hash table:

Doesn't seem like monkeying with that is going to account for a 3x error.
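
To put the quoted charge in perspective, here is a minimal sketch of how much of a plan's estimated cost that one term could contribute. It models only the per-inner-tuple insertion charge described in the quoted text, not the planner's full hash join costing. The function names, the row count, and the total plan cost are all hypothetical and chosen for illustration; the only value taken from PostgreSQL is the default cpu_tuple_cost of 0.01.

```python
# Sketch only: models just the per-inner-tuple insertion charge described in
# the quoted text, not the planner's full hash join cost.

DEFAULT_CPU_TUPLE_COST = 0.01  # PostgreSQL's default cpu_tuple_cost setting


def hash_insert_charge(inner_rows, cpu_tuple_cost=DEFAULT_CPU_TUPLE_COST):
    """Cost units charged for inserting inner_rows tuples into the hash table."""
    return inner_rows * cpu_tuple_cost


def share_of_total(inner_rows, total_plan_cost,
                   cpu_tuple_cost=DEFAULT_CPU_TUPLE_COST):
    """Fraction of a (hypothetical) total plan cost due to the insertion charge."""
    return hash_insert_charge(inner_rows, cpu_tuple_cost) / total_plan_cost


if __name__ == "__main__":
    inner_rows = 1_000_000        # hypothetical inner relation size
    total_plan_cost = 60_000.0    # hypothetical total estimated cost
    charge = hash_insert_charge(inner_rows)
    print(f"insertion charge: {charge:.1f} cost units")
    print(f"share of total:   {share_of_total(inner_rows, total_plan_cost):.1%}")
```

Removing the charge entirely can lower the estimate by no more than that share, so whether it could explain a 3x misestimate depends on the real plan's numbers, which this thread does not give.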

Have you tried using perf or oprofile or similar to see where the time is
actually, rather than theoretically, going?
        regards, tom lane


