Re: New style of hash join proposal

Поиск
Список
Период
Сортировка
От Gregory Stark
Тема Re: New style of hash join proposal
Дата
Msg-id 87myowaecy.fsf@oxford.xeocode.com
обсуждение исходный текст
Ответ на Re: New style of hash join proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> Nested Loop  (cost=5.39..198.81 rows=51 width=244)
>>> ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
>>>     ->  Seq Scan on int4_tbl b  (cost=0.00..1.05 rows=5 width=4)
>>> ->  Bitmap Heap Scan on tenk1 a  (cost=4.33..39.41 rows=10 width=244)
>>> Recheck Cond: (a.thousand = b.f1)
>>>     ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..4.33 rows=10 width=0)
>>> Index Cond: (a.thousand = b.f1)
>>> (7 rows)
>
>> Sure, but that's still re-executing the bitmap index scan 51 times -- possibly
>> having to fetch the same records off disk repeatedly.

sorry 5 times

> It's not fetching any record repeatedly, because the HashAggregate
> step eliminated duplicate keys on the other side.

Only because it's an IN query. If it was a normal join which hash joins are
perfectly capable of handling then it would be. As it is it could be fetching
the same page repeatedly but not precisely the same tuples. 

It happens that transforming this query to 

explain analyze select * from tenk1 a where thousand = any (array(select f1 from int4_tbl b));

makes it run about 40% faster. That could just be that the hash is small
enough that a linear search is more efficient than calling hashint4 though.
(Perhaps we should be protecting against that in dynahash, actually)

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: KaiGai Kohei
Дата:
Сообщение: Re: [0/4] Proposal of SE-PostgreSQL patches
Следующее
От: Tom Lane
Дата:
Сообщение: Re: count(*) performance improvement ideas