RIGHT/FULL OUTER hash joins (was Re: small table left outer join big table)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема RIGHT/FULL OUTER hash joins (was Re: small table left outer join big table)
Дата
Msg-id 16990.1293723954@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: small table left outer join big table  (Jie Li <jay23jack@gmail.com>)
Ответы Re: RIGHT/FULL OUTER hash joins (was Re: small table left outer join big table)  (Robert Haas <robertmhaas@gmail.com>)
Re: RIGHT/FULL OUTER hash joins (was Re: small table left outer join big table)  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
I had an epiphany about this topic, or actually two of them.

1. Whether or not you think there's a significant performance reason
to support hash right joins, there's a functionality reason.  The
infrastructure for right join could just as easily do full joins.
And AFAICS, a hash full join would only require one hashable join
clause --- the other FULL JOIN ON conditions could be anything at all.
This is unlike the situation for merge join, where all the JOIN ON
conditions have to be mergeable or it doesn't work right.  So we could
greatly reduce the scope of the dreaded "FULL JOIN is only supported
with merge-joinable join conditions" error.  (Well, okay, it's not
*that* dreaded, but people complain about it occasionally.)

2. The obvious way to implement this would involve adding an extra bool
field to struct HashJoinTupleData.  The difficulty with that, and the
reason I'd been resistant to the whole idea, is that it'd eat up a full
word per hashtable entry because of alignment considerations.  (On
64-bit machines it'd be free because of alignment considerations, but
that's cold comfort when 32-bit machines are the ones pressed for
address space.)  But we only need one bit, so what about commandeering
an infomask bit in the tuple itself?  For the initial implementation
I'd be inclined to take one of the free bits in t_infomask2.  We could
actually get away with overlaying the flag bit with one of the tuple
visibility bits, since it will only be used in tuples that are in the
in-memory hash table, which don't need visibility info anymore.  But
that seems like a kluge that could wait until we really need the flag
space.

Comments?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: SLRU API tweak
Следующее
От: Robert Haas
Дата:
Сообщение: Re: and it's not a bunny rabbit, either