On Thu, Dec 30, 2010 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I had an epiphany about this topic, or actually two of them.
>
> 1. Whether or not you think there's a significant performance reason
> to support hash right joins, there's a functionality reason. The
> infrastructure for right join could just as easily do full joins.
> And AFAICS, a hash full join would only require one hashable join
> clause --- the other FULL JOIN ON conditions could be anything at all.
> This is unlike the situation for merge join, where all the JOIN ON
> conditions have to be mergeable or it doesn't work right. So we could
> greatly reduce the scope of the dreaded "FULL JOIN is only supported
> with merge-joinable join conditions" error. (Well, okay, it's not
> *that* dreaded, but people complain about it occasionally.)
Yeah, that would be neat. It might be a lot faster in some cases, too.
> 2. The obvious way to implement this would involve adding an extra bool
> field to struct HashJoinTupleData. The difficulty with that, and the
> reason I'd been resistant to the whole idea, is that it'd eat up a full
> word per hashtable entry because of alignment considerations. (On
> 64-bit machines it'd be free because of alignment considerations, but
> that's cold comfort when 32-bit machines are the ones pressed for
> address space.) But we only need one bit, so what about commandeering
> an infomask bit in the tuple itself? For the initial implementation
> I'd be inclined to take one of the free bits in t_infomask2. We could
> actually get away with overlaying the flag bit with one of the tuple
> visibility bits, since it will only be used in tuples that are in the
> in-memory hash table, which don't need visibility info anymore. But
> that seems like a kluge that could wait until we really need the flag
> space.
I think that's a reasonable approach, although I might be inclined to
do the overlay sooner rather than later if it doesn't add too much
complexity.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company