Thread: pgsql: Change the implementation of hash join to attempt to avoid
From
neilc@svr1.postgresql.org (Neil Conway)
Date:
Log Message:
-----------
Change the implementation of hash join to attempt to avoid unnecessary
work if either of the join relations is empty. The logic is:

    (1) if the inner relation's startup cost is less than the outer
        relation's startup cost and this is not an outer join, read
        a single tuple from the inner relation via ExecHash()
        - if NULL, we're done

    (2) read a single tuple from the outer relation
        - if NULL, we're done

    (3) build the hash table on the inner relation
        - if hash table is empty and this is not an outer join,
          we're done

    (4) otherwise, do hash join as usual

The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.

I had to put in a bit of a kludge to get the row count returned for
EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to return a
tuple, and then MultiExecHash() is invoked, we would return one too many
tuples to EXPLAIN ANALYZE. I hacked around this by just manually
detecting this situation and subtracting 1 from the EXPLAIN ANALYZE row
count.

Modified Files:
--------------
    pgsql/src/backend/executor:
        nodeHash.c (r1.93 -> r1.94)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/nodeHash.c.diff?r1=1.93&r2=1.94)
        nodeHashjoin.c (r1.71 -> r1.72)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/executor/nodeHashjoin.c.diff?r1=1.71&r2=1.72)
    pgsql/src/include/nodes:
        execnodes.h (r1.133 -> r1.134)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/nodes/execnodes.h.diff?r1=1.133&r2=1.134)
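
For readers who want to trace the order of the four checks in the log message, here is a minimal standalone sketch in C. It is not the code from nodeHashjoin.c: ToyRelation, ToyJoinPath, ToyJoinResult and toy_hash_join() are hypothetical names invented for illustration, relations are plain arrays of integer keys, and the "hash table" is replaced by a linear scan to keep the example short. Only the ordering of the early exits is meant to mirror the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one join input: a list of join keys. */
typedef struct
{
    const int  *keys;          /* may be NULL when nkeys == 0 */
    size_t      nkeys;
    double      startup_cost;  /* planner-style startup cost estimate */
} ToyRelation;

/* Which step of the described logic ended the join (illustrative names). */
typedef enum
{
    STOP_INNER_PROBE_EMPTY,    /* step (1) */
    STOP_OUTER_EMPTY,          /* step (2) */
    STOP_HASH_TABLE_EMPTY,     /* step (3) */
    RAN_FULL_JOIN              /* step (4) */
} ToyJoinPath;

typedef struct
{
    ToyJoinPath path;
    size_t      rows;          /* number of result rows produced */
} ToyJoinResult;

ToyJoinResult
toy_hash_join(const ToyRelation *inner, const ToyRelation *outer,
              bool is_outer_join)
{
    ToyJoinResult result = { RAN_FULL_JOIN, 0 };

    assert(inner != NULL && outer != NULL);

    /*
     * (1) Peek at the inner side first, but only when it is cheaper to
     * start than the outer side and this is not an outer join.
     */
    if (!is_outer_join &&
        inner->startup_cost < outer->startup_cost &&
        inner->nkeys == 0)
    {
        result.path = STOP_INNER_PROBE_EMPTY;
        return result;
    }

    /* (2) Peek at the outer side: no outer tuples means no result rows. */
    if (outer->nkeys == 0)
    {
        result.path = STOP_OUTER_EMPTY;
        return result;
    }

    /*
     * (3) "Build the hash table" on the inner side. In this sketch the
     * table is just the inner key array, so empty means nkeys == 0.
     */
    if (inner->nkeys == 0 && !is_outer_join)
    {
        result.path = STOP_HASH_TABLE_EMPTY;
        return result;
    }

    /* (4) Otherwise join as usual (linear scan instead of hashing). */
    for (size_t i = 0; i < outer->nkeys; i++)
    {
        size_t      found = 0;

        for (size_t j = 0; j < inner->nkeys; j++)
        {
            if (outer->keys[i] == inner->keys[j])
                found++;
        }

        /* An outer join still emits one row for an unmatched outer tuple. */
        if (found == 0 && is_outer_join)
            found = 1;

        result.rows += found;
    }

    return result;
}
```

Calling toy_hash_join() with an empty inner relation whose startup cost is lower than the outer's returns at step (1) without touching the outer side. If the inner relation is empty but more expensive to start, the sketch falls through to step (3) instead, which is the case the commit's third check covers.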