Обсуждение: BUG #1241: returns different result for the same result with differnt plans.
BUG #1241: returns different result for the same result with differnt plans.
От
"PostgreSQL Bugs List"
Дата:
The following bug has been logged online: Bug reference: 1241 Logged by: Xiaoyu Wang Email address: wangxy@cs.brandeis.edu PostgreSQL version: 7.4.3 Operating system: Mandrake Linux 9.2 3.3.1-2mdk Description: returns different result for the same result with differnt plans. Details: database: TPC-H with scale factor=1.0 query: 13.sql (TPC-H) select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%special%requests' group by c_custkey ) as c_orders (c_custkey, c_count) group by c_count order by custdist desc, c_count desc; I ran the query with enable_mergejoin set to on/off, the results are different. Postgres chose Merge Left Join when enable_mergejoin is on and Hash Left Join when it is off. I dumped the results to two files, merge.data and hash.data. Here is the result when I do a diff: diff merge.data hash.data 3d2 < 0 | 50004 41a41 > 0 | 4
"PostgreSQL Bugs List" <pgsql-bugs@postgresql.org> writes: > Description: returns different result for the same result with > differnt plans. > database: TPC-H with scale factor=1.0 > query: 13.sql (TPC-H) This is not *nearly* enough information to let someone else reproduce the problem. (Heck, I can't even tell which answer is wrong.) Could you boil it down to a self-contained test script? regards, tom lane
"PostgreSQL Bugs List" <pgsql-bugs@postgresql.org> writes: > Description: returns different result for the same result with > differnt plans. > I ran the query with enable_mergejoin set to on/off, the results are > different. Postgres chose Merge Left Join when enable_mergejoin is on and > Hash Left Join when it is off. I dumped the results to two files, merge.data > and hash.data. Here is the result when I do a diff: Found it. This has actually been broken ever since we implemented outer joins in PG 7.1. If the join is large enough to need to be broken into multiple hash batches, and some of the inner batches are completely empty, the code would skip processing those batches altogether. Which is fine ... unless it's a LEFT JOIN :-(. Here's the patch for the 7.4 branch: Index: nodeHashjoin.c =================================================================== RCS file: /cvsroot/pgsql-server/src/backend/executor/nodeHashjoin.c,v retrieving revision 1.57.2.1 diff -c -r1.57.2.1 nodeHashjoin.c *** nodeHashjoin.c 25 Nov 2003 19:17:16 -0000 1.57.2.1 --- nodeHashjoin.c 17 Sep 2004 18:23:05 -0000 *************** *** 602,613 **** } /* ! * We can skip over any batches that are empty on either side. Release ! * associated temp files right away. */ while (newbatch <= nbatch && ! (innerBatchSize[newbatch - 1] == 0L || ! outerBatchSize[newbatch - 1] == 0L)) { BufFileClose(hashtable->innerBatchFile[newbatch - 1]); hashtable->innerBatchFile[newbatch - 1] = NULL; --- 602,615 ---- } /* ! * Normally we can skip over any batches that are empty on either side ! * --- but for JOIN_LEFT, can only skip when left side is empty. ! * Release associated temp files right away. */ while (newbatch <= nbatch && ! (outerBatchSize[newbatch - 1] == 0L || ! (innerBatchSize[newbatch - 1] == 0L && ! hjstate->js.jointype != JOIN_LEFT))) { BufFileClose(hashtable->innerBatchFile[newbatch - 1]); hashtable->innerBatchFile[newbatch - 1] = NULL; regards, tom lane
Mark Shewmaker <mark@primefactor.com> writes: > (BTW, I had sent my first reply and this privately as I'm far from sure > of myself in these questions, so but feel free to reply to either > publicly if you want to, or I can re-reply publicly.) cc'd to pgbugs in case anyone else is wondering the same. >> On Fri, 2004-09-17 at 15:10, Tom Lane wrote: >>> Mark Shewmaker <mark@primefactor.com> writes: >>>> the code would skip processing those batches altogether. Which is fine... >>>> unless it's a LEFT JOIN :-(. >> >>> Would the same problem then also exist for right outer joins? >> >> Yup, if the planner chose to flip it into a left join and apply hash >> joining (which is altogether likely --- merge join is the only executor >> join method that supports right join directly, and even then only in >> restricted cases). > Okay, so there doesn't need to be any fix for right joins specifically > then. (I had seen the "!= JOIN_LEFT" in your patch, which made me > wonder whether there was a need for something like a "!= (JOIN_LEFT | > JOIN_RIGHT)" in there. Obviously I'm pretty clueless as to whether > that's really necessary of course.) It's not. If you look at ExecInitHashJoin you'll see that it rejects JOIN_RIGHT, should the planner be buggy enough to ask it to do that. regards, tom lane