Обсуждение: ERROR: FULL JOIN is only supported with merge-joinable join conditions
Hi all, I am porting my application from Ingres to Postgres, and I have the following problem. I am not sure if this is a known limitation of Postgresql or a bug. My code works under Ingres but fails in Postgres with the following error: ERROR: FULL JOIN is only supported with merge-joinable join conditions My tables contain temporal data e.g. Table A: f1 | f2 | modtime | exptime -------------------------- A | B | t0 | t2 <= historical record A | C | t2 | t6 <= historical record A | D | t6 | NULL <= live record Table B: f1 | f2 | modtime | exptime -------------------------- F | G | t1 | t3 <= historical record F | H | t3 | t5 <= historical record F | I | t5 | NULL <= live record All queries on live data are of the form: select * from a where f1 = xx and exptime is NULL A full outer join on two tables with temporal data looks like this: select * from A full outer join B on A.f1 = B.f1 and ((A.ExpTime IS NULL AND B.ExpTime IS NULL) OR (A.ModTime <= B.ExpTime AND (B.ExpTime > A.ExpTime OR B.ExpTime IS NULL))) The primary keys of A and B are (f1, exptime). The join selects the record(s) in B that where live at the same moment as the record in A. Postgres's problem is with the <=, > and is null conditions in the full outer join. These are probably not 'merge-joinable', so the query fails. Shouldn't it try a different method instead of failing?? I cannot move the conditions on exptime to the where clause, because that would allow records in B to match with historical records in A and thus preventing the insertion of a NULL A record. Ok in english :-) If a live record in B does not have a live record in A, it should return as NULL NULL NULL B.f1 B.f2 B.exptime. But if there are historical records in A, the join without exptime would match with a historical record and thus never insert a NULL A record. A 'left outer join' works fine with the above code, so why not a full outer join? Any suggestions or is this a show stopper? Postgres version is 8.1.3. Thanks, Harco
Re: ERROR: FULL JOIN is only supported with merge-joinable join conditions
От
Martijn van Oosterhout
Дата:
On Mon, Mar 13, 2006 at 11:02:35AM +0100, Harco de Hilster wrote: > Hi all, > > I am porting my application from Ingres to Postgres, and I have the > following problem. I am not sure if this is a known limitation of > Postgresql or a bug. My code works under Ingres but fails in Postgres > with the following error: > > ERROR: FULL JOIN is only supported with merge-joinable join conditions Ouch! You've got yourself a nasty one. The technical reason why this doesn't work is because full outer joins in postgres are implemented by a merge join. Basically sort the two tables and then compare them record by record, emitting null columns when necessary. However, this only works when the condition is merge-joinable, usually on ordered datatypes with an equality operator. I think the reason it hasn't been done for general join conditions is because we havn't thought of an efficient algorithm. Basically, for each row in the left table you need to find all matching rows in the right table. Afterwards you have to find all the rows in the right table which havn't been used. i.e. you can emulate it by doing a: SELECT * FROM A LEFT JOIN B ON (X) UNION ALL SELECT * FROM A RIGHT JOIN B ON (X) WHERE A.id IS NULL; But this is double joining and so not terribly efficient. And if the tables were subqueries it would be worse and quite possibly wrong if the output isn't constant. However, I wonder if youre case couldn't be handled with a time-interval datatype such that you condition is merge-joinable on that type. I can't quite see it though... > My tables contain temporal data e.g. > > Table A: > f1 | f2 | modtime | exptime > -------------------------- > A | B | t0 | t2 <= historical record > A | C | t2 | t6 <= historical record > A | D | t6 | NULL <= live record > > Table B: > f1 | f2 | modtime | exptime > -------------------------- > F | G | t1 | t3 <= historical record > F | H | t3 | t5 <= historical record > F | I | t5 | NULL <= live record > > All queries on live data are of the form: select * from a where f1 = xx > and exptime is NULL > > A full outer join on two tables with temporal data looks like this: > > select * > from A > full outer join B on A.f1 = B.f1 and ((A.ExpTime IS NULL AND B.ExpTime > IS NULL) OR (A.ModTime <= B.ExpTime AND (B.ExpTime > A.ExpTime OR > B.ExpTime IS NULL))) Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Вложения
Martijn van Oosterhout <kleptog@svana.org> writes: > I think the reason it hasn't been done for general join conditions is > because we havn't thought of an efficient algorithm. Right, it's keeping track of the unmatched right-hand rows that's a problem. > However, I wonder if youre case couldn't be handled with a > time-interval datatype such that you condition is merge-joinable on > that type. I can't quite see it though... Interesting thought. The use of NULLs in this example is a pretty poor representational choice --- the queries would get a lot simpler if you used "exptime = INFINITY" to represent unexpired data. That doesn't get you all the way to a mergejoinable query though :-( regards, tom lane
Thanks for your thoughts. What is the definition of a merge-joinable condition? Even if I put ExpTime = Infinity (I saw that one coming ;-)), the same error is reported. My only option here is to add A.exptime = B.exptime (which is only true for live data if I use INFINITY), and lose the ability to query historical data. Can I create an type/operator that compares both records that is considered merge-joinable? Kind regards, Harco Tom Lane wrote: >Martijn van Oosterhout <kleptog@svana.org> writes: > > >>I think the reason it hasn't been done for general join conditions is >>because we havn't thought of an efficient algorithm. >> >> > >Right, it's keeping track of the unmatched right-hand rows that's a >problem. > > > >>However, I wonder if youre case couldn't be handled with a >>time-interval datatype such that you condition is merge-joinable on >>that type. I can't quite see it though... >> >> > >Interesting thought. The use of NULLs in this example is a pretty poor >representational choice --- the queries would get a lot simpler if you >used "exptime = INFINITY" to represent unexpired data. That doesn't get >you all the way to a mergejoinable query though :-( > > >
Harco de Hilster <Harco.de.Hilster@ATConsultancy.nl> writes: > What is the definition of a merge-joinable condition? Equality on a sortable datatype. > Can I create an type/operator that compares both records that is > considered merge-joinable? I think you could do something involving a time interval datatype that considers "overlap" as equality and does something reasonable for sorting non-overlapping intervals. It'd take a bit of work ... nothing particularly difficult, just tedious. The "seg" datatype in contrib/ might serve as a prototype. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Harco de Hilster <Harco.de.Hilster@ATConsultancy.nl> writes: > > What is the definition of a merge-joinable condition? > > Equality on a sortable datatype. > > > Can I create an type/operator that compares both records that is > > considered merge-joinable? > > I think you could do something involving a time interval datatype that > considers "overlap" as equality and does something reasonable for > sorting non-overlapping intervals. How could a non-transitive property ever be merge joinable though? -- greg
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> I think you could do something involving a time interval datatype that >> considers "overlap" as equality and does something reasonable for >> sorting non-overlapping intervals. > How could a non-transitive property ever be merge joinable though? Um, good point. So it couldn't be "overlaps" in the normal sense of the word. But given that he's mostly concerned about cases where one end is +infinity, it might be possible to devise a consistent semantics that works. regards, tom lane