Обсуждение: Enhancing PGSQL to be compatible with Informix SQL
I am currently looking into the possibility of extending the current postgres SQL implementation to be compatible with Informix's SQL implementation. The changes required seem relatively straightforward, with one notable exception. Requirements:1/ Datetime type specifiers (should have no impact) o informix uses datetime specifiers of the form DATETIME YEAR TO HOUR. (which is just the year, month, day and hour portion of a datetime).2/ Intervaltype specifiers (ditto) o informix uses interval specifiers of the form INTERVAL DAY TO HOUR. (whichis just the day and hour portion of an interval).3/ Money type specifiers o informix has money typespecifiers that are akin to decimal speicifiers4/ Informix outer join syntax o informix uses outer joinsof the form SELECT * FROM a, outer b where a.nr = b.nr This will require some post-processing to determine the actual join conditions.5/ serial data type o Serial type must return inserted key value o Unfortunately (and this is the big bad hit) informix's serial datatype does serial number generation ona zero inserted valued. The modification required to do this may have impact on existing programs. I'd be interested if anyone can see any conceptual difficulties i've missed in these definitions, and welcome any concepts on the implementation. .............................Rod +-----------------------------------------------------------------------------+ | Rod Chamberlin | rod@querix.com Tel +44 1703 232345 | | Software Engineer | Mob +44 7803 295406 | | QueriX | Fax +44 1703 399685 | +-----------------------------------------------------------------------------+ | The views expressed in this document do not necessarily represent those of | | the management of QueriX (UK) Ltd. | +-----------------------------------------------------------------------------+
Rod Chamberlin <rod@querix.com> writes: > I am currently looking into the possibility of extending the current > postgres SQL implementation to be compatible with Informix's SQL > implementation. Don Baccus already made the point that we are more interested in being compatible with the standard than with specific commercial implementations, so I won't repeat it. I do have a couple of practical suggestions though: > 1/ Datetime type specifiers (should have no impact) > 2/ Interval type specifiers (ditto) We support enough datestyle variants already that it's hard to believe there isn't one that will meet your needs. But if not, I think adding an "Informix" datestyle option might be considered reasonable. > 5/ serial data type > o Serial type must return inserted key value > o Unfortunately (and this is the big bad hit) > informix's serial datatype does serial number > generation on a zero inserted valued. > The modification required to do this may have > impact on existing programs. Breaking existing applications will not fly. If you have lots of code that depends on this behavior, you could easily emulate it by adding a BEFORE INSERT trigger on each table that needs it. Ignoring the boilerplate, the critical bit would look like: if new.serialcolumn = 0 then new.serialcolumn = nextval('sequenceobject'); However, if you need to know what value is being given to the inserted tuple, much the cleanest solution is to select nextval before inserting: SELECT nextval('sequenceobject');INSERT INTO table VALUES(... , value-you-just-got, ...); If you are always going to do that, then a trigger is a waste of cycles. regards, tom lane
> I am currently looking into the possibility of extending the current > postgres SQL implementation to be compatible with Informix's SQL > implementation. > > The changes required seem relatively straightforward, with one notable > exception. I am very familiar wit Informix. > > Requirements: > 1/ Datetime type specifiers (should have no impact) > o informix uses datetime specifiers of the form > DATETIME YEAR TO HOUR. (which is just the year, > month, day and hour portion of a datetime). I have to admit I usually find this very confusing with Informix. > 2/ Interval type specifiers (ditto) > o informix uses interval specifiers of the form > INTERVAL DAY TO HOUR. (which is just the > day and hour portion of an interval). This I can usually understand, though I think we can do this too clearer than Informix. > 3/ Money type specifiers > o informix has money type specifiers that are akin > to decimal speicifiers We have a MONEY type now, and are looking to invisibly use DECIMAL for this instead. > 4/ Informix outer join syntax > o informix uses outer joins of the form > SELECT * FROM a, outer b where a.nr = b.nr > This will require some post-processing to determine > the actual join conditions. Believe it or not, I am hoping to get this into 7.0. The ANSI syntax requires a lot of optimizer changes, because it basically allows user specification of the join order. In talking to Thomas, we hoped to implement OUTER as a flag on the table that we could easily implement in 7.0. Let's see how it goes. > 5/ serial data type > o Serial type must return inserted key value How does Informix return the value? > o Unfortunately (and this is the big bad hit) > informix's serial datatype does serial number > generation on a zero inserted valued. > The modification required to do this may have > impact on existing programs. Yes, I have been thrown off by this. We don't allow a zero to auto-number. You have to use nextval('sequence_name') in the query to supply the sequence value, not a zero. I can see this as a pain, but the developers think the 0 replace with nextval() thing is strange and non-intuitive. The current behavior fits in the DEFAULT column activation in a logical way. I don't think I can get people to make this change. The 0 replacement is a behind the scenes thing, while DEFAULT and nextval() calls are logically consistent. > I'd be interested if anyone can see any conceptual difficulties i've > missed in these definitions, and welcome any concepts on the > implementation. I agree Informix compatibility is a good thing. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> However, if you need to know what value is being given to the > inserted tuple, much the cleanest solution is to select nextval > before inserting: > > SELECT nextval('sequenceobject'); > INSERT INTO table VALUES(... , value-you-just-got, ...); > > If you are always going to do that, then a trigger is a waste of cycles. He can do: INSERT INTO table VALUES(... , nextval('sequenceobject'), ...); and currval() will get him the previous nextval() value. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Thu, 6 Jan 2000, Tom Lane wrote: > > 1/ Datetime type specifiers (should have no impact) > > 2/ Interval type specifiers (ditto) > > We support enough datestyle variants already that it's hard to believe > there isn't one that will meet your needs. But if not, I think adding > an "Informix" datestyle option might be considered reasonable. Isn't Thomas trying to reduce the number of variants? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes: > On Thu, 6 Jan 2000, Tom Lane wrote: >> We support enough datestyle variants already that it's hard to believe >> there isn't one that will meet your needs. But if not, I think adding >> an "Informix" datestyle option might be considered reasonable. > Isn't Thomas trying to reduce the number of variants? He wants to eliminate the essentially-duplicate datatypes, but I didn't think he was proposing eliminating any datestyle functionality... there would be squawks if he did, methinks... regards, tom lane
On Thu, 6 Jan 2000, Tom Lane wrote: > The Hermit Hacker <scrappy@hub.org> writes: > > On Thu, 6 Jan 2000, Tom Lane wrote: > >> We support enough datestyle variants already that it's hard to believe > >> there isn't one that will meet your needs. But if not, I think adding > >> an "Informix" datestyle option might be considered reasonable. > > > Isn't Thomas trying to reduce the number of variants? > > He wants to eliminate the essentially-duplicate datatypes, but I didn't > think he was proposing eliminating any datestyle functionality... > there would be squawks if he did, methinks... 'K, just wanted to clarify that point... Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Thu, 6 Jan 2000, Bruce Momjian wrote: > > However, if you need to know what value is being given to the > > inserted tuple, much the cleanest solution is to select nextval > > before inserting: > > > > SELECT nextval('sequenceobject'); > > INSERT INTO table VALUES(... , value-you-just-got, ...); > > > > If you are always going to do that, then a trigger is a waste of cycles. > > He can do: > > INSERT INTO table VALUES(... , nextval('sequenceobject'), ...); > > and currval() will get him the previous nextval() value. > The problem is unfortunately much more generic than this. I would like able to take an informix/4GL program an run it without modification on a postgres backend. The difficulty here is that the database interface *does not know* the datatypes in the insert statement. The problem actually becomes more tricky because the catalog tables don't even know that the original datatype was a serial, so the interface layer cannot take any special steps to pre-process the data. The only other alternative is to write a secondary parser in the interface layer which does the SQL conversion. This strikes me as an exceptionally complex solution given the relative similarity between Informix/SQL and Postgress SQL. .............................Rod +-----------------------------------------------------------------------------+ | Rod Chamberlin | rod@querix.com Tel +44 1703 232345 | | Software Engineer | Mob +44 7803 295406 | | QueriX | Fax +44 1703 399685 | +-----------------------------------------------------------------------------+ | The views expressed in this document do not necessarily represent those of | | the management of QueriX (UK) Ltd. | +-----------------------------------------------------------------------------+
On Thu, 6 Jan 2000, Bruce Momjian wrote: > > I am currently looking into the possibility of extending the current > > postgres SQL implementation to be compatible with Informix's SQL > > implementation. > > > > The changes required seem relatively straightforward, with one notable > > exception. > > I am very familiar wit Informix. > > > > > Requirements: > > 1/ Datetime type specifiers (should have no impact) > > o informix uses datetime specifiers of the form > > DATETIME YEAR TO HOUR. (which is just the year, > > month, day and hour portion of a datetime). > > I have to admit I usually find this very confusing with Informix. I can't disagree. The way informix decided to do DATETIME stuff is definately odd. That said, from a calcualtion standpoint you can pretty much ignore the qualifier during calcualtions, its only really important in the representation. (I'm actually making assumptions here, and it produces considerable work at the representation stages, but that can easily be accomodated). > > > 2/ Interval type specifiers (ditto) > > o informix uses interval specifiers of the form > > INTERVAL DAY TO HOUR. (which is just the > > day and hour portion of an interval). > > This I can usually understand, though I think we can do this too clearer > than Informix. > > > 3/ Money type specifiers > > o informix has money type specifiers that are akin > > to decimal speicifiers > > We have a MONEY type now, and are looking to invisibly use DECIMAL for > this instead. > This would actually be sensible. My comment about money, is that the existing type doesn't have a concept of precision; two decimal places of money is somewhat meaningless where in the local currency it takes 1000 washers to buy a packet of crisps. The ability to set the precision of the MONEY type is kinda important in this case. > > 4/ Informix outer join syntax > > o informix uses outer joins of the form > > SELECT * FROM a, outer b where a.nr = b.nr > > This will require some post-processing to determine > > the actual join conditions. > > Believe it or not, I am hoping to get this into 7.0. The ANSI syntax > requires a lot of optimizer changes, because it basically allows user > specification of the join order. In talking to Thomas, we hoped to > implement OUTER as a flag on the table that we could easily implement in > 7.0. Let's see how it goes. > Sounds great! :) > > 5/ serial data type > > o Serial type must return inserted key value > > How does Informix return the value? > >From a user standpoint it mystically appears in sqlca just after the insert statement is executed. Actually the informix engine recognises it's just done a serial insert, and sends it back in addition to the standard status packets. > > o Unfortunately (and this is the big bad hit) > > informix's serial datatype does serial number > > generation on a zero inserted valued. > > The modification required to do this may have > > impact on existing programs. > > Yes, I have been thrown off by this. We don't allow a zero to > auto-number. You have to use nextval('sequence_name') in the query to > supply the sequence value, not a zero. I can see this as a pain, but > the developers think the 0 replace with nextval() thing is strange and > non-intuitive. The current behavior fits in the DEFAULT column > activation in a logical way. I don't think I can get people to make > this change. The 0 replacement is a behind the scenes thing, while > DEFAULT and nextval() calls are logically consistent. > I can understand the situation here (one of the main reasons I raised the thread in the first place). Above all else the difficulty I have with serial at the moment is the impossibility of differentiating a serial with an int4 after creation (after all the database treats them identically). The catalog tables don't contain any information. The only way you can work out you created a serial column is by looking for an appropriately named sequence in the database on every int4 column that exists (or am I wrong?). This is not exactly something that appeals to me Also, in order to get correct returns from the serial column insert it seems likely that the serial type would have to gain some kind of extra special processing within the database above what it has already. In this case all of the required behaviour could probably be implemented. > > I'd be interested if anyone can see any conceptual difficulties i've > > missed in these definitions, and welcome any concepts on the > > implementation. > > I agree Informix compatibility is a good thing. > > -- > Bruce Momjian | http://www.op.net/~candle > maillist@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 > .............................Rod +-----------------------------------------------------------------------------+ | Rod Chamberlin | rod@querix.com Tel +44 1703 232345 | | Software Engineer | Mob +44 7803 295406 | | QueriX | Fax +44 1703 399685 | +-----------------------------------------------------------------------------+ | The views expressed in this document do not necessarily represent those of | | the management of QueriX (UK) Ltd. | +-----------------------------------------------------------------------------+
At 11:08 AM 1/6/00 -0500, Bruce Momjian wrote: > >> 4/ Informix outer join syntax >> o informix uses outer joins of the form >> SELECT * FROM a, outer b where a.nr = b.nr >> This will require some post-processing to determine >> the actual join conditions. > >Believe it or not, I am hoping to get this into 7.0. The ANSI syntax >requires a lot of optimizer changes, because it basically allows user >specification of the join order. In talking to Thomas, we hoped to >implement OUTER as a flag on the table that we could easily implement in >7.0. Let's see how it goes. Hmmm...I have to question this wisdom of this, because once in and used there will be pressure to support it forever. How will this play with the SQL 92 syntax? Order specification isn't a bad thing given the fact that outer joins aren't associative (SQL for smarties gives examples). I've been wanting outer joins, but in my porting efforts have managed to work around them without too much difficulty, even though 6.5's limitations on subselects (not in target lists) requires that I create PL/pgSQL functions in some cases. I certainly can't speak for the majority of users, but as one data point I'd personally rather see outer joins done right (SQL 92 syntax) and wait a bit. Then again, I tend to be a bit of a language purist... - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> > I have to admit I usually find this very confusing with Informix. > > I can't disagree. The way informix decided to do DATETIME stuff is > definately odd. That said, from a calcualtion standpoint you can pretty > much ignore the qualifier during calcualtions, its only really important > in the representation. (I'm actually making assumptions here, and it > produces considerable work at the representation stages, but that can > easily be accomodated). Yes, I don't want to start having explain that mess to people. > > > > > > 2/ Interval type specifiers (ditto) > > > o informix uses interval specifiers of the form > > > INTERVAL DAY TO HOUR. (which is just the > > > day and hour portion of an interval). > > > > This I can usually understand, though I think we can do this too clearer > > than Informix. > > > > > 3/ Money type specifiers > > > o informix has money type specifiers that are akin > > > to decimal speicifiers > > > > We have a MONEY type now, and are looking to invisibly use DECIMAL for > > this instead. > > > > This would actually be sensible. My comment about money, is that the > existing type doesn't have a concept of precision; two decimal places of > money is somewhat meaningless where in the local currency it takes 1000 > washers to buy a packet of crisps. The ability to set the precision of > the MONEY type is kinda important in this case. The move to make MONEY use decimal would add precision. > > > 5/ serial data type > > > o Serial type must return inserted key value > > > > How does Informix return the value? > > > > >From a user standpoint it mystically appears in sqlca just after the > insert statement is executed. Actually the informix engine recognises > it's just done a serial insert, and sends it back in addition to the > standard status packets. Yes, we have currval() which allows such retrieval _inside_ the database, as well as in the application. > I can understand the situation here (one of the main reasons I raised the > thread in the first place). Above all else the difficulty I have with > serial at the moment is the impossibility of differentiating a serial with > an int4 after creation (after all the database treats them identically). > The catalog tables don't contain any information. The only way you can > work out you created a serial column is by looking for an appropriately > named sequence in the database on every int4 column that exists (or am I > wrong?). This is not exactly something that appeals to me Yes, the SERIAL gets lost once it is created. This can cause confusion because doing a \dt on the table shows it as an INT4 with DEFAULT, and not a serial. This can confuse people. I remember someone saying we would need to keep the SERIAL understanding around so we would use it for pg_dump, but I don't remember why we needed to do that. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> I've been wanting outer joins, but in my porting efforts have managed > to work around them without too much difficulty, even though 6.5's > limitations on subselects (not in target lists) requires that I > create PL/pgSQL functions in some cases. > > I certainly can't speak for the majority of users, but as one data > point I'd personally rather see outer joins done right (SQL 92 > syntax) and wait a bit. > > Then again, I tend to be a bit of a language purist... > Thomas has tried to explain the ANSI syntax for outer joins, and I must say I am quite confused by it. A simple OUTER added before the column name would be a quick and simple way to do outers, perhap get them into 7.0, and allow new users to do outers without having to learn the quite complex ANSI syntax. At least that was my idea. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Thu, 6 Jan 2000, Bruce Momjian wrote: > > I've been wanting outer joins, but in my porting efforts have managed > > to work around them without too much difficulty, even though 6.5's > > limitations on subselects (not in target lists) requires that I > > create PL/pgSQL functions in some cases. > > > > I certainly can't speak for the majority of users, but as one data > > point I'd personally rather see outer joins done right (SQL 92 > > syntax) and wait a bit. > > > > Then again, I tend to be a bit of a language purist... > > > > Thomas has tried to explain the ANSI syntax for outer joins, and I must > say I am quite confused by it. A simple OUTER added before the column > name would be a quick and simple way to do outers, perhap get them into > 7.0, and allow new users to do outers without having to learn the quite > complex ANSI syntax. > > At least that was my idea. First, I'm for getting OUTER JOINs in ASAP...but, I'm a little concerned with thought of throwing in what *sounds* like a 'stop gap' measure... Just to clarify..."A simple OUTER added before the column" would be a PostgreSQL-ism? Sort of like Oracle and all the rest have their own special traits? Eventually, the plan is to implement OJs as "SQL92 spec", and leave our -ism in for backwards compatibility? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> > Thomas has tried to explain the ANSI syntax for outer joins, and I must > > say I am quite confused by it. A simple OUTER added before the column > > name would be a quick and simple way to do outers, perhap get them into > > 7.0, and allow new users to do outers without having to learn the quite > > complex ANSI syntax. > > > > At least that was my idea. > > First, I'm for getting OUTER JOINs in ASAP...but, I'm a little concerned > with thought of throwing in what *sounds* like a 'stop gap' measure... > > Just to clarify..."A simple OUTER added before the column" would be a > PostgreSQL-ism? Sort of like Oracle and all the rest have their own > special traits? Eventually, the plan is to implement OJs as "SQL92 spec", > and leave our -ism in for backwards compatibility? Yes, OUTER is an Informix-ism. Oracle uses *=. I think the first is easier to add and makes more sense for us. *= could be defined by someone as an operator, and overloading our already complex operator code to do *= for OUTER may be too complex for people to understand. It would be: SELECT *FROM tab1, OUTER tab2WHERE tab1.col1 = tab2.col2 -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> I've been wanting outer joins, but in my porting efforts have managed > to work around them without too much difficulty, even though 6.5's > limitations on subselects (not in target lists) requires that I > create PL/pgSQL functions in some cases. > I certainly can't speak for the majority of users, but as one data > point I'd personally rather see outer joins done right (SQL 92 > syntax) and wait a bit. A bit of a misunderstanding here: we are using SQL92 syntax but will try to implement the outer join operation using *internal* data structures similar to what we have now. Any alternate syntaxes are just a diversion which slow us down on the road to world domination ;) - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
On Thu, 6 Jan 2000, Bruce Momjian wrote: > > > Thomas has tried to explain the ANSI syntax for outer joins, and I must > > > say I am quite confused by it. A simple OUTER added before the column > > > name would be a quick and simple way to do outers, perhap get them into > > > 7.0, and allow new users to do outers without having to learn the quite > > > complex ANSI syntax. > > > > > > At least that was my idea. > > > > First, I'm for getting OUTER JOINs in ASAP...but, I'm a little concerned > > with thought of throwing in what *sounds* like a 'stop gap' measure... > > > > Just to clarify..."A simple OUTER added before the column" would be a > > PostgreSQL-ism? Sort of like Oracle and all the rest have their own > > special traits? Eventually, the plan is to implement OJs as "SQL92 spec", > > and leave our -ism in for backwards compatibility? > > Yes, OUTER is an Informix-ism. Oracle uses *=. I think the first is > easier to add and makes more sense for us. *= could be defined by > someone as an operator, and overloading our already complex operator > code to do *= for OUTER may be too complex for people to understand. > > It would be: > > SELECT * > FROM tab1, OUTER tab2 > WHERE tab1.col1 = tab2.col2 What about >2 table joins? Wish I had my book here, but I though tyou could do multiple OUTER joins, no? Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> > Yes, OUTER is an Informix-ism. Oracle uses *=. I think the first is > > easier to add and makes more sense for us. *= could be defined by > > someone as an operator, and overloading our already complex operator > > code to do *= for OUTER may be too complex for people to understand. > > > > It would be: > > > > SELECT * > > FROM tab1, OUTER tab2 > > WHERE tab1.col1 = tab2.col2 > > What about >2 table joins? Wish I had my book here, but I though tyou > could do multiple OUTER joins, no? SELECT * FROM tab1, OUTER tab2, OUTER tab3 WHERE tab1.col1 = tab2.col2 AND tab1.col3 = tab3.col3 My assumption is that you can't join tab2 to tab3 becaue tab2 is already outer, but I don't know. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> > I've been wanting outer joins, but in my porting efforts have managed > > to work around them without too much difficulty, even though 6.5's > > limitations on subselects (not in target lists) requires that I > > create PL/pgSQL functions in some cases. > > I certainly can't speak for the majority of users, but as one data > > point I'd personally rather see outer joins done right (SQL 92 > > syntax) and wait a bit. > > A bit of a misunderstanding here: we are using SQL92 syntax but will > try to implement the outer join operation using *internal* data > structures similar to what we have now. > > Any alternate syntaxes are just a diversion which slow us down on the > road to world domination ;) OK, I stand corrected. Let world domination continue. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
At 02:39 PM 1/6/00 -0400, The Hermit Hacker wrote: >Just to clarify..."A simple OUTER added before the column" would be a >PostgreSQL-ism? Sounds like an Informix-ism if I read the thread correctly. > Sort of like Oracle and all the rest have their own >special traits? Though I'm familiar with the Oracle syntax (far too familiar at the moment, as I'm porting literally thousands of lines of queries many of which do Oracle outer joins!), the style described by Bruce seems nicer. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
At 07:08 PM 1/6/00 +0000, Thomas Lockhart wrote: >> I've been wanting outer joins, but in my porting efforts have managed >> to work around them without too much difficulty, even though 6.5's >> limitations on subselects (not in target lists) requires that I >> create PL/pgSQL functions in some cases. >> I certainly can't speak for the majority of users, but as one data >> point I'd personally rather see outer joins done right (SQL 92 >> syntax) and wait a bit. > >A bit of a misunderstanding here: we are using SQL92 syntax but will >try to implement the outer join operation using *internal* data >structures similar to what we have now. Yes, I've seen the existing code, in particular regarding inner joins. >Any alternate syntaxes are just a diversion which slow us down on the >road to world domination ;) That's my first feeling, too, as I hope I made clear. If you don't mind my asking, just what are the difficulties? Bruce mentioned the optimizer. I noticed the executor code that does merge joins has conditionalized stuff in it to insert the nulls required by outer join. And the parser has conditionalized stuff to deal with them. So, is it ("just", he says :) the optimizer, or more? - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
> >A bit of a misunderstanding here: we are using SQL92 syntax but will > >try to implement the outer join operation using *internal* data > >structures similar to what we have now. > > Yes, I've seen the existing code, in particular regarding inner > joins. > > >Any alternate syntaxes are just a diversion which slow us down on the > >road to world domination ;) > > That's my first feeling, too, as I hope I made clear. > > If you don't mind my asking, just what are the difficulties? Bruce > mentioned the optimizer. I noticed the executor code that does > merge joins has conditionalized stuff in it to insert the nulls > required by outer join. And the parser has conditionalized stuff > to deal with them. > > So, is it ("just", he says :) the optimizer, or more? OK, let me summarize where we are. Thomas is the man on this. Thomas is doing the ANSI syntax in gram.y and passing information around in the parser. We then need code in the executor for Merge/Hash/Nested Loop joins to do outer joins. The requirement in the optimizer is to have the _outer_ column always in the left/outer position in hash/nested loop joins. Mergejoin can have it either place. The ANSI syntax also specifies the exact join that gets the outer, and I am not sure how to get that information/control into the optimizer. Thomas is now redesigning the parser _outer_ code to pass around the outer information in a better way than his first cut at the code. That is where we are. There are many people ready to get involved when there is a need. I know many want this in 7.0. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> If you don't mind my asking, just what are the difficulties? Bruce > mentioned the optimizer. I noticed the executor code that does > merge joins has conditionalized stuff in it to insert the nulls > required by outer join. And the parser has conditionalized stuff > to deal with them. The conditional stuff is from my poking at it over the last few months. OK, the difficulties are (I'll probably leave something out): 1) The parser is written to handle the traditional inner join syntax, which separates the FROM and WHERE clauses into two distinct pieces. The outer join syntax (which of course can also do inner joins) has qualifiers and table and column "aliases" buried down in the FROM clause, and it is a pain to percolate that back up as it is transformed by the parser backend. 2) The optimizer usually feels free to try every combination of inner joins, since they are completely transitive. But outer joins are not: they need to be done in a specific order since the *absence* of a match is significant. 3) The executor needs to understand how to expand a left- or right-side tuple into a null-filled result. I've played with the mergejoin code and have taught it to walk the tables correctly, but it needs code added which actually generates the result tuple. And the other join methods don't know anything about outer joins yet. Enough? - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
On Thu, 6 Jan 2000, Bruce Momjian wrote: [snip] > The move to make MONEY use decimal would add precision. > > > > > 5/ serial data type > > > > o Serial type must return inserted key value > > > > > > How does Informix return the value? > > > > > > > >From a user standpoint it mystically appears in sqlca just after the > > insert statement is executed. Actually the informix engine recognises > > it's just done a serial insert, and sends it back in addition to the > > standard status packets. > > Yes, we have currval() which allows such retrieval _inside_ the > database, as well as in the application. > Yes, but the interface cannot tell what it's operating on, so it doesn't know to fetch curval; consider the following statement: insert into mytable values('Hello',0,0,23,17.0,0.0); Are any of the inserted values insert into serial columns? You have no way of knowing. In fact any one of the last 5 columsn could potentially be serial values being inserted (although if it's the third or forth column we don't need to do any extra processing (*)). In the same way the interface layer can see the SQL statement and not know if it has to do any extra work for informix compatibility in terms of fetching the extra values back from the sequence which Postgres has created for us. (*) Actually we probably do, since we need to ensure that the sequence value has passed the inserted value if we do a non-null insert on a serial column, otherwise we may later regenerate the same serial number. The above example is a relatively simple one to parse and analyze. A more complicated case that we'd also probably have to recognise would be something like select x,y,z,p+1 from base_table insert into mytable short of having an SQL parser how are you supposed to determine the required behaviour? There are other issues with serial which suggest that better processing is probably required; they are currently completely useful in the context of temporary tables, since the underlying sequence is never dropped. > > > I can understand the situation here (one of the main reasons I raised the > > thread in the first place). Above all else the difficulty I have with > > serial at the moment is the impossibility of differentiating a serial with > > an int4 after creation (after all the database treats them identically). > > The catalog tables don't contain any information. The only way you can > > work out you created a serial column is by looking for an appropriately > > named sequence in the database on every int4 column that exists (or am I > > wrong?). This is not exactly something that appeals to me > > Yes, the SERIAL gets lost once it is created. This can cause confusion > because doing a \dt on the table shows it as an INT4 with DEFAULT, and > not a serial. This can confuse people. I remember someone saying we > would need to keep the SERIAL understanding around so we would use it > for pg_dump, but I don't remember why we needed to do that. > This is odd actually. I can't see why you'd need to do it either, since you must already have the information you need to recreate the thing. The confusion though is not that I can't work out it's a serial, but that a program can't work out it's a serial. .............................Rod +-----------------------------------------------------------------------------+ | Rod Chamberlin | rod@querix.com Tel +44 1703 232345 | | Software Engineer | Mob +44 7803 295406 | | QueriX | Fax +44 1703 399685 | +-----------------------------------------------------------------------------+ | The views expressed in this document do not necessarily represent those of | | the management of QueriX (UK) Ltd. | +-----------------------------------------------------------------------------+
> > Yes, we have currval() which allows such retrieval _inside_ the > > database, as well as in the application. > > > > Yes, but the interface cannot tell what it's operating on, so it doesn't > know to fetch curval; consider the following statement: > > insert into mytable values('Hello',0,0,23,17.0,0.0); > > Are any of the inserted values insert into serial columns? > > You have no way of knowing. In fact any one of the last 5 columsn could > potentially be serial values being inserted (although if it's the third > or forth column we don't need to do any extra processing (*)). In the same > way the interface layer can see the SQL statement and not know if it has > to do any extra work for informix compatibility in terms of fetching the > extra values back from the sequence which Postgres has created for us. > > (*) Actually we probably do, since we need to ensure that the sequence > value has passed the inserted value if we do a non-null insert on a serial > column, otherwise we may later regenerate the same serial number. Yes, I see your point, and the fault is that Informix is doing some special things when 0 is inserted into the SERIAL column type. By doing defaults and using that, we are being more constent. With the Informix solution, we are losing information. It is probably a good argument _not_ to implement the informix slight-of-hand. However, I also see your huge problem because we don't document the SERIAL, and we don't allow zero to trigger a nextval(). Very tough. > > Yes, the SERIAL gets lost once it is created. This can cause confusion > > because doing a \dt on the table shows it as an INT4 with DEFAULT, and > > not a serial. This can confuse people. I remember someone saying we > > would need to keep the SERIAL understanding around so we would use it > > for pg_dump, but I don't remember why we needed to do that. > > > > This is odd actually. I can't see why you'd need to do it either, since > you must already have the information you need to recreate the thing. > > The confusion though is not that I can't work out it's a serial, but > that a program can't work out it's a serial. SERIAL was implemented as a nice workaround to prevent people from defining a sequance and defining a default nextval(). I think I may have suggested it because of my Informix background. The issue is that SERIAL is just a shortcut. It doesn't have any internal representation. It would need one only for pg_dump and for your use, and I am not sure that is warranted. Other people would have to agree that keeping the SERIAL as its own type is good. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026