Thread: TRANSACTIONS
Hi all,

Transactions should be what distinguishes a relational database from other
non-relational databases (MySQL is the obvious example). We are very proud
of PostgreSQL's transactions, but it seems they don't work the right way.
It is important to be sure that PostgreSQL is compliant with SQL92. I
absolutely need to use transactions, but so far I could not: in my case
they are completely unusable. I tried transactions in other databases and
compared them with PostgreSQL, and none of the ones I tried behaves like
PostgreSQL. I tried the following script:

-------------------------------------------------------
PostgreSQL:
-------------------------------------------------------
begin transaction;
create table tmp(a int);
insert into tmp values (1);
insert into tmp values (1000000000000000000000000000000000);
ERROR:  pg_atoi: error reading "1000000000000000000000000000000000":
        Numerical result out of range
commit;
select * from tmp;
ERROR:  tmp: Table does not exist.

-------------------------------------------------------
Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
-------------------------------------------------------
connect hygea.gdb;
create table temp(a int);
insert into temp values (1);
insert into temp values (1000000000000000000000000000000000);
commit;
select * from temp;

arithmetic exception, numeric overflow, or string truncation

          A
===========
          1

I would like to know what the Standard says and who is on the right path,
PostgreSQL or the others, considering the two examples reported above.
Comments?

-- 
Jose' Soares
Bologna, Italy
Jose@sferacarta.com
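For readers who want to poke at the all-or-nothing behavior without a
server at hand, the shape of Jose's script can be sketched with Python's
sqlite3 module. This is a stand-in only: SQLite is dynamically typed, so a
unique-key violation is used in place of the integer overflow to force the
error, and the client issues the ROLLBACK itself rather than having the
server force the abort as 2000-era PostgreSQL does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None                    # no implicit transactions; we manage them
cur = conn.cursor()

cur.execute("BEGIN")
cur.execute("CREATE TABLE tmp (a INTEGER UNIQUE)")
cur.execute("INSERT INTO tmp VALUES (1)")
try:
    cur.execute("INSERT INTO tmp VALUES (1)")  # the failing statement
except sqlite3.IntegrityError:
    cur.execute("ROLLBACK")                    # nothing commits, not even the CREATE TABLE

exists = cur.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'tmp'").fetchone()[0]
print(exists)  # 0 -- the table is gone, matching PostgreSQL's "Table does not exist"
```

The point of the sketch is only that DDL and DML inside one transaction
stand or fall together once a rollback happens.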
On 22-Feb-2000 Jose Soares wrote:
> begin transaction;
> create table tmp(a int);
> insert into tmp values (1);
> insert into tmp values (1000000000000000000000000000000000);
> ERROR:  pg_atoi: error reading "1000000000000000000000000000000000":
> Numerical result out of range
> commit;
> select * from tmp;
> ERROR:  tmp: Table does not exist.
> -------------------------------------------------------
> Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
          ^^^^^^^^^
AFAIK, MS Access has no transactions inside it, and Informix (at least the
old versions I worked with) always performs CREATE, DROP, and ALTER
statements outside of transactions, but IMHO that's not the right behavior.

I believe Postgres's behavior is more meaningful, but IMHO this example is
quite far from real life.

-- 
Dmitry Samersoff, dms@wplus.net, ICQ:3161705
http://devnull.wplus.net
* There will come soft rains ...
Jose Soares <jose@sferacarta.com> writes:
> -------------------------------------------------------
> Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
> -------------------------------------------------------
> connect hygea.gdb;
> create table temp(a int);
> insert into temp values (1);
> insert into temp values (1000000000000000000000000000000000);
> commit;
> select * from temp;

> arithmetic exception, numeric overflow, or string truncation
>           A
> ===========
>           1

> I would like to know what the Standard says and who is on the right path,
> PostgreSQL or the others, considering the two examples reported above.

I think those other guys are unquestionably failing to conform to SQL92.
6.10 general rule 3.a says

         a) If SD is exact numeric or approximate numeric, then

            Case:

            i) If there is a representation of SV in the data type TD
               that does not lose any leading significant digits after
               rounding or truncating if necessary, then TV is that rep-
               resentation. The choice of whether to round or truncate is
               implementation-defined.

           ii) Otherwise, an exception condition is raised: data exception-
               numeric value out of range.

and 3.3.4.1 says

         The phrase "an exception condition is raised:", followed by the
         name of a condition, is used in General Rules and elsewhere to
         indicate that the execution of a statement is unsuccessful, ap-
         plication of General Rules, other than those of Subclause 12.3,
         "<procedure>", and Subclause 20.1, "<direct SQL statement>", may
         be terminated, diagnostic information is to be made available,
         and execution of the statement is to have no effect on SQL-data or
         schemas. The effect on <target specification>s and SQL descriptor
         areas of an SQL-statement that terminates with an exception condi-
         tion, unless explicitly defined by this International Standard, is
         implementation-dependent.

I see no way that allowing the transaction to commit after an overflow
can be called consistent with the spec.

			regards, tom lane
At 11:32 AM 2/22/00 -0500, Tom Lane wrote:
>I see no way that allowing the transaction to commit after an overflow
>can be called consistent with the spec.

You are absolutely right. The whole point is that either a) everything
commits or b) nothing commits.

Having some kinds of exceptions allow a partial commit while other
exceptions roll back the transaction seems like a very error-prone
programming environment to me.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
At 12:47 PM 22-02-2000 +0100, Jose Soares wrote:
>begin transaction;
>create table tmp(a int);
>insert into tmp values (1);
>insert into tmp values (1000000000000000000000000000000000);
>ERROR:  pg_atoi: error reading "1000000000000000000000000000000000":
>Numerical result out of range
>commit;
>select * from tmp;
>ERROR:  tmp: Table does not exist.
>-------------------------------------------------------
>Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
>-------------------------------------------------------
>connect hygea.gdb;
>create table temp(a int);
>insert into temp values (1);
>insert into temp values (1000000000000000000000000000000000);
>commit;
>select * from temp;
>
>arithmetic exception, numeric overflow, or string truncation
>
>          A
>===========
>          1

Stuff done in a transaction cannot be committed if there is an error. So
it looks like Postgres is right and the rest are wrong ;). Also, I believe
Oracle does a commit behind your back whenever you do a CREATE TABLE or
anything like that.

However, I did have problems rolling back a CREATE TABLE in Postgres
before: after rolling back, I could not recreate a table of the same name.
I had to manually unlink the table at the filesystem level. Not sure if
that has been fixed.

On a different note, I wonder if there could be layers of transactions
(without having to create two separate connections):

    Begin transaction A
      Try to do transaction B
      Depending on whether B succeeds or fails we do the following stuff
      differently
      blahblahblah
      If blahblahblah fails then rollback the whole thingy, including
      nested transaction B (even if "committed")
    Commit transaction A

Sounds like a headache to implement tho (performance hits etc), and
probably more an academic feature than anything. So I'm just wondering
just for the sake of wondering ;). If we go that way lots of people will
have a new toy to play with (to sell as well) and things will get even
more complex.. <grin>.

Cheerio,
Link.
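The "layers of transactions" sketched above correspond to what standard
SQL calls savepoints: a named point inside an open transaction that can be
rolled back to without abandoning the outer transaction. PostgreSQL of
this era has no such feature, but the mechanics can be illustrated with
Python's sqlite3 module, which accepts the standard SAVEPOINT syntax (used
here purely as a stand-in engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None          # we issue BEGIN/COMMIT ourselves
cur = conn.cursor()

cur.execute("BEGIN")                 # outer transaction A
cur.execute("CREATE TABLE t (a INTEGER UNIQUE)")
cur.execute("INSERT INTO t VALUES (1)")

cur.execute("SAVEPOINT b")           # "nested transaction" B
try:
    cur.execute("INSERT INTO t VALUES (1)")   # fails: duplicate key
except sqlite3.IntegrityError:
    cur.execute("ROLLBACK TO b")     # undo only B; A stays open and usable

cur.execute("INSERT INTO t VALUES (2)")       # work continues inside A
cur.execute("COMMIT")                          # commits everything kept in A

print(cur.execute("SELECT a FROM t ORDER BY a").fetchall())
```

Rolling back the outer transaction would still discard B's work even if B
had been "committed" via RELEASE, which is exactly the nesting behavior
described above.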
Dmitry Samersoff wrote:

> On 22-Feb-2000 Jose Soares wrote:
> > begin transaction;
> > create table tmp(a int);
> > insert into tmp values (1);
> > insert into tmp values (1000000000000000000000000000000000);
> > ERROR:  pg_atoi: error reading "1000000000000000000000000000000000":
> > Numerical result out of range
> > commit;
> > select * from tmp;
> > ERROR:  tmp: Table does not exist.
> > -------------------------------------------------------
> > Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
>           ^^^^^^^^^
> AFAIK, MS Access has no transactions inside it,
> and Informix (at least old versions I worked with) always
> performs CREATE, DROP, ALTER statements outside transactions,
> but IMHO it's not the right behavior.

I don't know and I don't care about old software; I'm talking about
MS Access 97 and Informix 8.

-- 
Jose' Soares
Bologna, Italy
Jose@sferacarta.com
Sorry for my English, Tom, but the point is a different one: I'm talking
about transactions, not about error messages.

This is only a stupid example of how to abort a transaction; PostgreSQL
aborts transactions automatically if an error occurs, even on a warning or
a syntax error.

I can believe that all other databases are wrong and only we (PostgreSQL)
are right, but please try to understand me; this is not easy to believe
anyway. I'm looking for another database with a behavior like
PostgreSQL's, but I can't find one, and I have tried a lot of them so far.

Do you know of any database with transactions like PostgreSQL's?

Tom Lane wrote:

> Jose Soares <jose@sferacarta.com> writes:
> > -------------------------------------------------------
> > Interbase, Oracle, Informix, Solid, Ms-Access, DB2:
> > -------------------------------------------------------
> > connect hygea.gdb;
> > create table temp(a int);
> > insert into temp values (1);
> > insert into temp values (1000000000000000000000000000000000);
> > commit;
> > select * from temp;
>
> > arithmetic exception, numeric overflow, or string truncation
> >           A
> > ===========
> >           1
>
> > I would like to know what the Standard says and who is on the right
> > path, PostgreSQL or the others, considering the two examples reported
> > above.
>
> I think those other guys are unquestionably failing to conform to SQL92.
> 6.10 general rule 3.a says
>
>          a) If SD is exact numeric or approximate numeric, then
>
>             Case:
>
>             i) If there is a representation of SV in the data type TD
>                that does not lose any leading significant digits after
>                rounding or truncating if necessary, then TV is that rep-
>                resentation. The choice of whether to round or truncate is
>                implementation-defined.
>
>            ii) Otherwise, an exception condition is raised: data exception-
>                numeric value out of range.
>
> and 3.3.4.1 says
>
>          The phrase "an exception condition is raised:", followed by the
>          name of a condition, is used in General Rules and elsewhere to
>          indicate that the execution of a statement is unsuccessful, ap-
>          plication of General Rules, other than those of Subclause 12.3,
>          "<procedure>", and Subclause 20.1, "<direct SQL statement>", may
>          be terminated, diagnostic information is to be made available,
>          and execution of the statement is to have no effect on SQL-data
>          or schemas. The effect on <target specification>s and SQL
>          descriptor areas of an SQL-statement that terminates with an
>          exception condition, unless explicitly defined by this
>          International Standard, is implementation-dependent.
>
> I see no way that allowing the transaction to commit after an overflow
> can be called consistent with the spec.
>
>                         regards, tom lane

-- 
Jose' Soares
Bologna, Italy
Jose@sferacarta.com
Don Baccus wrote:

> At 11:32 AM 2/22/00 -0500, Tom Lane wrote:
>
> >I see no way that allowing the transaction to commit after an overflow
> >can be called consistent with the spec.
>
> You are absolutely right. The whole point is that either a) everything
> commits or b) nothing commits.
>
> Having some kinds of exceptions allow a partial commit while other
> exceptions rollback the transaction seems like a very error-prone
> programming environment to me.

It is hard to believe the whole world is wrong and only we are right,
isn't it? ;)

> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.

-- 
Jose' Soares
Bologna, Italy
Jose@sferacarta.com
>Sorry for my english, Tom, but the point is another, I'm talking
>about transactions not about error messages.
>
>This is only a stupid example how to abort a transaction, PostgreSQL
>aborts automatically transactions if an error occurs, even an warning
>or a syntax error.
>
>I can believe that all other databases are wrong and only we
>(PostgreSQL) are right, but please try to understand me. This is not
>easy to believe anyway.
>
>I'm looking for another database with a behavior like PostgreSQL but
>I can't find it, and I tried a lot of them until now.
>
>Do you know some database with transactions like PostgreSQL?

I personally don't feel qualified to interpret the standard, but I would
like to pipe in a little on the issue of what is desirable.

By default, as a developer, I would be quite unhappy with the behavior of
those other databases (allowing a commit after an insert has failed). If
I do a bulk copy into an existing database and one copy fails, that sort
of behavior could conceivably render my database unusable with no
possibility of recovery. So in that sense, from the point of view of
desirability, I think Postgres got it right.

But then I thought about it from a programming language point of view.
Consider the following code (I use Perl/DBI as an example; note that
$dbh->do returns undef on failure, so the UPDATE runs only when the
INSERT has failed).

========================= example =========================

$dbh->{AutoCommit} = 0;
$dbh->do("CREATE TABLE tmp (a int unique, b int)");
while (<>) {
    if (/([0-9]+) ([0-9]+)/) {
        $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
        if (!$rtv) { $dbh->do("UPDATE tmp SET b=$2 WHERE a=$1") };
    }
}
$dbh->commit;
$dbh->disconnect;

========================= end ============================

This incorporates a very common idiom within a transaction block. Of
course, this fails. As far as I can tell from the preceding discussion,
there is no way to "sanitize" the transaction once you have fixed the
error. IMHO, it would be EXTREMELY useful to be able to implement the
above transaction - but not by default.

I'm not sure what a reasonable syntax would be - several come to mind.
You could have "SANITIZE TRANSACTION" or "\unset warning", whatever; the
exact syntax matters little to me. But without this sort of capability,
people who do programmatic error checking and correction (which seems
like a good thing) are essentially penalized because they cannot
effectively use transactions.

Apologies if it is already possible to do this.

-- 
Karl DeBisschop <kdebisschop@alert.infoplease.com>
617.832.0332 (Fax: 617.956.2696)

Information Please - your source for FREE online reference
http://www.infoplease.com  - Your Ultimate Fact Finder
http://kids.infoplease.com - The Great Homework Helper

Netsaint Plugins Development
http://netsaintplug.sourceforge.net
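The recovery being asked for here — trap the statement error, fix it, and
still commit the surrounding transaction — is how engines that do not
poison the transaction already behave. A sketch in Python's sqlite3 (a
stand-in engine where a failed INSERT does not abort the open
transaction); the table and column names are carried over from the
Perl/DBI example, everything else is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None                    # explicit transaction control
cur = conn.cursor()

cur.execute("BEGIN")
cur.execute("CREATE TABLE tmp (a INTEGER UNIQUE, b INTEGER)")
for a, b in [(1, 10), (2, 20), (1, 30)]:       # key 1 appears twice
    try:
        cur.execute("INSERT INTO tmp VALUES (?, ?)", (a, b))
    except sqlite3.IntegrityError:             # duplicate key: fall back to UPDATE
        cur.execute("UPDATE tmp SET b = ? WHERE a = ?", (b, a))
cur.execute("COMMIT")                          # commits; no error "sticks"

print(cur.execute("SELECT a, b FROM tmp ORDER BY a").fetchall())
```

On 2000-era PostgreSQL the second INSERT of key 1 would instead put the
whole transaction into the aborted state, and both the fallback UPDATE and
the final COMMIT would be refused.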
Karl DeBisschop wrote:

> >Sorry for my english, Tom, but the point is another, I'm talking
> >about transactions not about error messages.
> >
> >This is only a stupid example how to abort a transaction, PostgreSQL
> >aborts automatically transactions if an error occurs, even an warning
> >or a syntax error.
> >
> >I can believe that all other databases are wrong and only we
> >(PostgreSQL) are right, but please try to understand me. This is not
> >easy to believe anyway.
> >
> >I'm looking for another database with a behavior like PostgreSQL but
> >I can't find it, and I tried a lot of them until now.
> >
> >Do you know some database with transactions like PostgreSQL?
>
> I personally don't feel qualified to interpret the standard. But I
> would like to pipe in a little on the issue of what is desirable.
>
> By default, as a developer, I would be quite unhappy with the behavior
> of those other databases (allowing a commit after an insert has
> failed). If I do a bulk copy into an existing database, and one copy
> fails, that sort of behavior could conceivably render my database
> unusable with no possibility of recovery. So in that sense, from the
> point of view of desirability I think postgres got it right.
>
> But then I thought about it from a programming language point of
> view. Consider the following code (I use perl/DBI as an example).
>
> ========================= example =========================
>
> $dbh->{AutoCommit} = 0;
> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> while (<>){
>   if (/([0-9]+) ([0-9]+)/) {
>     $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
>     if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
>   }
> }
> $dbh->commit;
> $dbh->disconnect;
>
> ========================= end ============================
>
> This incorporates a very common idiom within a transaction block. Of
> course, this fails. As far as I can tell from the preceding
> discussion, there is no way to "sanitize" the transaction once you
> have fixed the error. IMHO, it would be EXTREMELY useful to be able to
> implement the above transaction. But not by default.
>
> I'm not sure what a reasonable syntax would be - several come to mind.
> You could have "SANITIZE TRANSACTION" or "\unset warning", whatever,
> the exact syntax matters little to me. But without this sort of
> capability, people who do programmatic error checking and correction
> (which seems like a good thing) are essentially penalized because they
> cannot effectively use transactions.

To continue with your example, this should work:

> $dbh->{AutoCommit} = 0;
> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> while (<>){
>   if (/([0-9]+) ([0-9]+)/) {
>     eval{$rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)")};
>     if ($@) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
>   }
> }
> $dbh->commit;
> $dbh->disconnect;

Sadly, it does not, as far as I can tell. In fact, it seems to corrupt
the database to where you can't create the table tmp anymore, on my
system. I certainly never get a table.

What's the rationale behind having the database blow out eval's error
trapping? I can't see how letting a program recover from an error in a
statement compromises atomicity.

> Apologies if it is already possible to do this.

Likewise.
To summarize, I stated that the following does not work with postgresql:

> $dbh->{AutoCommit} = 0;
> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> while (<>){
>   if (/([0-9]+) ([0-9]+)/) {
>     $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
>     if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
>   }
> }
> $dbh->commit;
> $dbh->disconnect;

I further said that regardless of what the SQL standard gurus decide, I
felt that postgresql currently gives desirable behavior - once a
transaction is started, it's either all or nothing. But then I qualified
that by saying I'd like somehow to be able to "sanitize" the transaction
so that the common idiom above could be made to work.

From my examination, the difference between our two examples is

Original:
KD> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");

Modified:
KM> eval{$rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)")};

From the point of view of the DBMS, I believe these are identical - in
both cases the query is issued to the DBMS and the overall transaction
becomes "contaminated". And as I said before, this is exactly what I'd
like to have happen in the default case.

It's not that eval's error trapping is blown out - it's that the
transaction defined by the AutoCommit cannot complete because a part of
it cannot complete -- that's what atomicity means.

At least that's the way it looks to me. But as I started out saying, I
don't feel qualified to interpret the standard - I might be wrong, plain
and simple.

-- 
Karl DeBisschop <kdebisschop@alert.infoplease.com>
617.832.0332 (Fax: 617.956.2696)

Information Please - your source for FREE online reference
http://www.infoplease.com  - Your Ultimate Fact Finder
http://kids.infoplease.com - The Great Homework Helper

Netsaint Plugins Development
http://netsaintplug.sourceforge.net
At 02:16 PM 24-02-2000 -0500, Karl DeBisschop wrote:

>To summarize, I stated that the following does not work with
>postgresql:
>
>> $dbh->{AutoCommit} = 0;
>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
>> if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
>> $dbh->commit;
>> $dbh->disconnect;
>
>It's not that eval's error trapping is blown out - it's that the
>transaction defined by the AutoCommit cannot complete because a part
>of it cannot complete -- that's what atomicity means.

Maybe I don't understand the situation, but it doesn't seem to be a big
problem.

With postgres you have to ensure that your application filters the data
properly before sticking it into the database. Then if the insert fails,
it's probably a serious database problem, and in that case it's best that
the whole transaction is aborted anyway.

It is indeed a problem if the database engine is expected to parse the
data. For example, if you send in a date value and the database engine
chokes on it, with the non-PostgreSQL behaviour you can still insert a
NULL instead for "bad date / unknown date".

But from the security point of view it is best to reduce the amount of
parsing done by the database engine. Make sure the app sanitises and
massages everything so that the database has no problems with the data.
It can be a pain sometimes to figure out what the database can take
(which is why I've been asking for the limits for Postgresql fields and
such - so the app can keep everything within bounds or grumble to the
user/vandal).

Once everything is set up nicely, if the database grumbles then the app
screwed up somehow (the vandal got through) and it's best to roll back
everything (we're lucky if the database just grumbled).

Cheerio,
Link.
>>To summarize, I stated that the following does not work with
>>postgresql:
>>
>>> $dbh->{AutoCommit} = 0;
>>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
>>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
>>> if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
>>> $dbh->commit;
>>> $dbh->disconnect;
>>
>>It's not that eval's error trapping is blown out - it's that the
>>transaction defined by the AutoCommit cannot complete because a part
>>of it cannot complete -- that's what atomicity means.
>
>Maybe I don't understand the situation. But it doesn't seem to be a big
>problem.
>
>With postgres you have to ensure that your application filters the data
>properly before sticking it into the database. Then if the insert fails,
>it's probably a serious database problem and in that case it's best that
>the whole transaction is aborted anyway.

The reason this idiom is used has nothing to do with validation. I agree
that the application has the responsibility to check for valid data.

The usefulness of the idiom is that in a multi-user environment, this is
a basic way to update data that may or may not already have a key in the
table. You can't do a "SELECT COUNT" first, because in the time between
when you SELECT and INSERT (assuming the key is not already there)
someone may have done a separate insert. The only other way I know to do
this is to lock the entire table against INSERTs, which has obvious
performance effects.

-- 
Karl DeBisschop <kdebisschop@alert.infoplease.com>
617.832.0332 (Fax: 617.956.2696)

Information Please - your source for FREE online reference
http://www.infoplease.com  - Your Ultimate Fact Finder
http://kids.infoplease.com - The Great Homework Helper

Netsaint Plugins Development
http://netsaintplug.sourceforge.net
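The race being described can be shown concretely with two connections to
one database. Python's sqlite3 stands in for the multi-user server here
(both connections in autocommit mode; file-backed so they see each other's
writes): client A checks for the key, client B sneaks in an insert, and
A's INSERT fails even though A's SELECT said the key was free.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path, isolation_level=None)   # client A, autocommit
b = sqlite3.connect(path, isolation_level=None)   # client B, autocommit
a.execute("CREATE TABLE tmp (a INTEGER PRIMARY KEY, b INTEGER)")

# Client A checks for the key and sees nothing...
count = a.execute("SELECT COUNT(*) FROM tmp WHERE a = 1").fetchone()[0]
assert count == 0

# ...but client B inserts the same key in the window between A's SELECT
# and A's INSERT.
b.execute("INSERT INTO tmp VALUES (1, 99)")

# A's INSERT, "justified" by its earlier SELECT, fails anyway.
try:
    a.execute("INSERT INTO tmp VALUES (1, 10)")
    outcome = "insert succeeded"
except sqlite3.IntegrityError:
    outcome = "unique-key violation"
print(outcome)
```

Trapping the unique-key error and falling back to an UPDATE closes this
window without locking the whole table against inserts.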
On Fri, 25 Feb 2000, Karl DeBisschop wrote:

> >>To summarize, I stated that the following does not work with
> >>postgresql:
> >>
> >>> $dbh->{AutoCommit} = 0;
> >>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> >>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> >>> if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> >>> $dbh->commit;
> >>> $dbh->disconnect;
>
> The usefulness of the idiom is that in a multi-user environment, this
> is a basic way to update data that may or may not already have a key
> in the table. You can't do a "SELECT COUNT" because in the time
> between when you SELECT and INSERT (assuming the key is not already
> there) someone may have done a separate insert. The only other way I
> know to do this is to lock the entire table against INSERTs which has
> obvious performance effects.

Sounds right, but ;-) why do you use a transaction in the first place?
Karl DeBisschop wrote:

> To summarize, I stated that the following does not work with
> postgresql:
>
> > $dbh->{AutoCommit} = 0;
> > $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> > while (<>){
> >   if (/([0-9]+) ([0-9]+)/) {
> >     $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> >     if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> >   }
> > }
> > $dbh->commit;
> > $dbh->disconnect;
>
> I further said that regardless of what the SQL standard gurus decide,
> I felt that postgresql currently gives desirable behavior - once a
> transaction is started, it's either all or nothing. But then I
> qualified that by saying I'd like somehow to be able to "sanitize" the
> transaction so that the common idiom above could be made to work.
>
> From my examination, the difference between our two examples is
>
> Original:
> KD> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
>
> Modified:
> KM> eval{$rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)")};
>
> From the point of view of the DBMS, I believe these are identical - in
> both cases the query is issued to the DBMS and the overall transaction
> becomes "contaminated". And as I said before, this is exactly what
> I'd like to have happen in the default case.
>
> It's not that eval's error trapping is blown out - it's that the
> transaction defined by the AutoCommit cannot complete because a part
> of it cannot complete -- that's what atomicity means.

I don't have the SQL92 standard with me, so I can't speak to how it
defines atomicity. It seems to me it's a means to an end, though, the end
being that all of the statements in the sequence are performed, or none.
But if the program traps an error and then does something to recover, you
could argue that it's changed the sequence. As long as the program has to
explicitly Commit, why not?

It seems desirable to me that if one statement causes an error, it
doesn't affect the database, and the error is returned to the client. If
the client has RaiseError on, which he should, and doesn't do anything to
explicitly trap, it's going to blow out the program and thus the
transaction should be rolled back, which is a good thing. But if he does
explicitly trap, as I do above, why not let him stay within the
transaction, since the statement in error has not done anything?

I agree that to get Postgresql to do this might be a lot to expect
(nested transactions are required, I guess). I'm just not sure that it's
a *wrong*, or non-conformant, thing to expect. (By the way, I know
VB/Access does it this way. My production code, however, never takes
advantage of this, to my knowledge.)

Addressing Lincoln Yeoh's point in another post, to take the approach
that all your data should conform to all database requirements before you
enter a transaction seems to me to lead to redundancy: the program code
checks and the database checks. Should you have to synchronize all
relevant code every time a field requirement is changed?

I agree that to simply continue without error and let the program blindly
commit, which some folks claim other databases do, is wrong and screws
atomicity.

What is also wrong is to allow you to do a Commit when the database is in
an error state, so that you have (in this case) a table in limbo that
can't be created or seen - behavior that Jose Soares and I both saw with
Postgresql (6.5.1 in my case). Why shouldn't Postgresql just implicitly
Rollback at this point, since you can't do anything (constructive) to the
database within the transaction anyway?
> From: <kaiq@realtyideas.com>
>
> On Fri, 25 Feb 2000, Karl DeBisschop wrote:
>
> > >>To summarize, I stated that the following does not work with
> > >>postgresql:
> > >>
> > >>> $dbh->{AutoCommit} = 0;
> > >>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> > >>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> > >>> if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> > >>> $dbh->commit;
> > >>> $dbh->disconnect;
> >
> > The usefulness of the idiom is that in a multi-user environment, this
> > is a basic way to update data that may or may not already have a key
> > in the table. You can't do a "SELECT COUNT" because in the time
> > between when you SELECT and INSERT (assuming the key is not already
> > there) someone may have done a separate insert. The only other way I
> > know to do this is to lock the entire table against INSERTs which has
> > obvious performance effects.
>
> sounds right, but ;-) why do you use the transaction in the first place?

Remember that this is just an example to illustrate what sort of
behaviour one user would find useful in transactions, so it is a little
simplistic. Not overly simplistic, though, I think.

I'd want a transaction because I'm doing a bulk insert into this live
database - say, syncing in a bunch of data from a slave server while the
master is still running. If one (or more) inserts fail, I want to revert
back to the starting point so I can fix the cause of the failed insert
and try again with the database in a known state. (There may, for
instance, be relationships between the b fields such that if only part of
the bulk insert succeeds, the database is rendered corrupt.)

-- 
Karl DeBisschop <kdebisschop@alert.infoplease.com>
617.832.0332 (Fax: 617.956.2696)

Information Please - your source for FREE online reference
http://www.infoplease.com  - Your Ultimate Fact Finder
http://kids.infoplease.com - The Great Homework Helper

Netsaint Plugins Development
http://netsaintplug.sourceforge.net
> From: "Keith G. Murphy" <keithmur@mindspring.com>
>
> Karl DeBisschop wrote:
> >
> > To summarize, I stated that the following does not work with
> > postgresql:
> >
> > > $dbh->{AutoCommit} = 0;
> > > $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> > > while (<>){
> > >   if (/([0-9]+) ([0-9]+)/) {
> > >     $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> > >     if (!$rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> > >   }
> > > }
> > > $dbh->commit;
> > > $dbh->disconnect;
> >
> > I further said that regardless of what the SQL standard gurus decide,
> > I felt that postgresql currently gives desirable behavior - once a
> > transaction is started, it's either all or nothing. But then I
> > qualified that by saying I'd like somehow to be able to "sanitize" the
> > transaction so that the common idiom above could be made to work.
> >
> > From my examination, the difference between our two examples is
> >
> > Original:
> > KD> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> >
> > Modified:
> > KM> eval{$rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)")};
> >
> > From the point of view of the DBMS, I believe these are identical - in
> > both cases the query is issued to the DBMS and the overall transaction
> > becomes "contaminated". And as I said before, this is exactly what
> > I'd like to have happen in the default case.
> >
> > It's not that eval's error trapping is blown out - it's that the
> > transaction defined by the AutoCommit cannot complete because a part
> > of it cannot complete -- that's what atomicity means.
>
> I don't have the SQL92 standard with me, so I can't speak to how it
> defines atomicity. Seems to me it's a means to an end, though, the end
> being that all of the statements in the sequence are performed, or
> none. But if the program traps an error, then does something to
> recover, you could argue that it's changed the sequence.

I agree.

> As long as the program has to explicitly Commit, why not? It seems
> desirable to me that if one statement causes an error, it doesn't
> affect the database, and the error is returned to the client. If the
> client has RaiseError on, which he should, and doesn't do anything to
> explicitly trap, it's going to blow out the program and thus the
> transaction should be rolled back, which is a good thing. But if he
> does explicitly trap, as I do above, why not let him stay within the
> transaction, since the statement in error has not done anything?

It is not sufficient that the statement in error has done nothing - the
postmaster in the general case cannot know what relationships should
exist between the non-key data. It is quite possible that not having a
record inserted could make the database fundamentally unusable.

Of course, in my original example and in yours, the error is trapped and
the situation is (hopefully) fixed by the subsequent update. Thus, in my
post I suggested that postgres could provide some sort of mechanism to
explicitly 'sanitize' the transaction and allow it to commit. In other
words, I think we are basically proposing the same thing.

> I agree that to get Postgresql to do this might be a lot to expect
> (nested transactions are required, I guess). I'm just not sure that
> it's a *wrong*, or non-conformant, thing to expect.
>
> (By the way, I know VB/Access does it this way. My production code,
> however, never takes advantage of this, to my knowledge.)

From what I gather, extending postgresql this way is planned anyway - it
may not happen tomorrow, but nothing in here seems like a very new
concept to the development team.

> Addressing Lincoln Yeoh's point in another post, to take the approach
> that all your data should conform to all database requirements before
> you enter a transaction seems to me to lead to redundancy: the program
> code checks and the database checks. Should you have to synchronize all
> relevant code every time a field requirement is changed?
> I agree that to simply continue without error and let the program
> blindly commit, which some folks claim other databases do, is wrong
> and screws atomicity.
>
> What is also wrong is to allow you to do a Commit when the database is
> in an error state, so that you have (in this case) a table in limbo
> that can't be created or seen, behavior that Jose Soares and I both
> saw with Postgresql (6.5.1 in my case).  Why shouldn't Postgresql just
> implicitly Rollback at this point, since you can't do anything
> (constructive) to the database within the transaction anyway?

Yes, the table in limbo is certainly a problem/bug.  But even this bug,
in my estimation, is better than allowing a transaction with an error in
it to commit without explicitly clearing the error status.  The bug is a
pain in the neck, but it apparently has been fixed in 6.5.3 -- so why
not upgrade?  No dumps are required.  And even with the bug, it can save
you from unknowingly foisting inaccurate data on your customers, which
is still a good thing.

As for whether postgres should implicitly roll back, I don't think it
should - remember that the frontend, which is very likely operating in
robot mode, is still firing queries at the database.  An implicit
rollback means starting a new transaction, and that could lead to a data
integrity problem as well.

--
Karl DeBisschop <kdebisschop@alert.infoplease.com>
617.832.0332 (Fax: 617.956.2696)

Information Please - your source for FREE online reference
http://www.infoplease.com  - Your Ultimate Fact Finder
http://kids.infoplease.com - The Great Homework Helper

Netsaint Plugins Development
http://netsaintplug.sourceforge.net
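The "contaminated transaction" semantics being debated above can be made concrete without a database at all.  The toy class below is purely illustrative (the class name, `fails` flag, and error messages are invented for this sketch): once any statement in the open transaction errors, the transaction is marked aborted, and a later commit behaves as a rollback regardless of whether the client trapped the error - which is the Postgres 6.5 behavior Karl and Keith are describing, minus the table-in-limbo bug.

```python
class ToyTxnConnection:
    """Toy model of the behavior discussed above: after any statement
    error, the transaction is 'contaminated' and can only roll back,
    no matter what the client traps."""

    def __init__(self):
        self.committed = []   # statements made durable
        self.pending = []     # statements in the open transaction
        self.aborted = False  # set when any statement errors

    def do(self, stmt, fails=False):
        if self.aborted:
            raise RuntimeError("current transaction is aborted")
        if fails:
            self.aborted = True
            raise RuntimeError("statement error")
        self.pending.append(stmt)

    def commit(self):
        # committing a contaminated transaction acts as a rollback
        if self.aborted:
            self.pending, self.aborted = [], False
            return False
        self.committed += self.pending
        self.pending = []
        return True

conn = ToyTxnConnection()
conn.do("CREATE TABLE tmp (a int)")
try:
    # trapped client-side, like the eval{} in the examples above
    conn.do("INSERT INTO tmp VALUES (1e33)", fails=True)
except RuntimeError:
    pass
ok = conn.commit()
print(ok)              # False: trapping the error did not 'sanitize' it
print(conn.committed)  # []: even the CREATE TABLE was discarded
```

This is exactly Karl's point that the eval-wrapped and unwrapped versions look identical from the server's side: the trap happens in the client, but the server-side state is already poisoned.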
On Fri, 25 Feb 2000, Karl DeBisschop wrote:

> > From: <kaiq@realtyideas.com>
> > On Fri, 25 Feb 2000, Karl DeBisschop wrote:
> > >
> > > >>To summarize, I stated that the following does not work with
> > > >>postgresql:
> > > >>
> > > >>> $dbh->{AutoCommit} = 0;
> > > >>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> > > >>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> > > >>> if ($rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> > > >>> $dbh->commit;
> > > >>> $dbh->disconnect;
> > > >>
> > >
> > > The usefulness of the idiom is that in a multi-user environment, this
> > > is a basic way to update data that may or may not already have a key
> > > in the table.  You can't do a "SELECT COUNT" because in the time
> > > between when you SELECT and INSERT (assuming the key is not already
> > > there) someone may have done a separate insert.  The only other way I
> > > know to do this is to lock the entire table against INSERTs, which
> > > has obvious performance effects.
>
> > sounds right, but ;-) why you use the transaction in the first place?
>
> Remember that this is just an example to illustrate what sort of
> behaviour one user would find useful in transactions, so it is a
> little simplistic.  Not overly simplistic, though, I think.
>
> I'd want a transaction because I'm doing a bulk insert into this live
> database - say syncing in a bunch of data from a slave server while
> the master is still running.  If one (or more) insert(s) fail, I want
> to revert back to the starting point so I can fix the cause of the
> failed insert and try again with the database in a known state.
> (there may, for instance, be relationships between the b fields such
> that if only part of the bulk insert succeeds, the database is
> rendered corrupt).
>

thanks. I'm on your side now ;-) -- it is a useful scenario.

The questions are: 1) can nested transactions be typically interpreted
to handle this situation?
If it is, then it should be handled by that "advanced feature", not by a
plain transaction; 2) on the other hand, can SQL92's (plain) transaction
be interpreted in such a way that the above behavior is legitimate?
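The INSERT-else-UPDATE idiom quoted above, and the behavior attributed to the other databases (a trapped statement error does not poison the surrounding transaction), can be sketched with Python's standard-library sqlite3 module - used here purely as a stand-in for the Perl/DBI examples, since SQLite happens to let a transaction continue after a failed statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tmp (a INTEGER UNIQUE, b INTEGER)")
conn.commit()

def upsert(a, b):
    # try the INSERT first; if the key already exists, fall back to
    # UPDATE - the trap replaces the Perl eval{} in the examples above
    try:
        cur.execute("INSERT INTO tmp VALUES (?, ?)", (a, b))
    except sqlite3.IntegrityError:
        cur.execute("UPDATE tmp SET b = ? WHERE a = ?", (b, a))

upsert(1, 10)
upsert(1, 20)   # duplicate key: trapped and turned into an UPDATE
conn.commit()   # the transaction still commits after the trapped error
print(cur.execute("SELECT a, b FROM tmp").fetchall())  # [(1, 20)]
```

Note this only shows the idiom's intent; under Postgres 6.5 the second `upsert` would contaminate the whole transaction, which is precisely the point of contention in the thread.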
kaiq@realtyideas.com wrote:
>
> On Fri, 25 Feb 2000, Karl DeBisschop wrote:
>
> > > From: <kaiq@realtyideas.com>
> > > On Fri, 25 Feb 2000, Karl DeBisschop wrote:
> > > >
> > > > >>To summarize, I stated that the following does not work with
> > > > >>postgresql:
> > > > >>
> > > > >>> $dbh->{AutoCommit} = 0;
> > > > >>> $dbh->do("CREATE TABLE tmp (a int unique,b int)");
> > > > >>> $rtv = $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
> > > > >>> if ($rtv) {$dbh->do("UPDATE tmp SET b=$2 where a=$1")};
> > > > >>> $dbh->commit;
> > > > >>> $dbh->disconnect;
> > > > >>
> > > >
> > > > The usefulness of the idiom is that in a multi-user environment,
> > > > this is a basic way to update data that may or may not already
> > > > have a key in the table.  You can't do a "SELECT COUNT" because in
> > > > the time between when you SELECT and INSERT (assuming the key is
> > > > not already there) someone may have done a separate insert.  The
> > > > only other way I know to do this is to lock the entire table
> > > > against INSERTs, which has obvious performance effects.
> >
> > > sounds right, but ;-) why you use the transaction in the first place?
> >
> > Remember that this is just an example to illustrate what sort of
> > behaviour one user would find useful in transactions, so it is a
> > little simplistic.  Not overly simplistic, though, I think.
> >
> > I'd want a transaction because I'm doing a bulk insert into this live
> > database - say syncing in a bunch of data from a slave server while
> > the master is still running.  If one (or more) insert(s) fail, I want
> > to revert back to the starting point so I can fix the cause of the
> > failed insert and try again with the database in a known state.
> > (there may, for instance, be relationships between the b fields such
> > that if only part of the bulk insert succeeds, the database is
> > rendered corrupt).
> >
> thanks. I'm on your side now ;-) -- it is a useful scenario.
> The questions are: 1) can nested transactions be typically interpreted
> to handle this situation?  If it is, then it should be handled by that
> "advanced feature", not a plain transaction;

I guess like this (got rid of AutoCommit, because that looks funny
nested):

$dbh->{RaiseError} = 1;
$dbh->StartTransaction;
eval {
    $dbh->do("CREATE TABLE tmp (a int unique,b int)");
    while (blahblahblah) {
        $dbh->StartTransaction;
        eval {
            $dbh->do("INSERT INTO tmp VALUES ($1,$2)");
        };
        if ($@) {
            $dbh->Rollback;
            $dbh->do("UPDATE tmp SET b=$2 where a=$1");
        } else {
            $dbh->Commit;
        }
    }
};
if ($@) {
    $dbh->rollback;
} else {
    $dbh->commit;
}
$dbh->disconnect;

I.e., try the INSERT within the inner transaction; if it fails, roll it
back and do the UPDATE; if that fails, blow out the whole outer
transaction.  You could do the whole thing checking a return value as in
the original example, but the eval and RaiseError are canonical,
according to the docs.

> 2) on the other hand, can SQL92's (plain) transaction be interpreted
> in a way that makes the above behavior legitimate?

Well, I'm not sure of the necessity of nested transactions in the case
of continuing a transaction after a single-row insert has failed, but
that's implementation details I'm not familiar with... i.e., I'm not
having to code the danged thing!
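The nested-transaction sketch above maps naturally onto SQL savepoints, which later versions of PostgreSQL (8.0 onward) did in fact add as the "sanitize" mechanism this thread is asking for.  Here is a runnable version of the same pattern using Python's standard-library sqlite3, which also supports SAVEPOINT - the table and savepoint names are just illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # take manual control of transactions
cur = conn.cursor()

cur.execute("BEGIN")
cur.execute("CREATE TABLE tmp (a INTEGER UNIQUE, b INTEGER)")

rows = [(1, 10), (1, 20)]  # second row collides on the unique key a=1
for a, b in rows:
    cur.execute("SAVEPOINT sp")          # the "inner transaction"
    try:
        cur.execute("INSERT INTO tmp VALUES (?, ?)", (a, b))
    except sqlite3.IntegrityError:
        cur.execute("ROLLBACK TO sp")    # undo only the failed INSERT
        cur.execute("UPDATE tmp SET b = ? WHERE a = ?", (b, a))
    cur.execute("RELEASE sp")            # the "inner commit"
cur.execute("COMMIT")

print(cur.execute("SELECT a, b FROM tmp").fetchall())  # [(1, 20)]
```

The savepoint plays the role of `$dbh->StartTransaction` in the Perl sketch: the failed INSERT is rolled back in isolation, the outer transaction stays clean and committable, and an error in the fallback UPDATE would still abort the whole thing.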