Thread: BUG #5989: Assertion failure on UPDATE of big value
The following bug has been logged online:

Bug reference:      5989
Logged by:          Marko Tiikkaja
Email address:      marko.tiikkaja@2ndquadrant.com
PostgreSQL version: git master
Operating system:   Linux
Description:        Assertion failure on UPDATE of big value
Details:

Test case:

=# create table foo(a int[]);
CREATE TABLE
=# insert into foo select array(select i from generate_series(1,10000) i);
INSERT 0 1
=# update foo set a = a||1;
TRAP: FailedAssertion("!(((bool) (((void*)(&(newTuple->t_self)) != ((void
*)0)) && ((&(newTuple->t_self))->ip_posid != 0))))", File: "predicate.c",
Line: 2282)

The assertion only seems to trigger if the array is big enough to be
TOASTed, but I didn't debug further to see if that's really the case.
"Marko Tiikkaja" <marko.tiikkaja@2ndquadrant.com> writes:
> =# create table foo(a int[]);
> CREATE TABLE
> =# insert into foo select array(select i from generate_series(1,10000) i);
> INSERT 0 1
> =# update foo set a = a||1;
> TRAP: FailedAssertion("!(((bool) (((void*)(&(newTuple->t_self)) != ((void
> *)0)) && ((&(newTuple->t_self))->ip_posid != 0))))", File: "predicate.c",
> Line: 2282)

Reproduced here.

Why is predicate.c getting called at all when transaction_isolation is
not SERIALIZABLE?  (Although the same crash happens when it is ...)

			regards, tom lane
On 20.04.2011 17:26, Tom Lane wrote:
> "Marko Tiikkaja"<marko.tiikkaja@2ndquadrant.com> writes:
>> =# create table foo(a int[]);
>> CREATE TABLE
>> =# insert into foo select array(select i from generate_series(1,10000) i);
>> INSERT 0 1
>> =# update foo set a = a||1;
>> TRAP: FailedAssertion("!(((bool) (((void*)(&(newTuple->t_self)) != ((void
>> *)0)) && ((&(newTuple->t_self))->ip_posid != 0))))", File: "predicate.c",
>> Line: 2282)
>
> Reproduced here.
>
> Why is predicate.c getting called at all when transaction_isolation is
> not SERIALIZABLE?  (Although the same crash happens when it is ...)

Because another serializable transaction that reads/updates the tuple
later needs to find the predicate lock acquired by the first
transaction, even if the transaction in the middle isn't serializable.

It's unfortunate because it imposes a performance penalty on updates
even when serializable transactions are not used. I don't think we ever
measured the impact of this. :-(

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 20.04.2011 17:26, Tom Lane wrote:
>> Why is predicate.c getting called at all when transaction_isolation is
>> not SERIALIZABLE?  (Although the same crash happens when it is ...)

> Because another serializable transaction that reads/updates the tuple
> later needs to find the predicate lock acquired by the first
> transaction, even if the transaction in the middle isn't serializable.

Sorry, that argument doesn't pass the sniff test.  If the transaction in
the middle isn't serializable, then it is not the same as the "first
transaction", which means that the first transaction is completed and
can no longer be holding any locks.

> It's unfortunate because it imposes a performance penalty to updates
> even if serializable transactions are not used. I don't think we ever
> measured the impact of this. :-(

This feature was sold to us on the grounds that it wouldn't impose any
performance penalty when not in use.  Not having bothered to measure the
performance penalty isn't passing the sniff test either.

			regards, tom lane
On 20.04.2011 17:51, Tom Lane wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:
>> On 20.04.2011 17:26, Tom Lane wrote:
>>> Why is predicate.c getting called at all when transaction_isolation is
>>> not SERIALIZABLE?  (Although the same crash happens when it is ...)
>
>> Because another serializable transaction that reads/updates the tuple
>> later needs to find the predicate lock acquired by the first
>> transaction, even if the transaction in the middle isn't serializable.
>
> Sorry, that argument doesn't pass the sniff test.  If the transaction in
> the middle isn't serializable, then it is not the same as the "first
> transaction", which means that the first transaction is completed and
> can no longer be holding any locks.

There's *three* transactions here. The first one is serializable, and
reads the tuple. The second one is not serializable, and updates it. The
third one is serializable and updates it again.

The second transaction needs to copy the predicate lock held by the
first transaction to the new row version, so that the third transaction
that updates it again sees the lock.

Or, we document that any non-serializable updates will break the
serialization protection for any other transactions accessing the same
rows.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
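[Editor's note: the three-transaction scenario above can be sketched as a toy model. This is illustrative Python, not PostgreSQL code; the class and method names are invented, with `link_row_version` standing in for what `PredicateLockTupleRowVersionLink()` does.]

```python
# Toy model of SIREAD-lock linking across row versions.  Predicate
# locks are keyed by tuple version; an UPDATE creates a new version,
# so locks on the old version must be carried forward or later
# updaters will not see them.

class PredicateLockTable:
    def __init__(self):
        self.locks = {}  # tuple version id -> set of reader xact ids

    def acquire(self, tid, xact):
        self.locks.setdefault(tid, set()).add(xact)

    def link_row_version(self, old_tid, new_tid):
        # Models PredicateLockTupleRowVersionLink(): copy the SIREAD
        # locks on the old row version over to the new one.
        if old_tid in self.locks:
            self.locks.setdefault(new_tid, set()).update(self.locks[old_tid])

    def conflicts(self, tid, writer):
        # Readers of this version other than the writer itself.
        return {x for x in self.locks.get(tid, set()) if x != writer}

table = PredicateLockTable()

# T1 (serializable) reads the tuple at version v1.
table.acquire("v1", "T1")

# T2 (NOT serializable) updates it, producing version v2.  Even though
# T2 takes no SIREAD locks of its own, it must still link v1's locks
# to v2 -- this is why the call happens at all isolation levels.
table.link_row_version("v1", "v2")

# T3 (serializable) updates v2 and must detect the rw-conflict with T1.
assert table.conflicts("v2", "T3") == {"T1"}

# Without the link step, T3 would see no conflict and the anomaly
# involving T1 could go undetected.
bare = PredicateLockTable()
bare.acquire("v1", "T1")
assert bare.conflicts("v2", "T3") == set()
```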
On 20.04.2011 17:51, Tom Lane wrote:
> ... which means that the first transaction is completed and
> can no longer be holding any locks.

Predicate locks are held after transaction commit. They can only be
cleaned up once the xid is older than the oldest xmin among all active
serializable transactions.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
"Marko Tiikkaja" <marko.tiikkaja@2ndquadrant.com> wrote:
> TRAP: FailedAssertion("!(((bool) (((void*)(&(newTuple->t_self)) !=
> ((void *)0)) && ((&(newTuple->t_self))->ip_posid != 0))))", File:
> "predicate.c", Line: 2282)

Am investigating, and have alerted Dan, in case he hasn't noticed
this thread.

-Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> There's *three* transactions here. The first one is serializable, and
> reads the tuple. The second one is not serializable, and updates it. The
> third one is serializable and updates it again.

> The second transaction needs to copy the predicate lock held by the
> first transaction to the new row version, so that the third transaction
> that updates it again sees the lock.

> Or, we document that any non-serializable updates will break the
> serialization protection for any other transactions accessing the same rows.

Hmm.  I wonder whether we need to have some sort of system-wide enable
flag for this.  Unless you can show that the overhead is negligible (and
not having bothered to measure it is definitely not sufficient...) I'm
not excited about dragging down the performance of the entire database
on the off chance that somebody somewhere might be using SSI.

			regards, tom lane
On 20.04.2011 18:08, Kevin Grittner wrote:
> "Marko Tiikkaja"<marko.tiikkaja@2ndquadrant.com> wrote:
>
>> TRAP: FailedAssertion("!(((bool) (((void*)(&(newTuple->t_self)) !=
>> ((void *)0)) && ((&(newTuple->t_self))->ip_posid != 0))))", File:
>> "predicate.c", Line: 2282)
>
> Am investigating, and have alerted Dan, in case he hasn't noticed
> this thread.

The immediate fix is trivial:

--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2859,7 +2859,7 @@ l2:
 	 * Any existing SIREAD locks on the old tuple must be linked to the new
 	 * tuple for conflict detection purposes.
 	 */
-	PredicateLockTupleRowVersionLink(relation, &oldtup, newtup);
+	PredicateLockTupleRowVersionLink(relation, &oldtup, heaptup);
 
 	if (newbuf != buffer)
 		LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);

But the question Tom raised about doing this even for non-serializable
transactions is more serious..

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
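[Editor's note: a toy sketch of why `newtup` vs. `heaptup` matters may help. This is illustrative Python, not PostgreSQL code; in the real `heap_update()`, when the new tuple is big enough to need TOASTing it is replaced by a modified copy (`heaptup`), and only that copy's `t_self` (its on-disk address) gets set. Passing the caller's original `newtup` hands the lock-linking code an unset item pointer, which is exactly what the assertion at predicate.c:2282 checks.]

```python
# Toy model of the one-line bug.  Field names mirror the C code;
# everything else here is invented for illustration.

INVALID_OFFSET = 0

class HeapTuple:
    def __init__(self):
        self.t_self = INVALID_OFFSET  # unset until the tuple is stored

def link_row_version(new_tuple):
    # Mirrors Assert(ItemPointerIsValid(&newTuple->t_self)) from the
    # TRAP in the bug report.
    assert new_tuple.t_self != INVALID_OFFSET, "FailedAssertion"

def heap_update(newtup, needs_toast):
    # When TOASTing is needed, a separate copy is what actually gets
    # stored -- and only the copy's t_self is filled in.
    heaptup = HeapTuple() if needs_toast else newtup
    heaptup.t_self = 42  # set when the tuple is placed on a page

    # Buggy call (would trap whenever needs_toast is True, since
    # newtup.t_self is still unset):
    #   link_row_version(newtup)
    # Fixed call:
    link_row_version(heaptup)
    return heaptup

heap_update(HeapTuple(), needs_toast=True)   # fine with the fix
heap_update(HeapTuple(), needs_toast=False)  # fine either way

try:
    link_row_version(HeapTuple())  # the buggy TOAST path, in effect
except AssertionError:
    print("assertion failure reproduced")
```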
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> The immediate fix is trivial:
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -2859,7 +2859,7 @@ l2:
>    * Any existing SIREAD locks on the old tuple must be linked to the new
>    * tuple for conflict detection purposes.
>    */
> -  PredicateLockTupleRowVersionLink(relation, &oldtup, newtup);
> +  PredicateLockTupleRowVersionLink(relation, &oldtup, heaptup);

Egad.  If that's it, my confidence in the amount of testing SSI has
gotten has just dropped dramatically.

			regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 20.04.2011 17:26, Tom Lane wrote:
>> Why is predicate.c getting called at all when
>> transaction_isolation is not SERIALIZABLE?  (Although the same
>> crash happens when it is ...)
>
> Because another serializable transaction that reads/updates the
> tuple later needs to find the predicate lock acquired by the
> first transaction, even if the transaction in the middle isn't
> serializable.

For an example of the issue, see this post:

http://archives.postgresql.org/pgsql-hackers/2011-02/msg00325.php

While all transactions there are serializable, the issue exists even
if the transaction which updates the tuple with the predicate lock
isn't serializable.

-Kevin
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> But the question Tom raised about doing this even for
> non-serializable transactions is more serious..

This particular call from heapam.c went in much later than most of
the code, since we had trouble proving that anything needed to be
done there in the first place.  As such, it wasn't there during the
many benchmarks which were done along the way.  None of those showed
any performance degradation for other isolation levels due to the
SSI code.

Dan and I are sorting out the best way to isolate this particular
call to assess the performance impact.  We also think there should
probably be an additional test or two in the regression tests to
exercise this area.

I think we should wait for the results of performance tests and (if
warranted by benchmark results) profiling before we start talking
about possible ways to address this.  The evils of premature
optimization and all.

Dan came to the same fix independently, before seeing Heikki's post
on the issue, so it would be good to get that committed.

More to follow, once we have benchmark results on this and a
suggested patch for regression test coverage.

-Kevin
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Egad.  If that's it, my confidence in the amount of testing SSI
> has gotten has just dropped dramatically.

If I'm reading this correctly, it would appear that nobody has
updated anything to a TOASTed value in a build against HEAD in
testing *anything* in the last two and a half months.  And the
regression tests don't include a single UPDATE to a TOASTed value
anywhere.  That seems like a significant code coverage deficiency.

Attached is a patch to cure the latter of these.  I'm submitting
this separately since it seems a good idea regardless of what
happens with the related SSI issue.

It is, of course, no excuse for making a dumb mistake like that, but
it wouldn't have survived the day on my machine or Dan's, much less
been submitted as a patch or committed, with the attached general
test of update-to-TOAST functionality in the regression tests.

-Kevin
Attachments
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> If I'm reading this correctly, it would appear that nobody has
> updated anything to a TOASTed value in a build against HEAD in
> testing *anything* in the last two and a half months.  And the
> regression tests don't include a single UPDATE to a TOASTed value
> anywhere.  That seems like a significant code coverage deficiency.

> Attached is a patch to cure the latter of these.  I'm submitting
> this separately since it seems a good idea regardless of what
> happens with the related SSI issue.

Hmm, is there much regression coverage for TOAST inserts or deletes
either?

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, is there much regression coverage for TOAST inserts or
> deletes either?

I'll check.

-Kevin
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> If I'm reading this correctly, it would appear that nobody has
>> updated anything to a TOASTed value in a build against HEAD in
>> testing *anything* in the last two and a half months.  And the
>> regression tests don't include a single UPDATE to a TOASTed value
>> anywhere.  That seems like a significant code coverage
>> deficiency.
>
>> Attached is a patch to cure the latter of these.  I'm submitting
>> this separately since it seems a good idea regardless of what
>> happens with the related SSI issue.
>
> Hmm, is there much regression coverage for TOAST inserts or
> deletes either?

There didn't appear to be.  The attached provides minimal testing of
user-facing behavior of TOAST insert, update, and delete.  It's
pretty basic, but a lot better than having no tests for this at all.

I can't help feeling that there should be a toast.sql test file
which reaches deeper, but I'm not exactly sure how far that would
go.

-Kevin
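[Editor's note: the attached patch itself is not shown in the archive. A minimal sketch of what such user-facing TOAST coverage might look like follows; the table and column names are illustrative guesses, not necessarily those in Kevin's patch, and it assumes a value well past the roughly 2 kB inline threshold so the column is actually TOASTed under default storage.]

```sql
-- Illustrative sketch only; names and values are hypothetical.
CREATE TABLE toasttest (descr text, cnt int DEFAULT 0, f1 text);

-- INSERT of a value wide enough to be moved out of line.
INSERT INTO toasttest (descr, f1)
    VALUES ('toasted', repeat('1234567890', 10000));

-- UPDATE that rewrites the TOASTed value -- the case that tripped
-- the assertion in this thread.
UPDATE toasttest SET cnt = cnt + 1, f1 = f1 || f1
    WHERE descr = 'toasted';
SELECT descr, cnt, length(f1) FROM toasttest;

-- DELETE of a row carrying a TOASTed value.
DELETE FROM toasttest WHERE descr = 'toasted';
SELECT count(*) FROM toasttest;
```

Run against an assert-enabled build, the UPDATE here is the statement that would have reproduced the crash before the one-line fix.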
Attachments
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> The immediate fix is trivial:
>
> [patch changing one line of code]

I have confirmed that without this patch the new regression tests I
have proposed will fail, and with the patch they succeed.  I have
also confirmed that the isolation test suite continues to succeed
with this patch applied.

-Kevin
On Wed, Apr 20, 2011 at 5:16 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>
>> The immediate fix is trivial:
>>
>> [patch changing one line of code]
>
> I have confirmed that without this patch the new regression tests I
> have proposed will fail, and with the patch they succeed.  I have
> also confirmed that the isolation test suite continues to succeed
> with this patch applied.

I committed the one-line fix, and the additional regression tests.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company