Обсуждение: SSI work for 9.1
From a review of recent emails I've put together a list of what I'm going to try to do this evening, in order of attack. It's ambitious and I may well not get to the end tonight, but I wanted to get the issues on record in list form. If someone spots one I'm missing or thinks I should change the order I go at them, please say something. Based on the feedback, I suspect that (1) is a beta2 blocker. I don't think the rest are, although it would sure be nice to get them in now. If anyone else wants to jump on something on this list, let me know -- I'm happy to share. (1) Pass snapshot in to some predicate.c functions. The particular functions have yet to be determined, but certainly any which acquire predicate locks, and probably all which are guarded by the SkipSerialization() macro. Skip processing for non-MVCC snapshots. The goal here is to reduce false positive serialization failures and avoid confusion about locks showing in the pg_locks view which are hard to explain. (2) Check on heap truncation from vacuum. On the face of it this seems unlikely to be a problem since we make every effort to clean up predicate locks as soon as there is no transaction which can update what a transaction has read, but it merits a re-check. Once confirmed, add a note to lazy_truncate_heap about why it's not an issue. (3) Add regression tests for SSI/DDL interaction. (4) Add a comment to the docs about how querying tuples by TID doesn't lock not-found "gaps" the way an indexed access would. (5) I've also had a note to myself to add a couple more bullets to the performance notes at the bottom of this section: http://developer.postgresql.org/pgdocs/postgres/transaction-iso.html#XACT-SERIALIZABLE- If you're having a lot of rollbacksdue to table-scan plans being used, you might want to try reducing random_page_cost and/or boosting cpu_tuple_cost.- If you're having a lot of rollbacks because multiple locks are being combined into relation locks, you might want to increase max_pred_locks_per_transaction. -Kevin
On Wed, Jun 08, 2011 at 05:48:26PM -0500, Kevin Grittner wrote: > (1) Pass snapshot in to some predicate.c functions. The particular > functions have yet to be determined, but certainly any which acquire > predicate locks, and probably all which are guarded by the > SkipSerialization() macro. Skip processing for non-MVCC snapshots. > The goal here is to reduce false positive serialization failures and > avoid confusion about locks showing in the pg_locks view which are > hard to explain. I assume you've already started on this one; let me know if you have a patch I should take a look at or hit any snags. > (2) Check on heap truncation from vacuum. On the face of it this > seems unlikely to be a problem since we make every effort to clean > up predicate locks as soon as there is no transaction which can > update what a transaction has read, but it merits a re-check. Once > confirmed, add a note to lazy_truncate_heap about why it's not an > issue. I assume you are worried here that there may be SIREAD locks remaining on truncated pages/tuples, and these would cause false positives if those pages are reused? I don't believe this can happen, because a vacuum will only delete a formerly-visible dead tuple if its xmax is earlier than OldestXmin. We remove all SIREAD locks on anything older than GlobalSxactXmin, which won't be less than OldestXmin. > (4) Add a comment to the docs about how querying tuples by TID > doesn't lock not-found "gaps" the way an indexed access would. I can take care of this one and some other README-SSI changes. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
> Dan Ports wrote: > On Wed, Jun 08, 2011 at 05:48:26PM -0500, Kevin Grittner wrote: >> (1) Pass snapshot in to some predicate.c functions. The particular >> functions have yet to be determined, but certainly any which >> acquire predicate locks, and probably all which are guarded by the >> SkipSerialization() macro. Skip processing for non-MVCC snapshots. >> The goal here is to reduce false positive serialization failures >> and avoid confusion about locks showing in the pg_locks view which >> are hard to explain. > > I assume you've already started on this one; let me know if you > have a patch I should take a look at or hit any snags. A patch is attached which just covers the predicate lock acquisition, where a snapshot is available without too much pain. There are two functions which acquire predicate locks where a snapshot was not readily available: _bt_search() and _bt_get_endpoint(). Not only was it not clear how to get a snapshot in, it was not entirely clear from reading the code that we need to acquire predicate locks here. Now, I suspect that we probably do, because I spent many long hours stepping through gdb to pick the spots where they are, but that was about a year ago and my memory of the details has faded. FWIW, this patch not only passes the usual battery of regression tests that I run, but the test which was showing REINDEX acquiring predicate locks on the heap runs without that happening. I'm posting this because I think it's the most important area to cover, and in a pinch might satisfy people for 9.1; but more importantly to give me a good place to step back to in case I muck things up moving forward from this point. >> (2) Check on heap truncation from vacuum. On the face of it this >> seems unlikely to be a problem since we make every effort to clean >> up predicate locks as soon as there is no transaction which can >> update what a transaction has read, but it merits a re-check. Once >> confirmed, add a note to lazy_truncate_heap about why it's not an >> issue. > > I assume you are worried here that there may be SIREAD locks > remaining on truncated pages/tuples, and these would cause false > positives if those pages are reused? > > I don't believe this can happen, because a vacuum will only delete > a formerly-visible dead tuple if its xmax is earlier than > OldestXmin. We remove all SIREAD locks on anything older than > GlobalSxactXmin, which won't be less than OldestXmin. That was my first thought, too. But, the question being raised, I thought a quick double-check that there weren't any corner cases where this could occur was reasonable. >> (4) Add a comment to the docs about how querying tuples by TID >> doesn't lock not-found "gaps" the way an indexed access would. > > I can take care of this one and some other README-SSI changes. OK, I'll stay away from any doc items for now. Those are all yours until we agree otherwise. :-) I really don't think I'm going to get past the snapshot guard issue tonight. :-/ -Kevin
Вложения
"Kevin Grittner" wrote: > A patch is attached which just covers the predicate lock > acquisition This patch rolls that up with snapshot checking in the conflict detection function called on read. The only other two functions which use that macro check for conflicts on write, and I can't see why snapshot checking would make sense there. The write isn't into a particular snapshot, it's in the context of the transaction; so unless someone can make a good case for why snapshot checking makes sense there, I think this is far as it makes sense to take this. Except of course for those two functions in the btree AM which didn't have snapshot available. I'm not sure what needs to be done there. Thoughts? -Kevin
Вложения
On Wed, Jun 08, 2011 at 09:17:04PM -0500, Kevin Grittner wrote: > A patch is attached which just covers the predicate lock acquisition, > where a snapshot is available without too much pain. There are two > functions which acquire predicate locks where a snapshot was not > readily available: _bt_search() and _bt_get_endpoint(). Not only was > it not clear how to get a snapshot in, it was not entirely clear from > reading the code that we need to acquire predicate locks here. Now, > I suspect that we probably do, because I spent many long hours > stepping through gdb to pick the spots where they are, but that was > about a year ago and my memory of the details has faded. For _bt_search(), the lock calls should move to _bt_first() where the ScanDesc is available. This also keeps us from trying to take locks during _bt_pagedel(), which is only called during vacuum and recovery. The call in _bt_get_endpoint() seems unnecessary, because after it returns, _bt_endpoint() takes the same lock. The only other callers of _bt_get_endpoint() are _bt_pagedel() and _bt_insert_parent(), neither of which should take predicate locks. I've updated the patch, attached. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Вложения
Dan Ports wrote: > On Wed, Jun 08, 2011 at 09:17:04PM -0500, Kevin Grittner wrote: >> A patch is attached which just covers the predicate lock >> acquisition, where a snapshot is available without too much pain. >> There are two functions which acquire predicate locks where a >> snapshot was not readily available: _bt_search() and >> _bt_get_endpoint(). Not only was it not clear how to get a >> snapshot in, it was not entirely clear from reading the code that >> we need to acquire predicate locks here. Now, I suspect that we >> probably do, because I spent many long hours stepping through gdb >> to pick the spots where they are, but that was about a year ago >> and my memory of the details has faded. > > For _bt_search(), the lock calls should move to _bt_first() where > the ScanDesc is available. This also keeps us from trying to take > locks during _bt_pagedel(), which is only called during vacuum and > recovery. Sounds reasonable, but why did you pass the snapshot to the PredicateLockPage() call but not the PredicateLockRelation() call? Oversight? > The call in _bt_get_endpoint() seems unnecessary, because after it > returns, _bt_endpoint() takes the same lock. The only other callers > of _bt_get_endpoint() are _bt_pagedel() and _bt_insert_parent(), > neither of which should take predicate locks. That also sounds reasonable. > I've updated the patch, attached. I've confirmed that it passes the usual regression tests (including isolation tests and the normal regression tests at serializable). I'll take a closer look once I wake up and get the caffeine going. Thanks for following up on this! -Kevin
On Thu, Jun 09, 2011 at 07:06:18AM -0500, Kevin Grittner wrote: > Sounds reasonable, but why did you pass the snapshot to the > PredicateLockPage() call but not the PredicateLockRelation() call? > Oversight? Yep, just an oversight; long day yesterday. I'll fix the patch shortly (unless you can get to it first, in which case I wouldn't complain) Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On Thu, Jun 09, 2011 at 01:30:27PM -0400, Dan Ports wrote: > On Thu, Jun 09, 2011 at 07:06:18AM -0500, Kevin Grittner wrote: > > Sounds reasonable, but why did you pass the snapshot to the > > PredicateLockPage() call but not the PredicateLockRelation() call? > > Oversight? > > Yep, just an oversight; long day yesterday. I'll fix the patch shortly Attached. Only change is the missing snapshot argument to that one call. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Вложения
Dan Ports <drkp@csail.mit.edu> wrote: > On Thu, Jun 09, 2011 at 01:30:27PM -0400, Dan Ports wrote: >> On Thu, Jun 09, 2011 at 07:06:18AM -0500, Kevin Grittner wrote: >>> Sounds reasonable, but why did you pass the snapshot to the >>> PredicateLockPage() call but not the PredicateLockRelation() >>> call? Oversight? >> >> Yep, just an oversight; long day yesterday. I'll fix the patch >> shortly > > Attached. Only change is the missing snapshot argument to that one > call. I am in full agreement with this patch. It's an interesting coincidence that the two predicate locking calls which were at the wrong level to have access to the snapshot information were also at the wrong level to be able to fire them at only the needed times. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > I am in full agreement with this patch. I found that pgindent would like to tweak whitespace in three places in that patch, and I found an unnecessary include that I would like to remove. Normally, I would post a new version of the patch with those adjustments, but there's been a disquieting tendency for people to not look at what's actually happening with revisions of this patch and act like the sky is falling with each refinement. Let me be clear: each posted version of this patch has been safe and has been an improvement on what came before. Dan and I didn't disagree about what to do at any point; Dan figured out what to do with two calls I left alone because I faded before I could work out how to deal with them. Essentially we collaborated on-list rather than discussing things off-list and posting the end result. Perhaps that was a bad choice, but time was short and I had hopes that a change people had requested could be included in beta2. Here are the tweaks which should be applied after Dan's last version of the patch. If people prefer, I'll post a roll-up including them. http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=0258af4168a130af0d1e52294b248d54715b5f72 http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=bb951bacd9700cdbddb0beba338a63bd301b200d -Kevin
On 10.06.2011 19:05, Kevin Grittner wrote: > I found that pgindent would like to tweak whitespace in three places > in that patch, and I found an unnecessary include that I would like > to remove. Normally, I would post a new version of the patch with > those adjustments, but there's been a disquieting tendency for > people to not look at what's actually happening with revisions of > this patch and act like the sky is falling with each refinement. > > Let me be clear: each posted version of this patch has been safe and > has been an improvement on what came before. Dan and I didn't > disagree about what to do at any point; Dan figured out what to do > with two calls I left alone because I faded before I could work out > how to deal with them. Essentially we collaborated on-list rather > than discussing things off-list and posting the end result. Perhaps > that was a bad choice, but time was short and I had hopes that a > change people had requested could be included in beta2. I did some further changes, refactoring SkipSerialization so that it's hopefully more readable, and added a comment about the side-effects. See attached. Let me know if I'm missing something. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Вложения
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > I did some further changes, refactoring SkipSerialization so that > it's hopefully more readable, and added a comment about the > side-effects. See attached. Let me know if I'm missing something. I do think the changes improve readability. I don't see anything missing, but there's something we can drop. Now that you've split the read and write tests, this part can be dropped from the SerializationNeededForWrite function: + + /* Check if we have just become "RO-safe". */ + if (SxactIsROSafe(MySerializableXact)) + { + ReleasePredicateLocks(false); + return false; + } If it's doing a write, it can't be a read-only transaction.... -Kevin
On 14.06.2011 17:57, Kevin Grittner wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: > >> I did some further changes, refactoring SkipSerialization so that >> it's hopefully more readable, and added a comment about the >> side-effects. See attached. Let me know if I'm missing something. > > I do think the changes improve readability. I don't see anything > missing, but there's something we can drop. Now that you've split > the read and write tests, this part can be dropped from the > SerializationNeededForWrite function: > > + > + /* Check if we have just become "RO-safe". */ > + if (SxactIsROSafe(MySerializableXact)) > + { > + ReleasePredicateLocks(false); > + return false; > + } > > If it's doing a write, it can't be a read-only transaction.... Ah, good point. Committed with that removed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, Jun 15, 2011 at 5:10 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > On 14.06.2011 17:57, Kevin Grittner wrote: >> >> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: >> >>> I did some further changes, refactoring SkipSerialization so that >>> it's hopefully more readable, and added a comment about the >>> side-effects. See attached. Let me know if I'm missing something. >> >> I do think the changes improve readability. I don't see anything >> missing, but there's something we can drop. Now that you've split >> the read and write tests, this part can be dropped from the >> SerializationNeededForWrite function: >> >> + >> + /* Check if we have just become "RO-safe". */ >> + if (SxactIsROSafe(MySerializableXact)) >> + { >> + ReleasePredicateLocks(false); >> + return false; >> + } >> >> If it's doing a write, it can't be a read-only transaction.... > > Ah, good point. Committed with that removed. Does this mean that the open item "more SSI loose ends" can now be marked resolved? http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Open_Items -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jun 16, 2011 at 11:49:48PM -0400, Robert Haas wrote: > Does this mean that the open item "more SSI loose ends" can now be > marked resolved? I was just looking at it and contemplating moving it to the non-blockers list. Of the five items:- (1) and (4) are resolved- (2) isn't an issue -- maybe we want to add a comment, someplace but I'm not convinced even that is necessary- (3) is a regression test, and is already on the list separately- (5) is a doc issueonly There are no open issues with the code that I'm aware of. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
On Fri, Jun 17, 2011 at 12:30 AM, Dan Ports <drkp@csail.mit.edu> wrote: > On Thu, Jun 16, 2011 at 11:49:48PM -0400, Robert Haas wrote: >> Does this mean that the open item "more SSI loose ends" can now be >> marked resolved? > > I was just looking at it and contemplating moving it to the non-blockers > list. Of the five items: > - (1) and (4) are resolved > - (2) isn't an issue -- maybe we want to add a comment, someplace but > I'm not convinced even that is necessary > - (3) is a regression test, and is already on the list separately > - (5) is a doc issue only > > There are no open issues with the code that I'm aware of. Perhaps it would be best to remove the general item and replace it with a list of more specific things that need doing - which might just mean #5. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jun 17, 2011 at 12:32:46AM -0400, Robert Haas wrote: > Perhaps it would be best to remove the general item and replace it > with a list of more specific things that need doing - which might just > mean #5. Done. -- Dan R. K. Ports MIT CSAIL http://drkp.net/