Re: Deriving Recovery Snapshots
| From | Simon Riggs | 
|---|---|
| Subject | Re: Deriving Recovery Snapshots | 
| Date | |
| Msg-id | 1224103863.3808.224.camel@ebony.2ndQuadrant | 
| In response to | Re: Deriving Recovery Snapshots (Jeff Davis <pgsql@j-davis.com>) | 
| List | pgsql-hackers | 
On Wed, 2008-10-15 at 12:58 -0700, Jeff Davis wrote:
> On Tue, 2008-10-14 at 18:50 +0100, Simon Riggs wrote:
> > I've worked out what I think is a workable, efficient process for
> > deriving snapshots during recovery. I will be posting a patch to show
> > how this works tomorrow [Wed 15 Oct], just doing cleanup now.
>
> How will this interact with an idea like this?:
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00400.php

pg_snapclone should work fine, since it is orthogonal to this work.

> > I've had to change the way XidInMVCCSnapshot() works. We search the
> > snapshot even if it has overflowed. This is actually a performance win
> > in cases where only a few xids have overflowed but most haven't. This is
> > essential because if we were forced to check in subtrans *and*
> > unobservedxids existed then the snapshot would be invalid. (I could have
> > made it this way *just* in recovery, but the change seems better both
> > ways).
>
> I don't entirely understand this. Can you explain the situation that
> would result in an invalid snapshot?

In recovery the snapshot consists of two sets of xids:
* ones we have seen as running, e.g. xid=43
* ones we know exist, but haven't seen yet, e.g. xid=42
(I call this latter kind Unobserved Transactions.)

Both kinds of xids *must* be in the snapshot for MVCC to work.

The current way of checking snapshots is to say "if *any* of the running
transactions has overflowed, check subtrans". Unobserved transactions
are not in subtrans, so if you checked for them there you would fail to
find them. Currently we assume that means it is a top-level transaction
and then check the top-level xids.

Why are unobserved transactions not in subtrans? Because they are
unobserved, so we can't assign their parent xid. (By definition, because
they are unobserved.)
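To make the current check concrete, here is a minimal sketch in C. It is an illustration only, not PostgreSQL source: `ToySnapshot`, `ToySubtrans`, `toy_topmost()` and `xid_in_snapshot_old()` are hypothetical names, subtrans is modelled as a plain parent array, and the xmin/xmax range tests are left out. It shows that once the snapshot has overflowed, an xid with no parent recorded in subtrans is treated as top-level and looked up only in the top-level array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

/* Hypothetical, simplified snapshot; not PostgreSQL's SnapshotData. */
typedef struct {
    const Xid *xip;    size_t xcnt;     /* top-level xids */
    const Xid *subxip; size_t subxcnt;  /* cached subtransaction xids */
    bool overflowed;                    /* some subxids did not fit in the cache */
} ToySnapshot;

/* Hypothetical stand-in for subtrans: parent[xid], 0 means no parent recorded. */
typedef struct {
    const Xid *parent;
    size_t     n;
} ToySubtrans;

static bool in_array(const Xid *a, size_t n, Xid xid)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == xid)
            return true;
    return false;
}

/* Follow parent links until an xid with no recorded parent is reached. */
static Xid toy_topmost(const ToySubtrans *st, Xid xid)
{
    while (xid < st->n && st->parent[xid] != 0)
        xid = st->parent[xid];
    return xid;
}

/*
 * Old-style check: if the snapshot overflowed, skip the subxid cache,
 * map the xid to its topmost parent via subtrans, and search only the
 * top-level array.
 */
static bool xid_in_snapshot_old(const ToySnapshot *s, const ToySubtrans *st,
                                Xid xid)
{
    assert(s != NULL && st != NULL);
    if (s->overflowed)
        xid = toy_topmost(st, xid);     /* no parent recorded: xid unchanged */
    else if (in_array(s->subxip, s->subxcnt, xid))
        return true;
    return in_array(s->xip, s->xcnt, xid);
}
```

With xid=43 in the top-level array and xid=42 unobserved, the subtrans lookup for 42 returns 42 itself, so the check finds it only if 42 happens to be in the top-level array.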
There isn't always enough space in the snapshot to allow all the
unobserved xids to be added as if they were top-level transactions, so
we put them into the subtrans cache as a secondary location and then
change the algorithm in XidInMVCCSnapshot(). We don't want to increase
the size of the snapshot because it already contains wasted space in
the subtrans cache, nor do we wish to throw errors when people try to
take snapshots. The XidInMVCCSnapshot() changes make sense in their own
right for most cases, since we don't want one transaction to cause us
to thrash subtrans, as happened in 8.1.

This took me some time to think through...

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
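The changed lookup order described in this message can be sketched in C as follows. This is one reading of the description, not the posted patch: `ToySnapshot`, `ToySubtrans`, `toy_topmost()` and `xid_in_snapshot_new()` are hypothetical names, subtrans is modelled as a plain parent array, and the xmin/xmax range tests are omitted. The assumption is that both snapshot arrays are always searched first, and subtrans is consulted only when the snapshot has overflowed and the xid was not found directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

/* Hypothetical, simplified snapshot; not PostgreSQL's SnapshotData. */
typedef struct {
    const Xid *xip;    size_t xcnt;     /* top-level xids */
    const Xid *subxip; size_t subxcnt;  /* subxid cache, may hold unobserved xids */
    bool overflowed;                    /* some subxids did not fit in the cache */
} ToySnapshot;

/* Hypothetical stand-in for subtrans: parent[xid], 0 means no parent recorded. */
typedef struct {
    const Xid *parent;
    size_t     n;
} ToySubtrans;

static bool in_array(const Xid *a, size_t n, Xid xid)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == xid)
            return true;
    return false;
}

/* Follow parent links until an xid with no recorded parent is reached. */
static Xid toy_topmost(const ToySubtrans *st, Xid xid)
{
    while (xid < st->n && st->parent[xid] != 0)
        xid = st->parent[xid];
    return xid;
}

/*
 * Sketch of the changed check: search the snapshot itself even when it
 * has overflowed, so xids stored in the subxid cache are found without
 * touching subtrans. Fall back to subtrans only for xids that are in
 * neither array.
 */
static bool xid_in_snapshot_new(const ToySnapshot *s, const ToySubtrans *st,
                                Xid xid)
{
    assert(s != NULL && st != NULL);
    if (in_array(s->xip, s->xcnt, xid) ||
        in_array(s->subxip, s->subxcnt, xid))
        return true;

    if (s->overflowed) {
        Xid top = toy_topmost(st, xid);
        if (top != xid)                 /* a parent was recorded */
            return in_array(s->xip, s->xcnt, top);
    }
    return false;
}
```

In this sketch an unobserved xid such as 42, stored in the subxid cache, is found by the first search, while a subtransaction that overflowed out of the cache is still resolved through its parent.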