Re: Serializable Snapshot Isolation
From:        Kevin Grittner
Subject:     Re: Serializable Snapshot Isolation
Date:
Msg-id:      4C9B27220200002500035BE9@gw.wicourts.gov
In reply to: Re: Serializable Snapshot Isolation (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses:   Re: Serializable Snapshot Isolation
             Re: Serializable Snapshot Isolation
List:        pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 23/09/10 02:14, Kevin Grittner wrote:
>> There is a rub on the other point, though. Without transaction
>> information you have no way of telling whether TN committed
>> before T0, so you would need to assume that it did. So on this
>> count, there is bound to be some increase in false positives
>> leading to transaction rollback. Without more study, and maybe
>> some tests, I'm not sure how significant it is. (Actually, we
>> might want to track commit sequence somehow, so we can determine
>> this with greater accuracy.)
>
> I'm confused. AFAICS there is no way to tell if TN committed
> before T0 in the current patch either.

Well, we can certainly infer it if the finishedBefore values
differ. And, as I said, if we don't eliminate this structure for
committed transactions, we could add a commitId or some such, with
"precedes" and "follows" tests similar to TransactionId.

>> The other way we can detect conflicts is a read by a serializable
>> transaction noticing that a different and overlapping
>> serializable transaction wrote the tuple we're trying to read.
>> How do you propose to know that the other transaction was
>> serializable without keeping the SERIALIZABLEXACT information?
>
> Hmm, I see. We could record which transactions were serializable
> in a new clog-like structure that wouldn't exhaust shared memory.
>
>> And how do you propose to record the conflict without it?
>
> I thought you just abort the transaction that would cause the
> conflict right there. The other transaction is committed already,
> so you can't do anything about it anymore.

No, it always requires a rw-conflict from T0 to T1 and a
rw-conflict from T1 to TN, as well as TN committing first and (T0
not being READ ONLY or TN not overlapping T0). The number and
complexity of the conditions which must be met to cause a
serialization failure are what keep the failure rate reasonable. If
we started rolling back transactions every time one transaction
simply read a row modified by a concurrent transaction, I suspect
we'd see such a storm of serialization failures in most workloads
that nobody would want to use it.

>> Finally, this would preclude some optimizations which I *think*
>> will pay off, which trade a few hundred kB more of shared memory,
>> and some additional CPU to maintain more detailed conflict data,
>> for a lower false positive rate -- meaning fewer transactions
>> rolled back for hard-to-explain reasons. This more detailed
>> information is also what seems to be desired by Dan S (on another
>> thread) to be able to log the information needed to be able to
>> reduce rollbacks.
>
> Ok, I think I'm ready to hear about those optimizations now :-).

Dan Ports is eager to implement "next key" predicate locking for
indexes, but wants more benchmarks to confirm the benefit. (Most of
the remaining potential optimizations carry some risk of being
counter-productive, so we want to go in with something conservative
and justify each optimization separately.) That one only affects
your proposal to the extent that the chance to consolidate locks on
the same target by committed transactions would likely have fewer
matches to collapse.
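To make the commitId idea above concrete, here's a minimal sketch
of what the "precedes"/"follows" tests might look like; the
CommitSeqNo name and everything else here are mine for
illustration, not anything in the patch:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type; the patch has no CommitSeqNo. */
typedef uint32_t CommitSeqNo;

/*
 * Wraparound-safe "precedes" test in the style of PostgreSQL's
 * TransactionIdPrecedes: a precedes b iff the difference (a - b)
 * is negative when interpreted as a signed 32-bit value.
 */
static inline bool
CommitSeqNoPrecedes(CommitSeqNo a, CommitSeqNo b)
{
    return (int32_t) (a - b) < 0;
}

A "follows" test would just reverse the arguments; assigning the
number under the same lock that serializes commits would make the
ordering authoritative rather than inferred from finishedBefore.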
One that I find interesting is the idea that we could mark a
SERIALIZABLE READ ONLY transaction with some additional property
(perhaps DEFERRED or DEFERRABLE) which would cause it to take a
snapshot and then wait until there were no overlapping SERIALIZABLE
transactions which are not READ ONLY. At that point it could obtain
a valid snapshot which would allow it to run without taking
predicate locks or checking for conflicts. It would have no chance
of being rolled back with a serialization failure *or* of
contributing to the failure of any other transaction, yet it would
be guaranteed to see a view of the database consistent with the
actions of all other serializable transactions.

One place I'm particularly interested in using such a feature is in
pg_dump. Without it we have the choice of using a SERIALIZABLE
transaction, which might fail or cause failures (neither of which
seems good for a backup program), or using REPEATABLE READ (to get
current snapshot isolation behavior), which might capture a view of
the data containing serialization anomalies. The notion of
capturing a backup which doesn't comply with business rules
enforced by serializable transactions gives me the willies, but it
would be better than not getting a backup reliably; so in the
absence of this feature, I think we need to change pg_dump to use
REPEATABLE READ. I can't see how to do this without keeping
information on committed transactions.

This next paragraph is copied straight from the Wiki page:

It appears that when a pivot is formed where T0 is flagged as a
READ ONLY transaction, and it is concurrent with TN, we can wait to
see whether anything really needs to roll back. If T1 commits
before developing a rw-dependency to another transaction with a
commit early enough to make it visible to T0, the rw-dependency
between T0 and T1 can be removed or ignored. It might even be
worthwhile to track whether a serializable transaction *has*
written to any permanent table, so that this optimization can be
applied to de facto READ ONLY transactions (i.e., not flagged as
such, but not having done any writes).

Again, copying from the Wiki "for the record" here:

It seems that we could guarantee that the retry of a transaction
rolled back due to a dangerous structure could never immediately
roll back on the very same conflicts if we always ensure that there
is a successful commit of one of the participating transactions
before we roll back. Is it worth it? It seems like it might be,
because it would ensure that some progress is being made and
prevent the possibility of endless flailing on any set of
transactions. We could be sure of this if we (see the sketch after
this list):

* use lists for inConflict and outConflict
* never roll back until we have a pivot with a commit of the
  transaction on the "out" side
* never roll back the transaction being committed in the PreCommit
  check
* have some way to cause another, potentially idle, transaction to
  roll back with a serialization failure SQLSTATE

I'm afraid this would further boost shared memory usage, but the
payoff may well be worth it. At one point I did some "back of an
envelope" calculations, and I think I found that with 200
connections an additional 640kB of shared memory would allow this.
On top of the above optimization, just having the lists would allow
more precise recognition of dangerous structures under heavy load,
leading to fewer false positives even before you get to the above.
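Here's a rough sketch of how the first two rules might combine: a
pivot test that only fires once a transaction on the "out" side has
actually committed. All type and field names are illustrative, not
the patch's:

#include <stdbool.h>

/* Illustrative stand-in for the shared-memory transaction object. */
typedef struct SXact SXact;
struct SXact
{
    bool    committed;
    SXact **inConflicts;    /* transactions with a rw-conflict out to us */
    int     nIn;
    SXact **outConflicts;   /* transactions we have a rw-conflict out to */
    int     nOut;
};

/*
 * T1 is a pivot when it has at least one rw-conflict in (from some
 * T0) and at least one rw-conflict out (to some TN).  Under the
 * progress rule above, we only flag the dangerous structure once a
 * TN on the "out" side has committed, so a retried transaction
 * cannot immediately fail again on the very same conflicts.
 */
static bool
PivotRequiresRollback(const SXact *t1)
{
    if (t1->nIn == 0)
        return false;           /* no conflict in: not a pivot */
    for (int i = 0; i < t1->nOut; i++)
    {
        if (t1->outConflicts[i]->committed)
            return true;        /* pivot with committed "out" side */
    }
    return false;
}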
Right now, if you have two conflicts with different transactions in
the same direction, it collapses to a self-reference, which
precludes use of the optimizations involving TN committing first or
T0 being READ ONLY. Also, if we go to these lists, I think we can
provide more of the information Dan S has been requesting for the
error detail. We could list all transactions which participated in
any failure, and I *think* we could show the statement which
triggered the failure with confidence that some relation accessed
by that statement was involved in the conflicts leading to the
failure.

Less important than any of the above, but still significant in my
book: I fear that conflict recording and dangerous structure
detection could become very convoluted and fragile if we eliminate
this structure for committed transactions. Conflicts among specific
sets of transactions are the linchpin of this whole approach, and I
think that doing without an object to represent each one for the
duration for which it is significant is dangerous. Inferring that
information from a variety of sources "feels" wrong to me.

-Kevin
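For what it's worth, the collapse looks roughly like this in
miniature (again, names are mine; the real structure is
SERIALIZABLEXACT):

#include <stddef.h>

/* Illustrative stub with the single-pointer conflict field. */
typedef struct SXactStub SXactStub;
struct SXactStub
{
    SXactStub *inConflict;   /* single pointer, as in the current patch */
};

/*
 * With a single pointer, a second rw-conflict in the same direction
 * degrades to a self-reference: we remember *that* there were
 * multiple conflicts, but not *which* transactions they were with,
 * so the "TN committed first" and "T0 is READ ONLY" tests can no
 * longer be applied.  A list of conflicting transactions keeps that
 * identity at the cost of the extra shared memory discussed above.
 */
static void
RecordRWConflictIn(SXactStub *reader, SXactStub *writer)
{
    if (reader->inConflict == NULL)
        reader->inConflict = writer;    /* first conflict: exact */
    else if (reader->inConflict != writer)
        reader->inConflict = reader;    /* second: identity lost */
}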