SetQuerySnapshot, once again
От | Tom Lane |
---|---|
Тема | SetQuerySnapshot, once again |
Дата | |
Msg-id | 11884.1024271595@sss.pgh.pa.us обсуждение исходный текст |
Список | pgsql-hackers |
I've been busy working on my presentation on concurrency for the upcoming O'Reilly conference. While doing so, I've been thinking more about the question of when to do SetQuerySnapshot calls inside functions. We've gone around on that before, without much of a consensus on what to do; see for example the thread starting at http://fts.postgresql.org/db/mw/msg.html?mid=1029236 I have now become convinced that it is correct, in fact necessary, to do SetQuerySnapshot for each new user-supplied query, whether it's inside a function or not. A CommandCounterIncrement without an associated SetQuerySnapshot is okay internally within system utility operations (eg, to make visible a catalog entry we just created), but it is highly suspect otherwise. In serializable mode, SetQuerySnapshots after the first one of a transaction are no-ops, so there's really no difference in that case. All we need to think about is read-committed mode. And in read-committed mode, we can have situations like this: UPDATE webpages SET hits = hits + 1 WHERE url = '...';SELECT hits FROM webpages WHERE url = '...'; If there are no concurrent updates going on, this will work as expected: the SELECT will see the updated row. But if there are concurrent updates and we do not do SetQuerySnapshots in plpgsql, then the SELECT may see two versions of the target row as valid: both the one that was valid as of the last SetQuerySnapshot before we entered the function, and the one created by the UPDATE. This happens if and only if some other client updated the same row and committed after the last SetQuerySnapshot. The UPDATE will see that other client's row as current and will update it, as expected. But then the SELECT will consider the previous version of the row to be still good, because it was after all deleted by a transaction that committed later than the query snapshot! And the version produced by the UPDATE is good too, since it was produced within the current transaction (and we've done CommandCounterIncrement to make it visible). An example of exactly this misbehavior can be seen in http://archives.postgresql.org/pgsql-bugs/2002-02/msg00142.php Particularly in 7.2, it's a tossup which version of the row will be found first by the SELECT, so the bug might appear and disappear depending on the phase of the moon, making it even worse. We get sensible behavior in the normal interactive case *only* because there will be a SetQuerySnapshot between UPDATE and SELECT, and so the SELECT will certainly consider any versions seen as obsolete by UPDATE to be obsolete also. So I've come around to agree with the position that Tatsuo and Hiroshi put forward in the thread mentioned above: plpgsql (and the other PL languages) need to do SetQuerySnapshot not only CommandCounterIncrement between user-supplied queries. Is anyone still unconvinced? If not, I'll try to fix it sometime soon. As that thread pointed out, there also seem to be some problems with plpgsql not doing enough CommandCounterIncrements when it's executing already-planned queries; I'll take a look at that issue at the same time. regards, tom lane
В списке pgsql-hackers по дате отправления: