Re: Transaction ISOLATION LEVEL - have I missed something?
От | Pól Ua Laoínecháin |
---|---|
Тема | Re: Transaction ISOLATION LEVEL - have I missed something? |
Дата | |
Msg-id | CAF4RT5Q9vBJ+qda7b2DhQFOLyzXPLVzXtc_WnTqjhs6K=F2zOw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Transaction ISOLATION LEVEL - have I missed something? ("David G. Johnston" <david.g.johnston@gmail.com>) |
Список | pgsql-novice |
On Thu, 13 May 2021 at 22:02, David G. Johnston <david.g.johnston@gmail.com> wrote: > It does. I see now you are trying to describe snapshots of data available to statements in terms of database timestampsas opposed to referring to transactional boundaries. > I've never seen isolation levels discussed in those terms before and do not know whether it is accurate or missing someimportant nuances. With locks, snapshots, and valid interleaving versus invalid interleaving (for serializable) allcoming into play the notion of timestamp isn't really in scope. But if it helps for basic comprehension then it doeshave merit and indeed the links you describe seem correct at first glance. I was explaining to somebody who doesn't have your (or even my basic) grasp of isolation levels by using TRANSACTION_TIMESTAMP and CLOCK_TIMESTAMP as proxies for transaction ids... Below hopefully is the discussion reframed in terms of transaction ids - hopefully it's correct and demonstrates a complete understanding of the topic in terms of PostgreSQL's implementation of the TRANSACTION ISOLATION paradigm and the distinction between READ COMMITTED (RC) and SERIALIZABLE (S). I have put in material below about txid_current() and how RC statements can see the results of transactions where the transaction id is greater than txid_current(), but that when the transaction is S, it can only see the results of committed transactions < txid_current() and any changes made inside txid_current() itself... I found this helpful in understanding what I was seeing in the various pseudo-colum fields https://www.cybertec-postgresql.com/en/is-update-the-same-as-delete-insert-in-postgresql (I've seen this guy on the general list...) I'll have to read that article in-depth to fully grasp all of the subtleties - but it was good. So, for a translation of a time-based explanation of the difference between RC and S to a more PostgreSQL based tables, fields and tx ids explanation - read on! From a session - 1, I do this: =============================================== Session 1 -> START TRANSACTION; START TRANSACTION Session 1 -> SHOW TRANSACTION ISOLATION LEVEL; transaction_isolation ----------------------- read committed (1 row) Session 1 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+------- 578 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 578 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 578 | in progress | 552 | 0 | 0 | 0 | (0,3) | 3 | Three (3 rows) ================================================ So, I've started the READ COMMITTED transaction (txn) no. 578 - and in table t, there are three records (tuples) (1, 'One'), (2, 'Two') & (3, 'Three'). Txn 527 inserted the first tuple, txn 528 inserted the second one and the third txn was inserted by txn 529, but I updated it, so essentially, it was last modified by txn 552 as we learned from the cybertec article referenced above. So, now in session 2, I update the table by changing 'Three' to 'Trí' in tuple 3. =================================================== Session 2 -> UPDATE t SET y = 'Trí' WHERE x = 3; UPDATE 1 ==================================================== So, that's automatically committed - but, just to check, again from session 2, I do: ======================================================== Session 2 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+----- 580 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 580 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 580 | in progress | 579 | 0 | 0 | 0 | (0,4) | 3 | Trí (3 rows) =========================================================== So, the txid_current has been incremented - to 579 by the UPDATE and then by the SELECT here in session 2 to 580. Still in session 2, I run this query: ====================================================== SELECT pid, age(clock_timestamp(), query_start), usename, substring(query FROM 70 FOR 25), state, backend_xid FROM pg_stat_activity WHERE datname = 'test' AND query NOT ILIKE '%pg_stat_activity%' ORDER BY query_start desc; and we get pid | age | usename | substring | state | backend_xid -------+-----------------+----------+---------------------------+---------------------+------------- 10468 | 00:36:57.592779 | postgres | cmax, ctid, x, y FROM t; | idle in transaction | 578 (1 row) ========================================================= Now, we see that our txn 578 (which is now > 36 mins old) is still sitting there in an idle state. So, everything's rosy in the garden... now, back to session 1... I rerun my SELECT within txn 578 =========================================================== Session 1 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+----- 578 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 578 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 578 | in progress | 579 | 0 | 0 | 0 | (0,4) | 3 | Trí (3 rows) ========================================================== So, now the tuple where x = 3 has a value of 'Trí' for y. This is the crux of the TRANSACTION ISOLATION LEVEL being RC - txn 578 **can see** the result of txn 579 (from session 2)! ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ We change the TRANSACTION ISOLATION LEVEL to S in session 1. I start by issuing the command: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE; and then start a txn and SELECT from t as follows: ======================================================= Session 1 -> START TRANSACTION; START TRANSACTION Session 1 -> SHOW TRANSACTION ISOLATION LEVEL; transaction_isolation ----------------------- serializable (1 row) Session 1 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+----- 582 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 582 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 582 | in progress | 579 | 0 | 0 | 0 | (0,4) | 3 | Trí (3 rows) ====================================================================== Txn 582 (S) has been started and has tuple 3 as (3, 'Trí'). Back to session 2 and I issue the pg_stats_activity command ============================================================= SELECT pid, age(clock_timestamp(), query_start), usename, substring(query FROM 70 FOR 25), state, backend_xid FROM pg_stat_activity WHERE datname = 'test' AND query NOT ILIKE '%pg_stat_activity%' ORDER BY query_start desc; which shows pid | age | usename | substring | state | backend_xid -------+-----------------+----------+---------------------------+---------------------+------------- 10468 | 00:01:09.705916 | postgres | cmax, ctid, x, y FROM t; | idle in transaction | 582 (1 row) ==================================================================================== So, we can see that our session 1 transaction (582, S) is happily sitting there idling away... Again from session 2, I UPDATE the (3, 'Trí') tuple back to (3, 'Three') - no START TRANSACTION - the session 2 txn and the COMMIT; are implicit. ================================================================================ Session 2 -> UPDATE t SET y = 'Three' WHERE x = 3; UPDATE 1 then check my UPDATE from within session 2 Session 2 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+------- 584 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 584 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 584 | in progress | 583 | 0 | 0 | 0 | (0,5) | 3 | Three (3 rows) ================================================================================= Within session 2, I can now see my changed data - (3, 'Trí') has reverted to (3, 'Three'). I go back to session 1 and reissue the command above ================================================================================ Session 1 -> SELECT txid_current(), txid_status(txid_current()), xmin, xmax, cmin, cmax, ctid, x, y FROM t; txid_current | txid_status | xmin | xmax | cmin | cmax | ctid | x | y --------------+-------------+------+------+------+------+-------+---+----- 582 | in progress | 527 | 0 | 0 | 0 | (0,1) | 1 | One 582 | in progress | 528 | 0 | 0 | 0 | (0,2) | 2 | Two 582 | in progress | 579 | 583 | 0 | 0 | (0,4) | 3 | Trí (3 rows) ================================================================================ So, despite the fact that the S txn 582 is > 2 mins old and has been modified by the UPDATE in session 2, the data has remained unchanged from the perspective of the 582 S txn in session 1 Except for xmax which has incremented to 583, the tx_id of the UPDATE txn from session 2 - serves as notice that the tuple has been updated by a later txn! This is the distinction between READ COMMITTED and SERIALIZABLE. Now, I appreciate that there are performance/resource issues involved in escalating from RC to S levels - but that's a different day's work. I think that the first important thing to realise (at least it took me a while) is that transaction isolation levels only apply to multi-statement transactions - a single SQL statement txn only has one view of the database at the instant it is run and it will see all committed data at that point - txn isolation doesn't apply (I'm ignoring the possibility of dirty-reads which are the spawn of Satan...). Again, thanks for having made me think about my explanation and about the topic generally - I've heard it said and seen it written that you can't say that you understand something unless you can explain it clearly - apparently the Einstein quote is misattributed - I like Rutherford's one though: > https://www.linkedin.com/pulse/albert-einsteins-birthday-ode-misattributions-peter-smirniotopoulos I realise that there are potential added complications if session 1 decides to update table t, tuple (3, 'Trí') - not sure what happens then? Watch this space... Hopefully I have explained it clearly... Rgs, Pól... > David J.
В списке pgsql-novice по дате отправления:
Предыдущее
От: "David G. Johnston"Дата:
Сообщение: Re: Transaction ISOLATION LEVEL - have I missed something?