Thread: PG Manual: Clarifying the repeatable read isolation example

PG Manual: Clarifying the repeatable read isolation example

From
Evan Jones
Date:
Feel free to flame me if I should be posting this elsewhere, but after reading the "submitting a patch" guide, it
appears I should ask for guidance here.


I was reading the Postgres MVCC documentation today (which is generally fantastic BTW), and am slightly confused by a
single sentence example, describing possible read-only snapshot isolation anomalies. I would like to submit a patch to
clarify this example, since I suspect others may be also confused, but to do that I need help understanding it. The
example was added as part of the Serializable Snapshot Isolation patch.

Link to the commit: http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21


I'm referring to the following sentence of 13.2.2, which is still in the source tree:

http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ

"For example, even a read only transaction at this level may see a control record updated to show that a batch has been
completed but not see one of the detail records which is logically part of the batch because it read an earlier revision
of the control record."


I do not understand how this example anomaly is possible. I'm imagining something like the following:

1. Do a bunch of work, possibly in parallel in multiple transactions, that insert/update a bunch of detail records.
2. After all that work commits, insert or update a record in the "control" table indicating that the batch completed.

Or maybe:

1. Do a batch of work and update the "control" table in a single transaction.


The guarantee that I believe REPEATABLE READ will give you in either of these cases is that if you see the "control"
table record, you will read all the detail records, because the control record is only written if the updated detail
records have been committed. What am I not understanding?


The most widely cited read-only snapshot isolation example is the bank withdrawal example from this paper:
http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf. However, I suspect we can present an anomaly
that doesn't require as much explanation?

Thanks,

Evan Jones

--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/




Re: PG Manual: Clarifying the repeatable read isolation example

From
Heikki Linnakangas
Date:
On 05/27/2014 10:12 PM, Evan Jones wrote:
> I was reading the Postgres MVCC documentation today (which is
> generally fantastic BTW), and am slightly confused by a single
> sentence example, describing possible read-only snapshot isolation
> anomalies. I would like to submit a patch to clarify this example,
> since I suspect others may be also confused, but to do that I need
> help understanding it. The example was added as part of the
> Serializable Snapshot Isolation patch.
>
> Link to the commit:
> http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21
>
>
>
> I'm referring to the following sentence of 13.2.2, which is still in
> the source tree:
>
> http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ
>
>  "For example, even a read only transaction at this level may see a
> control record updated to show that a batch has been completed but
> not see one of the detail records which is logically part of the
> batch because it read an earlier revision of the control record."

Hmm, that seems to be a super-summarized description of what Kevin & Dan 
called the "receipts problem". There's an example of that in the 
isolation test suite, see src/test/isolation/specs/receipt-report.spec. 
Googling for it, I also found an academic paper written by Kevin & Dan 
that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, "2.1.2 Example 
2: Batch Processing". (Nice work, I didn't know of that paper until now!)
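
For readers following along, the receipts problem can be sketched as a toy simulation. This is a minimal sketch in Python, not PostgreSQL code and not the contents of receipt-report.spec: the Store and Txn classes, the key names, and the amounts are all hypothetical. The model assumes only that each transaction reads from a snapshot taken at its start, and that a commit fails only when a row it wrote was changed concurrently.

```python
class Store:
    """Toy database: holds only the latest committed value of each key."""

    def __init__(self, rows):
        self.committed = dict(rows)


class Txn:
    """Toy snapshot-isolation transaction (hypothetical model, not PostgreSQL internals)."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.committed)  # frozen view taken at transaction start
        self.writes = {}

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.snapshot.get(key)

    def scan(self, prefix):
        # Rows visible to this transaction: its snapshot plus its own writes.
        merged = {**self.snapshot, **self.writes}
        return {k: v for k, v in merged.items() if k.startswith(prefix)}

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Fail only if a row this transaction wrote was changed since its snapshot.
        for key in self.writes:
            if self.store.committed.get(key) != self.snapshot.get(key):
                raise RuntimeError("could not serialize access due to concurrent update")
        self.store.committed.update(self.writes)


def receipts_anomaly():
    store = Store({"control": 1})          # control row: the currently open batch

    t1 = Txn(store)                        # cashier: records a receipt in the open batch
    batch = t1.read("control")
    t1.write("receipt:1", (batch, 100))    # not committed yet

    t2 = Txn(store)                        # closes batch 1 by advancing the control row
    t2.write("control", t2.read("control") + 1)
    t2.commit()

    t3 = Txn(store)                        # read-only report on the batch that just closed
    closed = t3.read("control") - 1
    report = [amt for (b, amt) in t3.scan("receipt:").values() if b == closed]

    t1.commit()                            # the late receipt lands in the "closed" batch

    t4 = Txn(store)                        # the same report, run again later
    later = [amt for (b, amt) in t4.scan("receipt:").values() if b == closed]
    return closed, report, later
```

Under these assumptions, receipts_anomaly() returns (1, [], [100]): the read-only report sees the control row showing batch 1 as closed, yet misses a receipt that logically belongs to it, and no transaction fails because no two transactions wrote the same row.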

I agree that's too terse. I think it would be good to actually spell out 
a complete example of the Receipt problem in the manual. That chapter in 
the manual contains examples of anomalies in Read Committed mode, so 
it would be good to give a concrete example of an anomaly in Repeatable 
Read mode too. Want to write up a docs patch?

- Heikki



Re: PG Manual: Clarifying the repeatable read isolation example

From
Evan Jones
Date:
Oh yeah, I shared an office with Dan so I should have thought to check their paper. Oops. Thanks for the suggestion;
I'll try to summarize this into something that is similar to the Read Committed and Serializable mode examples. It may
take me a week or two to find the time, but thanks for the suggestions.

Evan


On May 27, 2014, at 15:32 , Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> I agree that's too terse. I think it would be good to actually spell out a complete example of the Receipt problem in
> the manual. That chapter in the manual contains examples of anomalies in Read Committed mode, so it would be good to
> give a concrete example of an anomaly in Repeatable Read mode too. Want to write up a docs patch?


--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/




Re: PG Manual: Clarifying the repeatable read isolation example

From
David G Johnston
Date:
Heikki Linnakangas wrote:
> On 05/27/2014 10:12 PM, Evan Jones wrote:
>> I was reading the Postgres MVCC documentation today (which is
>> generally fantastic BTW), and am slightly confused by a single
>> sentence example, describing possible read-only snapshot isolation
>> anomalies. I would like to submit a patch to clarify this example,
>> since I suspect others may be also confused, but to do that I need
>> help understanding it. The example was added as part of the
>> Serializable Snapshot Isolation patch.
>>
>> Link to the commit:
>> http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21
>>
>>
>>
>> I'm referring to the following sentence of 13.2.2, which is still in
>> the source tree:
>>
>> http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ
>>
>>  "For example, even a read only transaction at this level may see a
>> control record updated to show that a batch has been completed but
>> not see one of the detail records which is logically part of the
>> batch because it read an earlier revision of the control record."
> 
> Hmm, that seems to be a super-summarized description of what Kevin & Dan 
> called the "receipts problem". There's an example of that in the 
> isolation test suite, see src/test/isolation/specs/receipt-report.spec. 
> Googling for it, I also found an academic paper written by Kevin & Dan 
> that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, "2.1.2 Example 
> 2: Batch Processing". (Nice work, I didn't know of that paper until now!)
> 
> I agree that's too terse. I think it would be good to actually spell out 
> a complete example of the Receipt problem in the manual. That chapter in 
> the manual contains examples of anomalies in Read Committed mode, so 
> it would be good to give a concrete example of an anomaly in Repeatable 
> Read mode too. Want to write up a docs patch?

While this is not a doc patch I decided to give it some thought.  The "bank"
example was understandable enough for me so I simply tried to make it more
accessible.  I also didn't go and try to get it to conform to other,
existing, examples.  This is intended to replace the entire "For example..."
paragraph noted above.


While Repeatable Read provides for stable in-transaction reads, logical query
anomalies can result because commit order is not restricted and
serialization errors only occur if two transactions attempt to modify the
same record.

Consider a rule that, upon updating r1 OR r2, if r1+r2 < 0 then subtract an
additional 1 from the corresponding row.
Initial State: r1 = 0; r2 = 0
Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10
and subtracts an additional 1
Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20;
no further action needed
Commit 2
Transaction 3: reads (0,20) and commits
Commit 1
Transaction 4: reads (-11,20) and commits

However, if Transaction 2 commits first then, logically, the calculation of
r1 + r2 in Transaction 1 should result in a false outcome and the additional
subtraction of 1 should not occur - leaving T4 reading (-10,20).  

The ability for out-of-order commits is what allows T3 to read the pair
(0,20) which is logically impossible in the T2->before->T1 commit order with
T4 reading (-11,20).

Neither transaction fails since a serialization failure only occurs if a
concurrent update occurs to [ r1 (in T1) ] or to [ r2 (in T2) ]; The update
of [ r2 (in T1) ] is invisible - i.e., no failure occurs if a read value
undergoes a change.
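
The schedule above can be replayed as a small simulation. This is a minimal sketch in Python under stated assumptions, not PostgreSQL code: the run_schedule, begin, and commit helpers are hypothetical, a snapshot is modeled as a copy of the committed rows taken at transaction start, and a commit fails only when a row the transaction wrote has changed since its snapshot.

```python
def run_schedule():
    """Replay the T1..T4 schedule on a toy snapshot-isolation model."""
    committed = {"r1": 0, "r2": 0}

    def begin():
        # A transaction's snapshot: the committed rows as of its start.
        return dict(committed)

    def commit(snapshot, writes):
        # A conflict is raised only for rows this transaction itself wrote.
        for key in writes:
            if committed[key] != snapshot[key]:
                raise RuntimeError("could not serialize access due to concurrent update")
        committed.update(writes)

    # Transaction 1: adds -10 to r1 and applies the rule against its own snapshot.
    snap1 = begin()
    new_r1 = snap1["r1"] - 10
    if new_r1 + snap1["r2"] < 0:       # sees r2 = 0, so the sum is -10
        new_r1 -= 1                    # subtracts the additional 1
    writes1 = {"r1": new_r1}

    # Transaction 2: adds 20 to r2; the rule does not fire for it.
    snap2 = begin()
    new_r2 = snap2["r2"] + 20
    if snap2["r1"] + new_r2 < 0:
        new_r2 -= 1
    writes2 = {"r2": new_r2}

    commit(snap2, writes2)             # Commit 2
    snap3 = begin()                    # Transaction 3: read-only
    t3_view = (snap3["r1"], snap3["r2"])

    commit(snap1, writes1)             # Commit 1: succeeds, r1 was not touched by T2
    snap4 = begin()                    # Transaction 4: read-only
    t4_view = (snap4["r1"], snap4["r2"])
    return t3_view, t4_view
```

In this model run_schedule() returns ((0, 20), (-11, 20)). T3's view implies that T2 committed before T1, but a serial run in that order would leave (-10, 20), so the pair of observations matches no serial execution, and neither commit raises an error.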


Inspired by:
http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf -
Example 1.3


David J.







Re: PG Manual: Clarifying the repeatable read isolation example

From
Kevin Grittner
Date:
David G Johnston <david.g.johnston@gmail.com> wrote:

>>>   "For example, even a read only transaction at this level may see a
>>> control record updated to show that a batch has been completed but
>>> not see one of the detail records which is logically part of the
>>> batch because it read an earlier revision of the control record."
>>
>> Hmm, that seems to be a super-summarized description of what Kevin & Dan
>> called the "receipts problem". There's an example of that in the
>> isolation test suite, see src/test/isolation/specs/receipt-report.spec.

It is also one of the examples I provided on the SSI Wiki page:

https://wiki.postgresql.org/wiki/SSI#Deposit_Report
 
>> Googling for it, I also found an academic paper written by Kevin & Dan
>> that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, "2.1.2 Example
>> 2: Batch Processing". (Nice work, I didn't know of that paper until now!)

There were links to drafts of the paper in July, 2012, but I guess
the official location in the Proceedings of the VLDB Endowment was
never posted to the community lists.  That's probably worth having
on record here:

http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf

>> I agree that's too terse. I think it would be good to actually spell out
>> a complete example of the Receipt problem in the manual. That chapter in
>> the manual contains examples of anomalies in Read Committed mode, so
>> it would be good to give a concrete example of an anomaly in Repeatable
>> Read mode too.

I found it hard to decide how far to go in the docs versus the Wiki
page.  Any suggestions or suggested patches welcome.

> While this is not a doc patch I decided to give it some thought.  The "bank"
> example was understandable enough for me so I simply tried to make it more
> accessible.  I also didn't go and try to get it to conform to other,
> existing, examples.  This is intended to replace the entire "For example..."
> paragraph noted above.
>
> While Repeatable Read provides for stable in-transaction reads, logical query
> anomalies can result because commit order is not restricted and
> serialization errors only occur if two transactions attempt to modify the
> same record.
>
> Consider a rule that, upon updating r1 OR r2, if r1+r2 < 0 then subtract an
> additional 1 from the corresponding row.
> Initial State: r1 = 0; r2 = 0
> Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10
> and subtracts an additional 1
> Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20;
> no further action needed
> Commit 2
> Transaction 3: reads (0,20) and commits
> Commit 1
> Transaction 4: reads (-11,20) and commits
>
> However, if Transaction 2 commits first then, logically, the calculation of
> r1 + r2 in Transaction 1 should result in a false outcome and the additional
> subtraction of 1 should not occur - leaving T4 reading (-10,20).
>
> The ability for out-of-order commits is what allows T3 to read the pair
> (0,20) which is logically impossible in the T2->before->T1 commit order with
> T4 reading (-11,20).
>
> Neither transaction fails since a serialization failure only occurs if a
> concurrent update occurs to [ r1 (in T1) ] or to [ r2 (in T2) ]; The update
> of [ r2 (in T1) ] is invisible - i.e., no failure occurs if a read value
> undergoes a change.
>
> Inspired by:
> http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf -
> Example 1.3

I know this is subjective, but that seems to me a little too much
in an academic style for the docs.  In the Wiki page examples I
tried to use a style more accessible to DBAs and application
programmers.  Don't get me wrong, I found various papers by Alan
Fekete and others very valuable while working on the feature, but
they are often geared more toward those developing such features
than those using them.

That said, I know I'm not the best word-smith in the community, and
would very much welcome suggestions from others on the best way to
cover this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PG Manual: Clarifying the repeatable read isolation example

From
Gavin Flower
Date:
On 08/06/14 05:03, Kevin Grittner wrote:
[...]
> I found it hard to decide how far to go in the docs versus the Wiki 
> page.  Any suggestions or suggested patches welcome.
[...]
> I know this is subjective, but that seems to me a little too much in 
> an academic style for the docs.  In the Wiki page examples I tried to 
> use a style more accessible to DBAs and application programmers.  
> Don't get me wrong, I found various papers by Alan Fekete and others 
> very valuable while working on the feature, but they are often geared 
> more toward those developing such features than those using them. That 
> said, I know I'm not the best word-smith in the community, and would 
> very much welcome suggestions from others on the best way to cover 
> this.
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

I know that I first look at the docs & seldom look at the Wiki - in fact 
it was only recently that I became aware of the Wiki, and it is still 
not the first thing I think of when I want to know something, and I 
often forget it exists.  I suspect many people are like me in this!

Also the docs have a more authoritative air, and are probably automatically 
assumed to be more up-to-date and relevant to the version of Postgres used.

So I suggest that the docs should have an appropriate coverage of such 
topics, possibly mostly in an appendix (with brief references in affected 
parts of the main docs) if it does not quite fit into the rest of the 
documentation (affects many different features, so no one place in the 
main docs is appropriate - or too detailed, or too much).  Also links to 
the Wiki, and to the more academic papers, could be provided for the 
really keen.


Cheers,
Gavin




Re: PG Manual: Clarifying the repeatable read isolation example

From
Bruce Momjian
Date:
On Sun, Jun  8, 2014 at 09:33:04AM +1200, Gavin Flower wrote:
> I know that I first look at the docs & seldom look at the Wiki - in
> fact it was only recently that I became aware of the Wiki, and it is
> still not the first thing I think of when I want to know something,
> and I often forget it exists.  I suspect many people are like me in
> this!
> 
> Also the docs have a more authoritative air, and are probably
> automatically assumed to be more up-to-date and relevant to the
> version of Postgres used.
> 
> So I suggest that the docs should have an appropriate coverage of
> such topics, possibly mostly in an appendix (with brief references in
> affected parts of the main docs) if it does not quite fit into the
> rest of the documentation (affects many different features, so no
> one place in the main docs is appropriate - or too detailed, or too
> much).  Also links to the Wiki, and to the more academic papers,
> could be provided for the really keen.

You can link to the wiki from our docs.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +