Обсуждение: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

Поиск
Список
Период
Сортировка

Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
Hi

After the recent discussion about the impossibility of efficiently implementing FK-like constraint triggers in PL/PGSQL
thatwork correctly under SERIALIZABLe transactions, I've compared our behavior to that of Oracle. As it turns out, a
slightdifference in Oracle's behavior makes those FK constraint triggers which on postgres are only correct in READ
COMMITTEDmode fully correct in SERIALIZABLE mode also. 

1. Summary of the previous discussion

The built-in FK constraint trigger looks for rows visible under either the transaction's snapshot *or* a freshly taken
MVCCsnapshot when checking for child-table rows that'd prevent an UPDATE or DELETE of a row in the parent table. This
isnecessary even though the parent row is SHARE-locked on INSERTs/UPDATEs to the child table, and would also be
necessaryif it was UPDATE-locked. The following series of commands illustrates why 

C1: BEGIN
C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT * FROM t -- Take snapshot before C1 commits
C1: COMMIT
C2: DELETE FROM t WHERE id = 1
C2: COMMIT

Since C1 commits before C2 does DELETE, C2 is entirely unaffected by C1's UPDATE-lock. C2 has no way of detecting
possibledependent rows that C1 might have inserted, since C1 is invisible to C2. 

Note that if you swap the SELECT .. FOR UPDATE and the DELETE commands, the SELECT .. FOR UPDATE will cause a
serializationerror! 

2. The behavior or Oracle

Oracle treats a "FOR UPDATE" lock much like an actual UPDATE when checking for serialization conflicts. This causes the
DELETEin the example above to raise a serialization error, and hence prevents the failure case for FK constraint
triggerseven without a recheck under a current snapshot. 

One can think of a FOR UPDATE lock as a kind of read barrier on Oracle - it prevents other transactions from messing
withthe row that don't consider the locking transaction to be visible. 

3. Conclusio

While it might seem strange at first for a lock to affect other transactions even after the locking transaction has
ended,it actually makes sense when viewed as a kind of write barrier. It is very common for locking primitives to use
barrierinstructions to ensure that one lock holder sees all changes done by the previous owner. Raising a serialization
errorin the example above is the transactional equivalent of such a barrier instruction in the case of SERIALIZABLE
transactions- since updating the transaction's snapshot is obviously not an option, the remaining alternative is to
restartthe whole transaction under a current snapshot. This is exactly what raising a serialization error accomplishes. 

Also, while Oracle's behavior has obvious use-cases (e.g. FK-like constraints), I failed to come up with a case where
postgres'current behavior is useful. When would you want a (SERIALIZABLE) transaction to wait for a lock, but then
continueas if the lock had never existed? What is the point of waiting then in the first place? 

All in all, I believe that SHARE and UPDATE row-level locks should be changed to cause concurrent UPDATEs to fail with
aserialization error. I can come up with a patch that does that, but I wanted to get some feedback on the idea before I
putthe work in. 

best regards,
Florian Pflug



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Tom Lane
Дата:
Florian Pflug <fgp@phlo.org> writes:
> All in all, I believe that SHARE and UPDATE row-level locks should be
> changed to cause concurrent UPDATEs to fail with a serialization
> error.

I don't see an argument for doing that for FOR SHARE locks, and it
already happens for FOR UPDATE (at least if the row actually gets
updated).  AFAICS this proposal mainly breaks things, in pursuit of
an unnecessary and probably-impossible-anyway goal of making FK locking
work with only user-level snapshots.
        regards, tom lane


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
"Kevin Grittner"
Дата:
Florian Pflug <fgp@phlo.org> wrote:
> All in all, I believe that SHARE and UPDATE row-level locks should
> be changed to cause concurrent UPDATEs to fail with a
> serialization error. I can come up with a patch that does that,
> but I wanted to get some feedback on the idea before I put the
> work in.
Before you work on that, you might want to wait until you can review
the work that I and Dan Ports (a Ph.D. candidate from MIT) have been
doing on support for true serializable transactions.  You don't need
to use FOR SHARE or FOR UPDATE or any explicit locks as long as the
concurrent transactions are SERIALIZABLE.  We have it working, but
have been holding off on discussion or patch submission at Tom's
request -- he felt it would distract from the process of getting the
release out.
Whenever people are ready, I can submit a WIP patch.  All issues
discuss on this thread "Just Work" with the patch applied.  There's
a Wiki page and a public git repository related to this work, for
anyone who is interested and not busy with release work.
-Kevin


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 13, 2010, at 23:39 , Tom Lane wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> All in all, I believe that SHARE and UPDATE row-level locks should be
>> changed to cause concurrent UPDATEs to fail with a serialization
>> error.
>
> I don't see an argument for doing that for FOR SHARE locks, and it
> already happens for FOR UPDATE (at least if the row actually gets
> updated).

Yes, actually updating the row is a workaround. A prohibitively expensive one, though.

The arguments are as stated

a) SHARE or UPDATE locking a concurrently updated row *does* cause as serialization error, making the current behavior
asymmetric

b) Locking primitives usually ensure that once you obtain the lock you see the most recent version of the data. This is
currentlytrue for READ COMMITTED transactions but not for SERIALIZABLE ones, and pretty undesirable a behavior for a
lockingprimitive. 

c) I fail to see how the current behavior is useful in the presence of SERIALIZABLE transactions. Currently, they could
IMHOcompletely ignore FOR SHARE locks, without making any previously correct algorithm incorrect. 

plus a weaker one:

d) Oracle does it for FOR UPDATE locks, and actually has an example of a FK trigger in PL/SQL in their docs.

> AFAICS this proposal mainly breaks things, in pursuit of
> an unnecessary and probably-impossible-anyway goal of making FK locking
> work with only user-level snapshots.

I don't see the breakage this'd cause. For READ COMMITTED transactions nothing changes. For SERIALIZABLE transactions
thebehavior of FOR UPDATE / FOR SHARE becomes much easier to grasp. In both cases a SHARE lock would then say "Only
updatethis row if you have seen the locking transaction's changes". 

Why do you think that making FK locking work with only user-level snapshots is probably-impossible-anyway? With the
proposedchanges, simply FOR SHARE locking the parent row on INSERT/UPDATE of the child, plus checking for child rows on
UPDATE/DELETEof the parent gives a 100% correct FK trigger. 

I do not have a formal proof for that last assertion, but I'm not aware of any counter-examples either. Would love to
hearof any, though. 

best regards,
Florian Pflug



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 13, 2010, at 23:51 , Kevin Grittner wrote:

> Florian Pflug <fgp@phlo.org> wrote:
>
>> All in all, I believe that SHARE and UPDATE row-level locks should
>> be changed to cause concurrent UPDATEs to fail with a
>> serialization error. I can come up with a patch that does that,
>> but I wanted to get some feedback on the idea before I put the
>> work in.
>
> Before you work on that, you might want to wait until you can review
> the work that I and Dan Ports (a Ph.D. candidate from MIT) have been
> doing on support for true serializable transactions.  You don't need
> to use FOR SHARE or FOR UPDATE or any explicit locks as long as the
> concurrent transactions are SERIALIZABLE.  We have it working, but
> have been holding off on discussion or patch submission at Tom's
> request -- he felt it would distract from the process of getting the
> release out.

I'm very exited about the work you're doing there, and believe it'd be a great feature to have.

However, I view my proposal as pretty orthogonal to that work. True serializable transaction are much more powerful
thanwhat I proposed, but at a much higher price too, due to the necessity of SIREAD locks. My proposal allows for
simpleFK-like constraints to be implemented at user-level that are correct for all isolation levels. 

best regards,
Florian Pflug



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Greg Stark
Дата:
On Thu, May 13, 2010 at 10:25 PM, Florian Pflug <fgp@phlo.org> wrote:
> C1: BEGIN
> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
> C2: BEGIN
> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> C2: SELECT * FROM t -- Take snapshot before C1 commits
> C1: COMMIT
> C2: DELETE FROM t WHERE id = 1
> C2: COMMIT
>

Can you give an actual realistic example -- ie, not doing a select for
update and then never updating the row or with an explanation of what
the programmer is attempting to accomplish with such an unusual
sequence? The rest of the post talks about FKs but I don't see any
here...

-- 
greg


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Anssi Kääriäinen
Дата:
On 05/14/2010 03:37 AM, Greg Stark wrote:> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug<fgp@phlo.org>  wrote:>> C1:
BEGIN>>C1: SELECT * FROM t WHERE id = 1 FOR UPDATE>> C2: BEGIN>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE>> C2:
SELECT* FROM t -- Take snapshot before C1 commits>> C1: COMMIT>> C2: DELETE FROM t WHERE id = 1>> C2: COMMIT>>>> Can
yougive an actual realistic example -- ie, not doing a select for> update and then never updating the row or with an
explanationof what> the programmer is attempting to accomplish with such an unusual> sequence? The rest of the post
talksabout FKs but I don't see any> here...> 

Doing a select for update and then never updating the row is a realistic
example.

I am currently designing a database where this is an issue. The
simplified schema to illustrate the problem:

create table object (   id integer primary key
);

insert into object values(1);

create table attribute (   object_id integer not null references object,   attr_type integer not null, -- references
attr_types  value text not null,   valid_from timestamp not null,   valid_until timestamp 
);

Now, I want to make sure there are no pairs of (object_id, attr_type)
where the valid_from, valid_until times overlap.

A problematic sequence for this schema, both transactions in isolation
level serializable:


C1: begin;
C1: select * from object where id = 1 for update;
-- check for conflicting attr_type, realistically where condition should
have overlapping check, but left out for simplicity...
C1: select * from attribute where object_id = 1 and attr_type = 1;
-- Ok, nothing overlapping, I am able to insert.
C1: insert into attribute values (1, 1, 'Anssi', now(), null);
C2: begin;
-- This blocks.
C2: select * from object where id = 1 for update;
C1: commit;
-- Check for conflicts. This select won't see the insert C1 did.
C2: select * from attribute where object_id = 1 and attr_type = 1;
-- C2 doesn't see anything conflicting
C2: insert into attribute values (1, 1, 'Matti', now(), null);
C2: commit;
-- Inconsistency.

Now, that same sequence does work for read committed isolation level (C2
sees the insert of C1), and that is my solution for now: require
applications to use read committed isolation level. This could also be
solved by issuing "update object set id = id where id = 1" instead of
using select for update. This would result in serialization error.

I know that for this particular example the upcoming exclusion
constraints would solve the problem. But if I would want to ensure that
if attr_value for attr_type 1 is 'Anssi' then attr_value for attr_type 2
is 'Kääriäinen', then exclusion constraints could not be used.

--
Anssi Kääriäinen


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Nicolas Barbier
Дата:
2010/5/14 Greg Stark <gsstark@mit.edu>:

> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug <fgp@phlo.org> wrote:
>
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>
> Can you give an actual realistic example -- ie, not doing a select for
> update and then never updating the row or with an explanation of what
> the programmer is attempting to accomplish with such an unusual
> sequence? The rest of the post talks about FKs but I don't see any
> here...

The link with FKs is as follows:

* The example does not use a real FK, because the whole purpose is to
do the same as FKs while not using the FK machinery.
* The example uses only one table, because that is enough to
illustrate the problem (see next items).
* C1 locks a row, supposedly because it wants to create a reference to
it in a non-mentioned table, and wants to prevent the row from being
deleted under it.
* C2 deletes that row (supposedly after it verified that there are no
references to it; it would indeed not be able to see the reference
that C1 created/would create), and C1 fails to detect that.
* C2 also fails to detect the problem, because the lock that C1 held
is being released after C1 commits, and C2 can happily go on deleting
the row.
* The end result is that the hypothetical reference is created,
although the referent is gone.

Nicolas


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 14, 2010, at 2:37 , Greg Stark wrote:

> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug <fgp@phlo.org> wrote:
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>>
>
> Can you give an actual realistic example -- ie, not doing a select for
> update and then never updating the row or with an explanation of what
> the programmer is attempting to accomplish with such an unusual
> sequence? The rest of the post talks about FKs but I don't see any
> here...

The table "t" is supposed to represent the parent table of a FK constraint. The SELECT FOR UPDATE is done upon an
INSERTto the child table to protect the parent row against concurrent deletion. I've used FOR UPDATE instead of FOR
SHAREbecause I did test this against oracle also, and oracle does not support FOR SHARE.  

Here's a full example of a pair of FK triggers in PL/PGSQL that work correctly in READ COMMITTED mode but fail to
enforcethe constraint in SERIALIZABLE mode as the following sequence of commands show. With my proposal, the DELETE
wouldagain raise a serialization error and hence keep the constraint satisfied. 

C1: BEGIN
C1: INSERT INTO child (parent_id) VALUES (1) -- Locks the parent row FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT TRUE -- Take snapshot *before* C1 commits
C1: COMMIT
C2: DELETE FROM parent WHERE parent_id = 1 -- Succeeds
C2: COMMIT

----------
CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY);
CREATE TABLE child (child_id SERIAL NOT NULL PRIMARY KEY, parent_id INTEGER NOT NULL);

CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$
BEGINPERFORM TRUE FROM child WHERE parent_id = OLD.parent_id;IF FOUND THEN  RAISE SQLSTATE '23503' USING MESSAGE =
'Parent' || OLD.parent_id || ' still referenced during ' || TG_OP;END IF;RETURN NULL; 
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent FOR EACH ROW EXECUTE PROCEDURE ri_parent();

CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$
BEGINPERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR UPDATE OF parent;IF NOT FOUND THEN  RAISE SQLSTATE
'23503'USING MESSAGE = 'Parent ' || NEW.parent_id || ' does not exist during ' || TG_OP;END IF;RETURN NULL; 
END;
$body$ LANGUAGE PLPGSQL VOLATILE;
CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child FOR EACH ROW EXECUTE PROCEDURE ri_child();
----------

best regards,

Florian Pflug



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Robert Haas
Дата:
On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> All in all, I believe that SHARE and UPDATE row-level locks should be
>> changed to cause concurrent UPDATEs to fail with a serialization
>> error.
>
> I don't see an argument for doing that for FOR SHARE locks, and it
> already happens for FOR UPDATE (at least if the row actually gets
> updated).  AFAICS this proposal mainly breaks things, in pursuit of
> an unnecessary and probably-impossible-anyway goal of making FK locking
> work with only user-level snapshots.

After giving this considerable thought and testing the behavior at
some length, I think the OP has it right.  One thing I sometimes need
to do is denormalize a copy of a field, e.g.

CREATE TABLE parent (id serial, mode integer not null, primary key (id));
CREATE TABLE child (id serial, parent_id integer not null references
parent (id), parent_mode integer not null);

The way I have typically implemented this in the past is:

1. Add a trigger to the parent table so that, whenever the mode column
gets updated, we do an update on the parent_mode of all children.
2. Add a trigger to the child table so that, when a new child is
inserted, it initializes parent_mode from its parent.  I do SELECT
with FOR UPDATE on the parent parent can't change under me; though FOR
SHARE ought to be enough also since we're just trying to lock out
concurrent updates.

Suppose T1 updates the parent's mode while T2 adds a new child; then
both commit.  In read committed mode, this seems to work OK regardless
of the order of T1 and T2.  If T1 grabs the lock first, then T2 sees
the updated version of the row after T1 commits.  If T2 grabs the lock
first, then the update on the parent blocks until the child commits.
Subsequently, when the trigger fires, it apparently uses an up-to-date
snapshot, so the new child is updated also.  In serializable mode,
things are not so good.  If T1 grabs the lock first, the child waits
to see whether it commits or aborts.  On commit, it complains that it
can't serialize and aborts, which is reasonable - transaction aborts
are the price you pay for serializability.  If T2 grabs the lock
first, the update on the parent blocks as before, but now the update
is done with the old snapshot and ignores the new child, so the new
child now has a value for parent_mode that doesn't match the parent's
actual mode.  That is, you get the wrong answer due to a serialization
anomaly that didn't existed at the read committed level.

Increasing the transaction isolation level is supposed to *eliminate*
serialization anomalies, not create them.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 14, 2010, at 22:54 , Robert Haas wrote:
> On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Florian Pflug <fgp@phlo.org> writes:
>>> All in all, I believe that SHARE and UPDATE row-level locks should be
>>> changed to cause concurrent UPDATEs to fail with a serialization
>>> error.
>>
>> I don't see an argument for doing that for FOR SHARE locks, and it
>> already happens for FOR UPDATE (at least if the row actually gets
>> updated).  AFAICS this proposal mainly breaks things, in pursuit of
>> an unnecessary and probably-impossible-anyway goal of making FK locking
>> work with only user-level snapshots.
>
> After giving this considerable thought and testing the behavior at
> some length, I think the OP has it right.  One thing I sometimes need
> to do is denormalize a copy of a field, e.g.
>
> <snipped example>

I've whipped up a quick and still rather dirty patch that implements the behavior I proposed, at least for the case of
conflictsbetween FOR UPDATE locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a row that has
concurrentlybeen FOR UPDATE locked will cause a serialization error. (The same for an actually updated row of course,
butthat happened before too). 

While this part of the patch was fairly straight forward, make FOR SHARE conflict too seems to be much harder. The
assumptionthat a lock becomes irrelevant after the transaction(s) that held it completely is built deeply into the
multixact machinery that powers SHARE locks. That machinery therefore assumes that once all members of a multi xact
havecompleted the multi xact is dead also. But my proposal depends on a SERIALIZABLE transaction being able to find if
anyof the lockers of a row are invisible under it's snapshot - for which it'd need any multi xact containing invisible
xidsto outlive its snapshot. 

best regards,
Florian Pflug



Вложения

Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Robert Haas
Дата:
On Sun, May 16, 2010 at 9:07 PM, Florian Pflug <fgp@phlo.org> wrote:
> On May 14, 2010, at 22:54 , Robert Haas wrote:
>> On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Florian Pflug <fgp@phlo.org> writes:
>>>> All in all, I believe that SHARE and UPDATE row-level locks should be
>>>> changed to cause concurrent UPDATEs to fail with a serialization
>>>> error.
>>>
>>> I don't see an argument for doing that for FOR SHARE locks, and it
>>> already happens for FOR UPDATE (at least if the row actually gets
>>> updated).  AFAICS this proposal mainly breaks things, in pursuit of
>>> an unnecessary and probably-impossible-anyway goal of making FK locking
>>> work with only user-level snapshots.
>>
>> After giving this considerable thought and testing the behavior at
>> some length, I think the OP has it right.  One thing I sometimes need
>> to do is denormalize a copy of a field, e.g.
>>
>> <snipped example>
>
> I've whipped up a quick and still rather dirty patch that implements the behavior I proposed, at least for the case
ofconflicts between FOR UPDATE locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a row that
hasconcurrently been FOR UPDATE locked will cause a serialization error. (The same for an actually updated row of
course,but that happened before too). 
>
> While this part of the patch was fairly straight forward, make FOR SHARE conflict too seems to be much harder. The
assumptionthat a lock becomes irrelevant after the transaction(s) that held it completely is built deeply into the
multixact machinery that powers SHARE locks. That machinery therefore assumes that once all members of a multi xact
havecompleted the multi xact is dead also. But my proposal depends on a SERIALIZABLE transaction being able to find if
anyof the lockers of a row are invisible under it's snapshot - for which it'd need any multi xact containing invisible
xidsto outlive its snapshot. 

Thanks for putting this together. I suggest adding it to the open
CommitFest - even if we decide to go forward with this, I don't
imagine anyone is going to be excited about changing it during beta.

https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 17, 2010, at 3:30 , Robert Haas wrote:
> On Sun, May 16, 2010 at 9:07 PM, Florian Pflug <fgp@phlo.org> wrote:
>> On May 14, 2010, at 22:54 , Robert Haas wrote:
>>> On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Florian Pflug <fgp@phlo.org> writes:
>>>>> All in all, I believe that SHARE and UPDATE row-level locks should be
>>>>> changed to cause concurrent UPDATEs to fail with a serialization
>>>>> error.
>>>>
>>>> I don't see an argument for doing that for FOR SHARE locks, and it
>>>> already happens for FOR UPDATE (at least if the row actually gets
>>>> updated).  AFAICS this proposal mainly breaks things, in pursuit of
>>>> an unnecessary and probably-impossible-anyway goal of making FK locking
>>>> work with only user-level snapshots.
>>>
>>> After giving this considerable thought and testing the behavior at
>>> some length, I think the OP has it right.  One thing I sometimes need
>>> to do is denormalize a copy of a field, e.g.
>>>
>>> <snipped example>
>>
>> I've whipped up a quick and still rather dirty patch that implements the behavior I proposed, at least for the case
ofconflicts between FOR UPDATE locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a row that
hasconcurrently been FOR UPDATE locked will cause a serialization error. (The same for an actually updated row of
course,but that happened before too). 
>>
>> While this part of the patch was fairly straight forward, make FOR SHARE conflict too seems to be much harder. The
assumptionthat a lock becomes irrelevant after the transaction(s) that held it completely is built deeply into the
multixact machinery that powers SHARE locks. That machinery therefore assumes that once all members of a multi xact
havecompleted the multi xact is dead also. But my proposal depends on a SERIALIZABLE transaction being able to find if
anyof the lockers of a row are invisible under it's snapshot - for which it'd need any multi xact containing invisible
xidsto outlive its snapshot. 
>
> Thanks for putting this together. I suggest adding it to the open
> CommitFest - even if we decide to go forward with this, I don't
> imagine anyone is going to be excited about changing it during beta.
>
> https://commitfest.postgresql.org/action/commitfest_view/open


Will do. Thanks for the link.

Here is an updated version that works for SHARE locks too.

(This message mainly serves as a way to link the updated patch to the commit fest entry. Is this how I'm supposed to do
that,or am I doing something wrong?) 

best regards,
Florian Pflug

Вложения

Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Robert Haas
Дата:
On Tue, May 18, 2010 at 8:15 PM, Florian Pflug <fgp@phlo.org> wrote:
> Will do. Thanks for the link.
>
> Here is an updated version that works for SHARE locks too.
>
> (This message mainly serves as a way to link the updated patch to the commit fest entry. Is this how I'm supposed to
dothat, or am I doing something wrong?)
 

Yeah - just go to the existing CF entry and say "New Comment" then
select type "Patch".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 19, 2010, at 2:15 , Florian Pflug wrote:
> On May 17, 2010, at 3:30 , Robert Haas wrote:
>> On Sun, May 16, 2010 at 9:07 PM, Florian Pflug <fgp@phlo.org> wrote:
>>> On May 14, 2010, at 22:54 , Robert Haas wrote:
>>>> On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>> Florian Pflug <fgp@phlo.org> writes:
>>>>>> All in all, I believe that SHARE and UPDATE row-level locks should be
>>>>>> changed to cause concurrent UPDATEs to fail with a serialization
>>>>>> error.
>>>>>
>>>>> I don't see an argument for doing that for FOR SHARE locks, and it
>>>>> already happens for FOR UPDATE (at least if the row actually gets
>>>>> updated).  AFAICS this proposal mainly breaks things, in pursuit of
>>>>> an unnecessary and probably-impossible-anyway goal of making FK locking
>>>>> work with only user-level snapshots.
>>>>
>>>> After giving this considerable thought and testing the behavior at
>>>> some length, I think the OP has it right.  One thing I sometimes need
>>>> to do is denormalize a copy of a field, e.g.
>>>>
>>>> <snipped example>
>>>
>>> I've whipped up a quick and still rather dirty patch that implements the behavior I proposed, at least for the case
ofconflicts between FOR UPDATE locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a row that
hasconcurrently been FOR UPDATE locked will cause a serialization error. (The same for an actually updated row of
course,but that happened before too). 
>>>
>>> While this part of the patch was fairly straight forward, make FOR SHARE conflict too seems to be much harder. The
assumptionthat a lock becomes irrelevant after the transaction(s) that held it completely is built deeply into the
multixact machinery that powers SHARE locks. That machinery therefore assumes that once all members of a multi xact
havecompleted the multi xact is dead also. But my proposal depends on a SERIALIZABLE transaction being able to find if
anyof the lockers of a row are invisible under it's snapshot - for which it'd need any multi xact containing invisible
xidsto outlive its snapshot. 
>>
>> Thanks for putting this together. I suggest adding it to the open
>> CommitFest - even if we decide to go forward with this, I don't
>> imagine anyone is going to be excited about changing it during beta.
>>
>> https://commitfest.postgresql.org/action/commitfest_view/open
>
>
> Will do. Thanks for the link.
>
> Here is an updated version that works for SHARE locks too.

Forgetting to run "make check" before sending a patch is bad, as I just proved :-(

For the archives' and the commitfest app's sake, here is a version that actually passes the regression tests.

To make up for it, I also did some testing with a custom pgbench script & schema and proved the effectiveness of this
patch.I ran this with "pgbench  -s 10 -j 10 -c 10 -t 1000 -n -f fkbench.pgbench" on both HEAD and HEAD+patch. The
formererrors out quickly with "database inconsistent" while the later completes the pgbench run without errors.  

The patch still needs more work, at least on the comments & documentation side of things, but I'm going to let this
restnow while we're in beta. 

Patch, pgbench script and schema attached.


best regards,
Florian Pflug


Вложения

Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On May 21, 2010, at 4:20 , Florian Pflug wrote:
> On May 19, 2010, at 2:15 , Florian Pflug wrote:
>> On May 17, 2010, at 3:30 , Robert Haas wrote:
>>> On Sun, May 16, 2010 at 9:07 PM, Florian Pflug <fgp@phlo.org> wrote:
>>>> On May 14, 2010, at 22:54 , Robert Haas wrote:
>>>>> On Thu, May 13, 2010 at 5:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>>> Florian Pflug <fgp@phlo.org> writes:
>>>>>>> All in all, I believe that SHARE and UPDATE row-level locks should be
>>>>>>> changed to cause concurrent UPDATEs to fail with a serialization
>>>>>>> error.
>>>>>>
>>>>>> I don't see an argument for doing that for FOR SHARE locks, and it
>>>>>> already happens for FOR UPDATE (at least if the row actually gets
>>>>>> updated).  AFAICS this proposal mainly breaks things, in pursuit of
>>>>>> an unnecessary and probably-impossible-anyway goal of making FK locking
>>>>>> work with only user-level snapshots.
>>>>>
>>>>> After giving this considerable thought and testing the behavior at
>>>>> some length, I think the OP has it right.  One thing I sometimes need
>>>>> to do is denormalize a copy of a field, e.g.
>>>>>
>>>>> <snipped example>
>>>>
>>>> I've whipped up a quick and still rather dirty patch that implements the behavior I proposed, at least for the
caseof conflicts between FOR UPDATE locks and updates. With the patch, any attempt to UPDATE or FOR UPDATE lock a row
thathas concurrently been FOR UPDATE locked will cause a serialization error. (The same for an actually updated row of
course,but that happened before too). 
>>>>
>>>> While this part of the patch was fairly straight forward, make FOR SHARE conflict too seems to be much harder. The
assumptionthat a lock becomes irrelevant after the transaction(s) that held it completely is built deeply into the
multixact machinery that powers SHARE locks. That machinery therefore assumes that once all members of a multi xact
havecompleted the multi xact is dead also. But my proposal depends on a SERIALIZABLE transaction being able to find if
anyof the lockers of a row are invisible under it's snapshot - for which it'd need any multi xact containing invisible
xidsto outlive its snapshot. 
>>>
>>> Thanks for putting this together. I suggest adding it to the open
>>> CommitFest - even if we decide to go forward with this, I don't
>>> imagine anyone is going to be excited about changing it during beta.
>>>
>>> https://commitfest.postgresql.org/action/commitfest_view/open
>>
>>
>> Will do. Thanks for the link.
>>
>> Here is an updated version that works for SHARE locks too.
>
> Forgetting to run "make check" before sending a patch is bad, as I just proved :-(
>
> For the archives' and the commitfest app's sake, here is a version that actually passes the regression tests.
>
> To make up for it, I also did some testing with a custom pgbench script & schema and proved the effectiveness of this
patch.I ran this with "pgbench  -s 10 -j 10 -c 10 -t 1000 -n -f fkbench.pgbench" on both HEAD and HEAD+patch. The
formererrors out quickly with "database inconsistent" while the later completes the pgbench run without errors.  
>
> The patch still needs more work, at least on the comments & documentation side of things, but I'm going to let this
restnow while we're in beta. 
>
> Patch, pgbench script and schema attached.

Great, now my mail client decided to send encode those attachments with MacBinary instead of sending them as plain text
:-(

Not sure if MUAs other than Mail.app can open those, so I'm resending this. Really sorry for the noise, guys

best regards,
Florian Pflug


Вложения

Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
"Kevin Grittner"
Дата:
=================
Submission review
=================
* Is the patch in context diff format?
Yes.
* Does it apply cleanly to the current CVS HEAD?
Yes.
* Does it include reasonable tests, necessary doc patches, etc? 
There is one pgbench test which shows incorrect behavior without the
patch and correct behavior with the patch for a significant use
case.
Documentation changes are needed in the "Concurrency Control"
chapter.
================
Usability review
================

* Does the patch actually implement that?
Yes.
* Do we want that?
Yes.  We seem to have reached consensus on the -hackers list to that
effect.  On a personal note, I heard some current Oracle users who
were considering a switch to PostgreSQL grumbling after Robert's
trigger presentation at PostgreSQL Conference U.S. East about how
they didn't need to use such complex coding techniques to ensure
data integrity in Oracle as is required in PostgreSQL.  I was
surprised, since I know that they also get snapshot isolation when
they request serializable, but they explained that SELECT FOR UPDATE
provides stronger guarantees in Oracle.  This patch should provide
equivalent behavior, which should ease migration from Oracle and
allow simpler coding techniques in snapshot isolation to protect
data.
* Do we already have it?
No.
* Does it follow SQL spec, or the community-agreed behavior?
It's not in the spec, but it moves to safer behavior which is
consistent with the current Oracle implementation.
* Does it include pg_dump support (if applicable)?
Not applicable.
* Are there dangers?
Some code which continues after blocking will now get a
serialization failure.  It's possible that this could cause problems
for some existing code, although that code was likely either using
SELECT FOR UPDATE unnecessarily or was unsafe without this patch.
* Have all the bases been covered? 
As far as I can see.
============
Feature test
============

* Does the feature work as advertised?
Yes.
* Are there corner cases the author has failed to consider?
Not that I found.
* Are there any assertion failures or crashes?
No.
==================
Performance review
==================
* Does the patch slow down simple tests?
No.
* If it claims to improve performance, does it?
It makes no such claim.
* Does it slow down other things?
No.
=============
Coding review
=============
* Does it follow the project coding guidelines?
Comments are not all in standard style.
In some cases there are unnecessary braces around a single statement
for an *if* or *else*.
There are function where the last two parameters were changed from: Snapshot crosscheck, bool wait
to: bool wait, Snapshot lockcheck_snapshot
It appears to be so that the semantic change to the use of the
snapshot doesn't break code at runtime without forcing the
programmer to notice the change based on compile errors, which seems
like a Good Thing.
* Are there portability issues?
No.
* Will it work on Windows/BSD etc?
Yes.
* Are the comments sufficient and accurate?
Given that there is a significant behavioral change, it seems as
though there could be a sentence or two somewhere concisely stating
the how things behave, but I'm not sure quite where it would go. 
Perhaps something in the README file in the access/transam
directory?
* Does it do what it says, correctly?
Yes.
* Does it produce compiler warnings?
No.
* Can you make it crash?
No.
===================
Architecture review
===================
* Is everything done in a way that fits together coherently with other features/modules?
I think so.
* Are there interdependencies that can cause problems?
Not that I could identify with committed code.
I was concerned about its interaction with the other serializable
patch (by myself and Dan Ports), so I also combined the patches and
tested.  Florian's pgbench test did expose bugs in the *other*
patch, which I then fixed in the combined setting.  There was still
some breakage in the other patch when Florian's patch was backed
out, so at the moment, this patch would appear to be a
*prerequisite* for the other.  (That can hopefully be corrected
by changes to the other patch.)
Also, I'm attempting to adapt the dcheck tests for the other patch
to work with this patch.  If successful, I'll post with the results
of that additional testing.



Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Joe Conway
Дата:
On 07/17/2010 09:25 AM, Kevin Grittner wrote:
> I was concerned about its interaction with the other serializable
> patch (by myself and Dan Ports), so I also combined the patches and
> tested.  Florian's pgbench test did expose bugs in the *other*
> patch, which I then fixed in the combined setting.  There was still
> some breakage in the other patch when Florian's patch was backed
> out, so at the moment, this patch would appear to be a
> *prerequisite* for the other.  (That can hopefully be corrected
> by changes to the other patch.)

I am currently reading the Cahill paper [1] in preparation for reviewing
your patch [2]. Should I be installing Florian's patch in addition to
yours when I start testing? Also, where can I get the latest and
greatest version of your patch?

Thanks,

Joe

[1]
http://ses.library.usyd.edu.au/bitstream/2123/5353/1/michael-cahill-2009-thesis.pdf

[2] https://commitfest.postgresql.org/action/patch_view?id=310

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & Support


Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On Jul17, 2010, at 18:25 , Kevin Grittner wrote:
> * Does it include reasonable tests, necessary doc patches, etc?
>
> Documentation changes are needed in the "Concurrency Control"
> chapter.
>
> <...>
>
> * Do we want that?
>
> Yes.  We seem to have reached consensus on the -hackers list to that
> effect.
I've kinda waited for the latter before putting work into the former, so I guess I should get going.

> * Does the patch slow down simple tests?
The part that I'm slightly nervous about regarding performance regressions is the call of MultiXactIdSetOldestVisible()
Ihad to add to GetTransactionSnapshot() (Though only for serializable transactions). That could increase the pressure
onthe multi xact machinery since for some workloads since it will increase the time a multi xact stays alive. I don't
seea way around that, though, since the patch depends on being able to find multi xact members which are invisible to a
serializabletransaction's snapshot. 

> * Does it follow the project coding guidelines?
>
> Comments are not all in standard style.
Does that refer to the language used, or to the formatting?

> In some cases there are unnecessary braces around a single statement
> for an *if* or *else*.
Will fix.

> There are function where the last two parameters were changed from:
>
>  Snapshot crosscheck, bool wait
>
> to:
>
>  bool wait, Snapshot lockcheck_snapshot
>
> It appears to be so that the semantic change to the use of the
> snapshot doesn't break code at runtime without forcing the
> programmer to notice the change based on compile errors, which seems
> like a Good Thing.
Yeah, that was the idea.

Btw, while the patch obsoletes the crosscheck snapshot, it currently doesn't remove its traces of it throughout the
executorand the ri triggers. Mainly because I felt doing so would make forward-porting and reviewing harder without any
gain.But ultimately, those traces should probably all go, unless someone feels that for some #ifdef NOT_USED is
preferable. 

> * Are the comments sufficient and accurate?
>
> Given that there is a significant behavioral change, it seems as
> though there could be a sentence or two somewhere concisely stating
> the how things behave, but I'm not sure quite where it would go.
> Perhaps something in the README file in the access/transam
> directory?
Hm, I'll try to come up with something.

> * Are there interdependencies that can cause problems?
>
> Also, I'm attempting to adapt the dcheck tests for the other patch
> to work with this patch.  If successful, I'll post with the results
> of that additional testing.

Cool! And thanks for the work you put into reviewing this! I'll update the documentation and comments ASAP. Probably
notwithin the next few days though, since I'm unfortunately quite busy at the moment, trying to finally complete my
master'sthesis. 

best regards,
Florian Pflug



Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
"Kevin Grittner"
Дата:
Florian Pflug <fgp@phlo.org> wrote:
> On Jul17, 2010, at 18:25 , Kevin Grittner wrote:
>> * Does it follow the project coding guidelines?
>> 
>> Comments are not all in standard style.
> Does that refer to the language used, or to the formatting?
Formatting.  Comment style seems to be defined here:
http://developer.postgresql.org/pgdocs/postgres/source-format.html
as being:
/** comment text begins here* and continues here*/
You have these formats in your patch:
/* comment text begins here* and continues here*/
/* comment text begins here  and continues here */
/* One line comment like this. */
That last one is actually pretty common in PostgreSQL source, so I'm
not sure that its omission from the style page isn't accidental.
> Btw, while the patch obsoletes the crosscheck snapshot, it
> currently doesn't remove its traces of it throughout the executor
> and the ri triggers. Mainly because I felt doing so would make
> forward-porting and reviewing harder without any gain. But
> ultimately, those traces should probably all go, unless someone
> feels that for some #ifdef NOT_USED is preferable.
My view is that we have a revision control system which makes the
code easy to restore, should it be found to be useful again.  If it
has no use at the point of applying this patch, my inclination would
be to delete it.  If you're particularly concerned that it could
later be useful, you might want to do that deletion in as a separate
patch to facilitate later resurrection of the code.
-Kevin


Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Andrew Dunstan
Дата:

Kevin Grittner wrote:
> Comment style seems to be defined here:
>  
> http://developer.postgresql.org/pgdocs/postgres/source-format.html
>  
> as being:
>  
> /*
>  * comment text begins here
>  * and continues here
>  */
>  
> You have these formats in your patch:
>  
> /* comment text begins here
>  * and continues here
>  */
>  
> /* comment text begins here
>    and continues here */
>  
> /* One line comment like this. */
>  
> That last one is actually pretty common in PostgreSQL source, so I'm
> not sure that its omission from the style page isn't accidental.
>  
>   
>   

The style doc talks about a standard for multi-line comments - it 
doesn't forbid single line comments.

cheers

andrew


Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
Hi

I've updated mvcc.sgml to explain the new serialization conflict rules for row-level locks, and added a paragraph to
backend/executor/READMEthat explains the implementation of those. I've chosen backend/executor/README because it
alreadycontains a description of UPDATE handling in READ COMMITTED mode, which seemed at least somewhat related. 

The patch now removes all code dealing with the crosscheck_snapshot, since the RI triggers should now be safe without
that.

I've also fixed the formatting of multi-line comments to match the coding style. I kept the braces around single-line
"if"or "else" statements wherever both "if" and "else" blocks were present and one of them wasn't a single-line block. 

I think, in addition to the documentation changes this patch contains, that a section on how to write RI triggers in
pl/pgsqlwould be nice to have. It's not strictly part of the documentation of this feature though, more a potential
use-case,so I didn't tackle it for the moment. I do hope to find the time to write such a section, though, and if that
happensI'll post it as a separate documentation-only patch. 

I've pushed the changes to the branch serializable_row_locks on git://github.com/fgp/postgres.git and also attached a
patch.

best regards,
Florian Pflug

Вложения

Re: Review: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

От
Florian Pflug
Дата:
On Aug3, 2010, at 00:43 , Florian Pflug wrote:
> Sounds good. That'll also give me some time to test the RI trigger
> infrastructure now that I've removed the crosscheck code.

Ok, I've found some time do run some additional tests.

I've created a small test suite to compare the behaviour of native (cascading) FK constraints to an PLPGSQL
implementation.There is a dedicated child table (native_child respectively plpgsql_child) for each of the two FK
implementation.Into both child rows for random parents are inserted, creating the parent if it does not already exists.
Concurrently,random parent rows are deleted. 

The number of successful inserts into native_child respectively plpgsql_child and the number of successfull deletes are
counted,as well as the number of transactions failed due to either a serialization failure or a race condition during
theinsert (unique_violation or foreign_key_violation). 

To verify that things behave as expected, the test suite reports a histogram of the number of parents over the child
countafter the test completes. The expected distribution is output alongside that histogram, assuming that the number
ofparents with N children follows an exponential distribution. 

I've pushed the test suite togit@github.com:fgp/fk_concurrency.git
and put a summary of the results of my test runs onhttp://wiki.github.com/fgp/fk_concurrency/results.

The results look good. There is no significant change in the behaviour of FKs between HEAD and
HEAD+serializable_row_locks,and no significant difference between the native and plpgsql implementation. As expected,
theplpgsql implementation fails on an unpatched HEAD, aborting with the error "database inconsistent" pretty quickly. 

I'd be interesting to run the plpgsql implementation with the SHARE locking removed against HEAD+true_serializability
tosee if that changes the number of serialization_failures that occur. I'll do that, but I'll have to wait a bit for
anotherspare time window to open. 

best regards,
Florian Pflug