Обсуждение: Atomicity of INSERT INTO ... SELECT ... WHERE NOT EXISTS ...
Given (pseudocode) CREATE TABLE kvstore ( k varchar primary key, v varchar); CREATE OR REPLACE FUNCTION store_key_value_pair(k varchar, v varchar) returns boolean as $$ BEGIN INSERT INTO kvstore (k, v) SELECT :k, :v WHERE NOT EXISTS (select 1 from kvstore where k = :k); RETURN FOUND; END; $$ LANGUAGE plpgsql; I have a few questions: 1) Does INSERT statement set FOUND based on whether or not the row was inserted? 2) If this is invoked without a transaction in progress, is there any guarantee of atomicity between checking the EXISTSand attempting to insert the row? If this is being executed in two (or more) sessions, can the SELECT succeed butthen have the INSERT fail with a duplicate-key exception? 3) Will the behavior be different if the invoking processes have a transaction in progress?
Jim Garrison wrote > Given (pseudocode) > > CREATE TABLE kvstore ( > k varchar primary key, > v varchar); > > CREATE OR REPLACE FUNCTION store_key_value_pair(k varchar, v varchar) > returns boolean as $$ > BEGIN > INSERT INTO kvstore (k, v) > SELECT :k, :v > WHERE NOT EXISTS (select 1 from kvstore where k = :k); > RETURN FOUND; > END; > $$ LANGUAGE plpgsql; > > I have a few questions: > > 1) Does INSERT statement set FOUND based on whether or not the row was > inserted? > 2) If this is invoked without a transaction in progress, is there any > guarantee of atomicity between checking the EXISTS and attempting to > insert the row? If this is being executed in two (or more) sessions, can > the SELECT succeed but then have the INSERT fail with a duplicate-key > exception? > 3) Will the behavior be different if the invoking processes have a > transaction in progress? 1) The top-level query controls FOUND; so yes 2) Impossible - functions always execute in a transaction. Actually, everything executes in a transaction the choice is whether you want to auto-commit. 3) see #2 Still not super fluent wrt concurrency reasoning but here it goes: Since we are dealing with MVCC here, and the default READ COMMITTED isolation level, the data that each session/statement would see would be stable and not include any INSERTs concurrently performed by the other. Thus if two sessions try to simultaneously insert the same (k,v) the one that commits second will error. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Atomicity-of-INSERT-INTO-SELECT-WHERE-NOT-EXISTS-tp5816655p5816668.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Jim Garrison <jim.garrison@nwea.org> writes: > Given (pseudocode) > > CREATE TABLE kvstore ( > k varchar primary key, > v varchar); > > CREATE OR REPLACE FUNCTION store_key_value_pair(k varchar, v varchar) returns boolean as $$ > BEGIN > INSERT INTO kvstore (k, v) > SELECT :k, :v > WHERE NOT EXISTS (select 1 from kvstore where k = :k); > RETURN FOUND; > END; > $$ LANGUAGE plpgsql; > > I have a few questions: > > 1) Does INSERT statement set FOUND based on whether or not the row was inserted? Yes unless triggers/rules are in volved.. > 2) If this is invoked without a transaction in progress, is there any > guarantee of atomicity between checking the EXISTS and attempting to > insert the row? If this is being executed in two (or more) sessions, > can the SELECT succeed but then have the INSERT fail with a > duplicate-key exception? You will either be at risk of a race condition or more likely your insert will try the dupe insert and block waiting for another session that's already inserted same row to commit/abort. And if other guy does commit then you will raise a dupe key exception. Your code is wise trying to avoid a dupe insert if the row is already existing and visible but you will still need to trap the dupe key exception when you get one which will happen eventually if there will be other sessions trying this same insert. > 3) Will the behavior be different if the invoking processes have a > transaction in progress? Every statement you run is a transaction of its own but you are far more at risk of testing negative for an existing row, proceeding to try the insert but then hanging because there is the same insert already pending... if there are longer running complex transactions involved. Suppose... session A begin; insert into your table key=1 ... do more work here... meanwhile... session B test for row and I don't see it try insert and hang here till commit/abort of session A session A commit session B Doh!! dupe key error HTH -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 312.241.7800
On 08/28/2014 06:22 AM, Jim Garrison wrote: > Given (pseudocode) > > CREATE TABLE kvstore ( > k varchar primary key, > v varchar); > > CREATE OR REPLACE FUNCTION store_key_value_pair(k varchar, v varchar) returns boolean as $$ > BEGIN > INSERT INTO kvstore (k, v) > SELECT :k, :v > WHERE NOT EXISTS (select 1 from kvstore where k = :k); > RETURN FOUND; > END; > $$ LANGUAGE plpgsql; > > I have a few questions: > > 1) Does INSERT statement set FOUND based on whether or not the row was inserted? > 2) If this is invoked without a transaction in progress, is there any guarantee of atomicity between checking the EXISTSand attempting to insert the row? If this is being executed in two (or more) sessions, can the SELECT succeed butthen have the INSERT fail with a duplicate-key exception? This code can still fail with a unique violation, yes, as the select can occur in both transactions then the insert in both. > 3) Will the behavior be different if the invoking processes have a transaction in progress? No, because all functions run in transactions. There is no such thing as "not in a transaction" in PostgreSQL (except for a few special system management commands). If it's in a SERIALIZABLE transaction instead of the default READ COMMITTED then it might fail with a serialization failure instead of a unique violation, but it'll still fail. Please read the detailed guidance on this problem that already exists: http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/ http://stackoverflow.com/questions/1109061/insert-on-duplicate-update-in-postgresql -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services