Обсуждение: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?
CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?
От
Mario Splivalo
Дата:
The 'data integrity' rule for database I'm designing says that any subject we're tracking (persons, companies, whatever) is assigned an agreement that can be in several states: 'Approved', 'Unapproved' or 'Obsolete'. One subject can have only one (or none) 'Approved' or 'Unapproved' agreement, and as many (or none) 'Obsolete' agreements. I was thinking on employing the CHECK constraint on agreements table that would check that there is only one 'Approved' state per subject. My (simplified) schema looks like this: CREATE TYPE enum_agreement_state AS ENUM ('unapproved', 'approved', 'obsolete'); CREATE TABLE subjects ( subject_id serial NOT NULL, subject_name character varying NOT NULL, CONSTRAINT subjects_pkey PRIMARY KEY (subject_id) ); CREATE TABLE agreements ( agreement_id serial NOT NULL, subject_id integer NOT NULL, agreement_state enum_agreement_state NOT NULL, CONSTRAINT agreements_pkeyPRIMARY KEY (agreement_id), CONSTRAINT agreements_subject_id_fkey FOREIGN KEY (subject_id) REFERENCESsubjects (subject_id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT check_agreements_onlyone_approvedCHECK (check_agreements_onlyone_approved(subject_id)) ); CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer) RETURNS boolean AS $$ SELECTCASE COUNT(agreement_id) WHEN 0 THEN true WHEN 1 THEN true ELSE falseEND FROM agreements WHERE subject_id= $1 AND agreement_state = 'approved'; $$ LANGUAGE 'sql'; Now, the above does not work because CHECK function is fired BEFORE actuall data modification takes place so I can end up with two rows with 'approved' state for particular subject_id. If I change the CASE...WHEN conditions so that function returns TRUE only when there is 0 rows for the state = 'approved' then I have problems with UPDATEing: UPDATE agreements SET agreement_state = 'obsolete' where subject_id = <whatever> AND agreement_state = 'approved' That update will fail because the CHECK function is fired before the actuall update, and there is allready a row with state = 'approved' in the table. Now, I know I could use triggers to achieve desired functionality but I try to use triggers as seldom as I can. Often ETL scripts disable triggers so I could end up with data integrity broken. The 'proper' way to do this (as suggested by earlier posts on this mailing list) is to use partial UNIQUE indexes, but I have problem with that too: indexes are not part of DDL (no matter that primary key constraints and/or unique constraints use indexes to employ those constraints), and as far as I know there is no 'partial unique constraint' in SQL? Does anyone has better suggestion on how to employ the data-integrity rules I have? And, wouldn't it be better to have CHECK constraints check the data AFTER data-modification? I also found no reference on using CHECK constraints with user-defined functions on postgres manual - there should be a mention of the way the CHECK constraint works - that is, function referenced by CHECK constraint is fired BEFORE the actual data modification occur. The error message is also misleading, for instance, when I run the before mentioned UPDATE: constraint_check=# update agreements set agreement_state = 'obsolete' where subject_id = 1 and agreement_state = 'approved'; ERROR: new row for relation "agreements" violates check constraint "check_agreements_onlyone_approved" Mario
On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote: > The 'data integrity' rule for database I'm designing says that any > subject we're tracking (persons, companies, whatever) is assigned an > agreement that can be in several states: 'Approved', 'Unapproved' or > 'Obsolete'. One subject can have only one (or none) 'Approved' or > 'Unapproved' agreement, and as many (or none) 'Obsolete' agreements. ... > The 'proper' way to do this (as suggested by earlier posts on this > mailing list) is to use partial UNIQUE indexes, but I have problem with > that too: indexes are not part of DDL (no matter that primary key > constraints and/or unique constraints use indexes to employ those > constraints), and as far as I know there is no 'partial unique > constraint' in SQL? huh? create unique index agreements_approved_onlyone on agreements(subject_id) where agreement_state='approved'; > And, wouldn't it be better to have CHECK constraints check the data > AFTER data-modification? no. > CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer) > RETURNS boolean AS > $$ > SELECT > CASE COUNT(agreement_id) > WHEN 0 THEN true > WHEN 1 THEN true > ELSE false > END FROM agreements WHERE subject_id = $1 AND agreement_state = 'approved'; > $$ LANGUAGE 'sql'; > > Now, the above does not work because CHECK function is fired BEFORE > actuall data modification takes place so I can end up with two rows with > 'approved' state for particular subject_id. If I change the CASE...WHEN > conditions so that function returns TRUE only when there is 0 rows for > the state = 'approved' then I have problems with UPDATEing: > > UPDATE agreements SET agreement_state = 'obsolete' where subject_id = ><whatever> AND agreement_state = 'approved' > > That update will fail because the CHECK function is fired before the > actuall update, and there is allready a row with state = 'approved' in > the table. fix the check so that it knows what the new state will be. then it test if the proposed new state is compatible with the old state. (but seriously, first explain why the index doesn't work) ... CONSTRAINT check_agreements_onlyone_approved CHECK (check_agreements_onlyone_approved(subject_id,agreement_id,agreement_state))); CREATE OR REPLACE FUNCTION check_agreements_onlyone_approved(the_subject_id integer, the_pkey integer, the_new_state enum_agreement_state)RETURNS boolean AS$$SELECT CASE COUNT(agreement_id) WHEN 0 THEN true WHEN 1 THEN $3 != 'approved' ELSE false END FROM agreementsWHERE subject_id = $1 AND agreement_state = 'approved' and agreement_id != $2 $$ LANGUAGE 'sql'; still not perfect: if you need to change the agreement_id this will block you from doing that on approved agreements. Newsgroups: gmane.comp.db.postgresql.sql From: Jasen Betts <jasen@xnet.co.nz> Subject: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification? References: <4BCC272C.3020505@megafon.hr> Organization: Dis (not Dat) Organisation Followup-To: X-Face: ?)Aw4rXwN5u0~$nqKj`xPz>xHCwgi^q+^?Ri*+R(&uv2=E1Q0Zk(>h!~o2ID@6{uf8s;a+M[5[U[QT7xFN%^gR"=tuJw%TXXR'Fp~W;(T"1(739R%m0Yyyv*gkGoPA.$b,D.w:z+<'"=-lVT?6{T?=R^:W5g|E2#EhjKCa+nt":4b}dU7GYB*HBxn&Td$@f%.kl^:7X8rQWd[NTc"P"u6nkisze/Q;8"9Z{peQF,w)7UjV$c|RO/mQW/NMgWfr5*$-Z%u46"/00mx-,\R'fLPe.)^ On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote: > The 'data integrity' rule for database I'm designing says that any > subject we're tracking (persons, companies, whatever) is assigned an > agreement that can be in several states: 'Approved', 'Unapproved' or > 'Obsolete'. One subject can have only one (or none) 'Approved' or > 'Unapproved' agreement, and as many (or none) 'Obsolete' agreements. > > I was thinking on employing the CHECK constraint on agreements table > that would check that there is only one 'Approved' state per subject. > > My (simplified) schema looks like this: > > CREATE TYPE enum_agreement_state AS ENUM > ('unapproved', > 'approved', > 'obsolete'); > > CREATE TABLE subjects > ( > subject_id serial NOT NULL, > subject_name character varying NOT NULL, > CONSTRAINT subjects_pkey PRIMARY KEY (subject_id) > ); > > CREATE TABLE agreements > ( > agreement_id serial NOT NULL, > subject_id integer NOT NULL, > agreement_state enum_agreement_state NOT NULL, > CONSTRAINT agreements_pkey PRIMARY KEY (agreement_id), > CONSTRAINT agreements_subject_id_fkey FOREIGN KEY (subject_id) > REFERENCES subjects (subject_id) MATCH SIMPLE > ON UPDATE NO ACTION ON DELETE NO ACTION, > CONSTRAINT check_agreements_onlyone_approved CHECK > (check_agreements_onlyone_approved(subject_id)) > ); > > CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer) > RETURNS boolean AS > $$ > SELECT > CASE COUNT(agreement_id) > WHEN 0 THEN true > WHEN 1 THEN true > ELSE false > END FROM agreements WHERE subject_id = $1 AND agreement_state = 'approved'; > $$ LANGUAGE 'sql'; > > Now, the above does not work because CHECK function is fired BEFORE > actuall data modification takes place so I can end up with two rows with > 'approved' state for particular subject_id. If I change the CASE...WHEN > conditions so that function returns TRUE only when there is 0 rows for > the state = 'approved' then I have problems with UPDATEing: > > UPDATE agreements SET agreement_state = 'obsolete' where subject_id = ><whatever> AND agreement_state = 'approved' > > That update will fail because the CHECK function is fired before the > actuall update, and there is allready a row with state = 'approved' in > the table. > > Now, I know I could use triggers to achieve desired functionality but I > try to use triggers as seldom as I can. Often ETL scripts disable > triggers so I could end up with data integrity broken. > > The 'proper' way to do this (as suggested by earlier posts on this > mailing list) is to use partial UNIQUE indexes, but I have problem with > that too: indexes are not part of DDL (no matter that primary key > constraints and/or unique constraints use indexes to employ those > constraints), and as far as I know there is no 'partial unique > constraint' in SQL? > > Does anyone has better suggestion on how to employ the data-integrity > rules I have? > > And, wouldn't it be better to have CHECK constraints check the data > AFTER data-modification? I also found no reference on using CHECK > constraints with user-defined functions on postgres manual - there > should be a mention of the way the CHECK constraint works - that is, > function referenced by CHECK constraint is fired BEFORE the actual data > modification occur. The error message is also misleading, for instance, > when I run the before mentioned UPDATE: > > constraint_check=# update agreements set agreement_state = 'obsolete' > where subject_id = 1 and agreement_state = 'approved'; > ERROR: new row for relation "agreements" violates check constraint > "check_agreements_onlyone_approved" > > Mario >
Re: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?
От
Mario Splivalo
Дата:
Jasen Betts wrote: > ... > >> The 'proper' way to do this (as suggested by earlier posts on this >> mailing list) is to use partial UNIQUE indexes, but I have problem with >> that too: indexes are not part of DDL (no matter that primary key >> constraints and/or unique constraints use indexes to employ those >> constraints), and as far as I know there is no 'partial unique >> constraint' in SQL? > > huh? > > create unique index agreements_approved_onlyone on agreements(subject_id) > where agreement_state='approved'; That will create unique index - the way I'm doing it now. The difference is just semantic - indexes are not part of DDL, they are RDBMS's aids on improving database performance. >> And, wouldn't it be better to have CHECK constraints check the data >> AFTER data-modification? > > no. Why not? >> Now, the above does not work because CHECK function is fired BEFORE >> actuall data modification takes place so I can end up with two rows with >> 'approved' state for particular subject_id. If I change the CASE...WHEN >> conditions so that function returns TRUE only when there is 0 rows for >> the state = 'approved' then I have problems with UPDATEing: >> >> UPDATE agreements SET agreement_state = 'obsolete' where subject_id = >> <whatever> AND agreement_state = 'approved' >> >> That update will fail because the CHECK function is fired before the >> actuall update, and there is allready a row with state = 'approved' in >> the table. > > fix the check so that it knows what the new state will be. then it test > if the proposed new state is compatible with the old state. Yes, but for that I need to know weather DML command is INSERT or UPDATE, and function called by check constraint has no way of knowing that. Yes, I know that I could use trigger, even constraint trigger, but I prefer not to use triggers if I don't have to. > (but seriously, first explain why the index doesn't work) It does work, and the more I look at it that seems to be the only proper way of doing what I want. But, as I've said, indexes are not DDL, and strictly my data-integrity rule needs to be done within DDL, not indexes. Since there is no other way I will, of course, use indexes. (Similarly I'd encourage one to use UNIQUE CONSTRAINT over UNIQUE INDEX because CONSTRAINTS are part of DDL, while indexes are not. It doesn't matter that postgres, and any other RDBMS for that matter, will empower UNIQUE CONSTRAINT using indexes.) > > ... > CONSTRAINT check_agreements_onlyone_approved CHECK > (check_agreements_onlyone_approved(subject_id,agreement_id,agreement_state)) > ); > > CREATE OR REPLACE FUNCTION check_agreements_onlyone_approved(the_subject_id > integer, the_pkey integer, the_new_state enum_agreement_state) > RETURNS boolean AS > $$ > SELECT > CASE COUNT(agreement_id) > WHEN 0 THEN true > WHEN 1 THEN $3 != 'approved' > ELSE false > END FROM agreements WHERE subject_id = $1 AND agreement_state > = 'approved' and agreement_id != $2 > $$ LANGUAGE 'sql'; > > still not perfect: if you need to change the agreement_id this will > block you from doing that on approved agreements. Yes, this seems more and more as a bad idea :) Mario
Jasen Betts <jasen@xnet.co.nz> writes: > On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote: >> The 'proper' way to do this (as suggested by earlier posts on this >> mailing list) is to use partial UNIQUE indexes, but I have problem with >> that too: indexes are not part of DDL (no matter that primary key >> constraints and/or unique constraints use indexes to employ those >> constraints), and as far as I know there is no 'partial unique >> constraint' in SQL? > huh? I think what Mario is actually complaining about is that partial unique indexes are not part of the SQL standard, and he wants a solution that at least gives the illusion that it might be portable to some other RDBMS in the future. Unfortunately, an illusion is all it would be. Even presuming that the other DBMS lets you run plpgsql-equivalent functions in CHECK constraints, the whole approach is broken by concurrency considerations. If you have two transactions simultaneously inserting rows that would be valid given the prior state of the table, but it's *not* valid for them both to be present, then a CHECK or trigger-based constraint is going to fail, because neither transaction will see the other's uncommitted row. At least that's how it works in Postgres. In some other DBMS it might work differently, but you're right back up against the fact that your solution is not portable. Unique constraints (partial or otherwise) deal with the race-condition problem by doing low-level things that aren't exposed at the SQL level. So there's simply no way to get the equivalent behavior in pure standard SQL. >> And, wouldn't it be better to have CHECK constraints check the data >> AFTER data-modification? > no. Indeed. The race condition is still there. CHECK is meant to handle constraints on a row's value *in isolation*. If you try to use it to enforce cross-row conditions, the project will certainly end badly. regards, tom lane
Re: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?
От
Mario Splivalo
Дата:
Tom Lane wrote: > > I think what Mario is actually complaining about is that partial unique > indexes are not part of the SQL standard, and he wants a solution that > at least gives the illusion that it might be portable to some other > RDBMS in the future. Correct. As far as I can see there is no partial unique constraints defined within the standard, and there are check constraints, so... Although, especially after this: > Unfortunately, an illusion is all it would be. Even presuming that the > other DBMS lets you run plpgsql-equivalent functions in CHECK > constraints, the whole approach is broken by concurrency considerations. > If you have two transactions simultaneously inserting rows that would be > valid given the prior state of the table, but it's *not* valid for them > both to be present, then a CHECK or trigger-based constraint is going to > fail, because neither transaction will see the other's uncommitted row. > At least that's how it works in Postgres. In some other DBMS it might > work differently, but you're right back up against the fact that your > solution is not portable. > > Unique constraints (partial or otherwise) deal with the race-condition > problem by doing low-level things that aren't exposed at the SQL level. > So there's simply no way to get the equivalent behavior in pure standard > SQL. Thank you for the clarification, it's easy to understand now why using check constraints is a bad idea for the purpose I wanted to use them. It's also easy to see why 'proper' way is using partial unique indexes. Still, one has to wonder why there are no partial unique constraints defined in SQL standard :) Mario