Обсуждение: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

Поиск
Список
Период
Сортировка

CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

От
Mario Splivalo
Дата:
The 'data integrity' rule for database I'm designing says that any
subject we're tracking (persons, companies, whatever) is assigned an
agreement that can be in several states: 'Approved', 'Unapproved' or
'Obsolete'. One subject can have only one (or none) 'Approved' or
'Unapproved' agreement, and as many (or none) 'Obsolete' agreements.

I was thinking on employing the CHECK constraint on agreements table
that would check that there is only one 'Approved' state per subject.

My (simplified) schema looks like this:

CREATE TYPE enum_agreement_state AS ENUM  ('unapproved',   'approved',   'obsolete');

CREATE TABLE subjects
( subject_id serial NOT NULL, subject_name character varying NOT NULL, CONSTRAINT subjects_pkey PRIMARY KEY
(subject_id)
);

CREATE TABLE agreements
( agreement_id serial NOT NULL, subject_id integer NOT NULL, agreement_state enum_agreement_state NOT NULL, CONSTRAINT
agreements_pkeyPRIMARY KEY (agreement_id), CONSTRAINT agreements_subject_id_fkey FOREIGN KEY (subject_id)
REFERENCESsubjects (subject_id) MATCH SIMPLE     ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT
check_agreements_onlyone_approvedCHECK
 
(check_agreements_onlyone_approved(subject_id))
);

CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer)
RETURNS boolean AS
$$
SELECTCASE COUNT(agreement_id)    WHEN 0 THEN true    WHEN 1 THEN true    ELSE falseEND FROM agreements WHERE
subject_id= $1 AND agreement_state = 'approved';
 
$$ LANGUAGE 'sql';

Now, the above does not work because CHECK function is fired BEFORE
actuall data modification takes place so I can end up with two rows with
'approved' state for particular subject_id. If I change the CASE...WHEN
conditions so that function returns TRUE only when there is 0 rows for
the state = 'approved' then I have problems with UPDATEing:

UPDATE agreements SET agreement_state = 'obsolete' where subject_id =
<whatever> AND agreement_state = 'approved'

That update will fail because the CHECK function is fired before the
actuall update, and there is allready a row with state = 'approved' in
the table.

Now, I know I could use triggers to achieve desired functionality but I
try to use triggers as seldom as I can. Often ETL scripts disable
triggers so I could end up with data integrity broken.

The 'proper' way to do this (as suggested by earlier posts on this
mailing list) is to use partial UNIQUE indexes, but I have problem with
that too: indexes are not part of DDL (no matter that primary key
constraints and/or unique constraints use indexes to employ those
constraints), and as far as I know there is no 'partial unique
constraint' in SQL?

Does anyone has better suggestion on how to employ the data-integrity
rules I have?

And, wouldn't it be better to have CHECK constraints check the data
AFTER data-modification? I also found no reference on using CHECK
constraints with user-defined functions on postgres manual - there
should be a mention of the way the CHECK constraint works - that is,
function referenced by CHECK constraint is fired BEFORE the actual data
modification occur. The error message is also misleading, for instance,
when I run the before mentioned UPDATE:
constraint_check=# update agreements set agreement_state = 'obsolete'
where subject_id = 1 and agreement_state = 'approved';
ERROR:  new row for relation "agreements" violates check constraint
"check_agreements_onlyone_approved"
Mario


Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

От
Jasen Betts
Дата:
On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote:
> The 'data integrity' rule for database I'm designing says that any
> subject we're tracking (persons, companies, whatever) is assigned an
> agreement that can be in several states: 'Approved', 'Unapproved' or
> 'Obsolete'. One subject can have only one (or none) 'Approved' or
> 'Unapproved' agreement, and as many (or none) 'Obsolete' agreements.

...

> The 'proper' way to do this (as suggested by earlier posts on this
> mailing list) is to use partial UNIQUE indexes, but I have problem with
> that too: indexes are not part of DDL (no matter that primary key
> constraints and/or unique constraints use indexes to employ those
> constraints), and as far as I know there is no 'partial unique
> constraint' in SQL?

huh?

create unique index agreements_approved_onlyone on agreements(subject_id) 
where agreement_state='approved';

> And, wouldn't it be better to have CHECK constraints check the data
> AFTER data-modification?

no.

> CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer)
> RETURNS boolean AS
> $$
> SELECT
>     CASE COUNT(agreement_id)
>         WHEN 0 THEN true
>         WHEN 1 THEN true
>         ELSE false
>     END FROM agreements WHERE subject_id = $1 AND agreement_state = 'approved';
> $$ LANGUAGE 'sql';
>
> Now, the above does not work because CHECK function is fired BEFORE
> actuall data modification takes place so I can end up with two rows with
> 'approved' state for particular subject_id. If I change the CASE...WHEN
> conditions so that function returns TRUE only when there is 0 rows for
> the state = 'approved' then I have problems with UPDATEing:
>
> UPDATE agreements SET agreement_state = 'obsolete' where subject_id =
><whatever> AND agreement_state = 'approved'
>
> That update will fail because the CHECK function is fired before the
> actuall update, and there is allready a row with state = 'approved' in
> the table.

fix the check so that it knows what the new state will be. then it test
if the proposed new state is compatible with the old state.

(but seriously, first explain why the index doesn't work)

...  CONSTRAINT check_agreements_onlyone_approved CHECK
(check_agreements_onlyone_approved(subject_id,agreement_id,agreement_state)));
CREATE OR REPLACE FUNCTION check_agreements_onlyone_approved(the_subject_id
integer, the_pkey integer, the_new_state enum_agreement_state)RETURNS boolean AS$$SELECT       CASE COUNT(agreement_id)
             WHEN 0 THEN true               WHEN 1 THEN $3 != 'approved'               ELSE false       END FROM
agreementsWHERE subject_id = $1 AND agreement_state
 
= 'approved' and agreement_id != $2 $$ LANGUAGE 'sql';

still not perfect: if you need to change the agreement_id this will
block you from doing that on approved agreements.




Newsgroups: gmane.comp.db.postgresql.sql
From: Jasen Betts <jasen@xnet.co.nz>
Subject: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?
References: <4BCC272C.3020505@megafon.hr>
Organization: Dis (not Dat) Organisation
Followup-To: 
X-Face:
?)Aw4rXwN5u0~$nqKj`xPz>xHCwgi^q+^?Ri*+R(&uv2=E1Q0Zk(>h!~o2ID@6{uf8s;a+M[5[U[QT7xFN%^gR"=tuJw%TXXR'Fp~W;(T"1(739R%m0Yyyv*gkGoPA.$b,D.w:z+<'"=-lVT?6{T?=R^:W5g|E2#EhjKCa+nt":4b}dU7GYB*HBxn&Td$@f%.kl^:7X8rQWd[NTc"P"u6nkisze/Q;8"9Z{peQF,w)7UjV$c|RO/mQW/NMgWfr5*$-Z%u46"/00mx-,\R'fLPe.)^

On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote:
> The 'data integrity' rule for database I'm designing says that any
> subject we're tracking (persons, companies, whatever) is assigned an
> agreement that can be in several states: 'Approved', 'Unapproved' or
> 'Obsolete'. One subject can have only one (or none) 'Approved' or
> 'Unapproved' agreement, and as many (or none) 'Obsolete' agreements.
>
> I was thinking on employing the CHECK constraint on agreements table
> that would check that there is only one 'Approved' state per subject.
>
> My (simplified) schema looks like this:
>
> CREATE TYPE enum_agreement_state AS ENUM
>    ('unapproved',
>     'approved',
>     'obsolete');
>
> CREATE TABLE subjects
> (
>   subject_id serial NOT NULL,
>   subject_name character varying NOT NULL,
>   CONSTRAINT subjects_pkey PRIMARY KEY (subject_id)
> );
>
> CREATE TABLE agreements
> (
>   agreement_id serial NOT NULL,
>   subject_id integer NOT NULL,
>   agreement_state enum_agreement_state NOT NULL,
>   CONSTRAINT agreements_pkey PRIMARY KEY (agreement_id),
>   CONSTRAINT agreements_subject_id_fkey FOREIGN KEY (subject_id)
>       REFERENCES subjects (subject_id) MATCH SIMPLE
>       ON UPDATE NO ACTION ON DELETE NO ACTION,

>   CONSTRAINT check_agreements_onlyone_approved CHECK
> (check_agreements_onlyone_approved(subject_id))
> );
>
> CREATE FUNCTION check_agreements_onlyone_approved(a_subject_id integer)
> RETURNS boolean AS
> $$
> SELECT
>     CASE COUNT(agreement_id)
>         WHEN 0 THEN true
>         WHEN 1 THEN true
>         ELSE false
>     END FROM agreements WHERE subject_id = $1 AND agreement_state = 'approved';
> $$ LANGUAGE 'sql';
>
> Now, the above does not work because CHECK function is fired BEFORE
> actuall data modification takes place so I can end up with two rows with
> 'approved' state for particular subject_id. If I change the CASE...WHEN
> conditions so that function returns TRUE only when there is 0 rows for
> the state = 'approved' then I have problems with UPDATEing:
>
> UPDATE agreements SET agreement_state = 'obsolete' where subject_id =
><whatever> AND agreement_state = 'approved'
>
> That update will fail because the CHECK function is fired before the
> actuall update, and there is allready a row with state = 'approved' in
> the table.
>
> Now, I know I could use triggers to achieve desired functionality but I
> try to use triggers as seldom as I can. Often ETL scripts disable
> triggers so I could end up with data integrity broken.
>
> The 'proper' way to do this (as suggested by earlier posts on this
> mailing list) is to use partial UNIQUE indexes, but I have problem with
> that too: indexes are not part of DDL (no matter that primary key
> constraints and/or unique constraints use indexes to employ those
> constraints), and as far as I know there is no 'partial unique
> constraint' in SQL?
>
> Does anyone has better suggestion on how to employ the data-integrity
> rules I have?
>
> And, wouldn't it be better to have CHECK constraints check the data
> AFTER data-modification? I also found no reference on using CHECK
> constraints with user-defined functions on postgres manual - there
> should be a mention of the way the CHECK constraint works - that is,
> function referenced by CHECK constraint is fired BEFORE the actual data
> modification occur. The error message is also misleading, for instance,
> when I run the before mentioned UPDATE:
>
>  constraint_check=# update agreements set agreement_state = 'obsolete'
> where subject_id = 1 and agreement_state = 'approved';
> ERROR:  new row for relation "agreements" violates check constraint
> "check_agreements_onlyone_approved"
>
>     Mario
>



Re: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

От
Mario Splivalo
Дата:
Jasen Betts wrote:
> ...
> 
>> The 'proper' way to do this (as suggested by earlier posts on this
>> mailing list) is to use partial UNIQUE indexes, but I have problem with
>> that too: indexes are not part of DDL (no matter that primary key
>> constraints and/or unique constraints use indexes to employ those
>> constraints), and as far as I know there is no 'partial unique
>> constraint' in SQL?
> 
> huh?
> 
> create unique index agreements_approved_onlyone on agreements(subject_id) 
> where agreement_state='approved';

That will create unique index - the way I'm doing it now. The difference
is just semantic - indexes are not part of DDL, they are RDBMS's aids on
improving database performance.

>> And, wouldn't it be better to have CHECK constraints check the data
>> AFTER data-modification?
> 
> no.

Why not?

>> Now, the above does not work because CHECK function is fired BEFORE
>> actuall data modification takes place so I can end up with two rows with
>> 'approved' state for particular subject_id. If I change the CASE...WHEN
>> conditions so that function returns TRUE only when there is 0 rows for
>> the state = 'approved' then I have problems with UPDATEing:
>>
>> UPDATE agreements SET agreement_state = 'obsolete' where subject_id =
>> <whatever> AND agreement_state = 'approved'
>>
>> That update will fail because the CHECK function is fired before the
>> actuall update, and there is allready a row with state = 'approved' in
>> the table.
> 
> fix the check so that it knows what the new state will be. then it test
> if the proposed new state is compatible with the old state.

Yes, but for that I need to know weather DML command is INSERT or
UPDATE, and function called by check constraint has no way of knowing
that. Yes, I know that I could use trigger, even constraint trigger, but
I prefer not to use triggers if I don't have to.

> (but seriously, first explain why the index doesn't work)

It does work, and the more I look at it that seems to be the only proper
way of doing what I want. But, as I've said, indexes are not DDL, and
strictly my data-integrity rule needs to be done within DDL, not
indexes. Since there is no other way I will, of course, use indexes.

(Similarly I'd encourage one to use UNIQUE CONSTRAINT over UNIQUE INDEX
because CONSTRAINTS are part of DDL, while indexes are not. It doesn't
matter that postgres, and any other RDBMS for that matter, will empower
UNIQUE CONSTRAINT using indexes.)

> 
> ...
>    CONSTRAINT check_agreements_onlyone_approved CHECK
>    (check_agreements_onlyone_approved(subject_id,agreement_id,agreement_state))
>  );
> 
>  CREATE OR REPLACE FUNCTION check_agreements_onlyone_approved(the_subject_id
> integer, the_pkey integer, the_new_state enum_agreement_state)
>  RETURNS boolean AS
>  $$
>  SELECT
>         CASE COUNT(agreement_id)
>                 WHEN 0 THEN true
>                 WHEN 1 THEN $3 != 'approved'
>                 ELSE false
>         END FROM agreements WHERE subject_id = $1 AND agreement_state
> = 'approved' and agreement_id != $2 
>  $$ LANGUAGE 'sql';
> 
> still not perfect: if you need to change the agreement_id this will
> block you from doing that on approved agreements.

Yes, this seems more and more as a bad idea :)
Mario


Re: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

От
Tom Lane
Дата:
Jasen Betts <jasen@xnet.co.nz> writes:
> On 2010-04-19, Mario Splivalo <mario.splivalo@megafon.hr> wrote:
>> The 'proper' way to do this (as suggested by earlier posts on this
>> mailing list) is to use partial UNIQUE indexes, but I have problem with
>> that too: indexes are not part of DDL (no matter that primary key
>> constraints and/or unique constraints use indexes to employ those
>> constraints), and as far as I know there is no 'partial unique
>> constraint' in SQL?

> huh?

I think what Mario is actually complaining about is that partial unique
indexes are not part of the SQL standard, and he wants a solution that
at least gives the illusion that it might be portable to some other
RDBMS in the future.

Unfortunately, an illusion is all it would be.  Even presuming that the
other DBMS lets you run plpgsql-equivalent functions in CHECK
constraints, the whole approach is broken by concurrency considerations.
If you have two transactions simultaneously inserting rows that would be
valid given the prior state of the table, but it's *not* valid for them
both to be present, then a CHECK or trigger-based constraint is going to
fail, because neither transaction will see the other's uncommitted row.
At least that's how it works in Postgres.  In some other DBMS it might
work differently, but you're right back up against the fact that your
solution is not portable.

Unique constraints (partial or otherwise) deal with the race-condition
problem by doing low-level things that aren't exposed at the SQL level.
So there's simply no way to get the equivalent behavior in pure standard
SQL.

>> And, wouldn't it be better to have CHECK constraints check the data
>> AFTER data-modification?

> no.

Indeed.  The race condition is still there.  CHECK is meant to handle
constraints on a row's value *in isolation*.  If you try to use it to
enforce cross-row conditions, the project will certainly end badly.
        regards, tom lane


Re: Re: CHECK constraints with plpgsql functions - check 'fires' BEFORE data modification?

От
Mario Splivalo
Дата:
Tom Lane wrote:
> 
> I think what Mario is actually complaining about is that partial unique
> indexes are not part of the SQL standard, and he wants a solution that
> at least gives the illusion that it might be portable to some other
> RDBMS in the future.

Correct. As far as I can see there is no partial unique constraints 
defined within the standard, and there are check constraints, so... 
Although, especially after this:

> Unfortunately, an illusion is all it would be.  Even presuming that the
> other DBMS lets you run plpgsql-equivalent functions in CHECK
> constraints, the whole approach is broken by concurrency considerations.
> If you have two transactions simultaneously inserting rows that would be
> valid given the prior state of the table, but it's *not* valid for them
> both to be present, then a CHECK or trigger-based constraint is going to
> fail, because neither transaction will see the other's uncommitted row.
> At least that's how it works in Postgres.  In some other DBMS it might
> work differently, but you're right back up against the fact that your
> solution is not portable.
> 
> Unique constraints (partial or otherwise) deal with the race-condition
> problem by doing low-level things that aren't exposed at the SQL level.
> So there's simply no way to get the equivalent behavior in pure standard
> SQL.

Thank you for the clarification, it's easy to understand now why using 
check constraints is a bad idea for the purpose I wanted to use them. 
It's also easy to see why 'proper' way is using partial unique indexes.

Still, one has to wonder why there are no partial unique constraints 
defined in SQL standard :)
Mario