Обсуждение: plpgsql constraint checked data fails to restore

Поиск
Список
Период
Сортировка

plpgsql constraint checked data fails to restore

От
"Lee Harr"
Дата:
I have a database running 8.0.1

One of the tables uses a plpgsql function as a
check constraint. There is data in the table that
passed the constraint.

The problem comes when trying to restore the
database using a file created by pg_dump.

Some of the data required by the check function
is being restored after the data being checked
by the function and so it all fails the constraint.

I have a small (200 lines) dump file which shows
the problem if that will help.

Is there a way to ensure that the checked data
gets listed last in the dump file? Some other
way?

I actually have this database deployed on 7.4.x
so a fix that also works for that series would be
ideal...

Thanks for your time.

_________________________________________________________________
Don't just search. Find. Check out the new MSN Search!
http://search.msn.click-url.com/go/onm00200636ave/direct/01/


Re: plpgsql constraint checked data fails to restore

От
Michael Fuhr
Дата:
On Mon, Jun 20, 2005 at 05:49:05AM +0430, Lee Harr wrote:
>
> I have a database running 8.0.1
>
> One of the tables uses a plpgsql function as a
> check constraint. There is data in the table that
> passed the constraint.
>
> The problem comes when trying to restore the
> database using a file created by pg_dump.
>
> Some of the data required by the check function
> is being restored after the data being checked
> by the function and so it all fails the constraint.

Are you saying that the check function perform queries against other
data?  That might not be a good idea -- consider what happens if
the data changes: would changes invalidate records that had previously
passed the check but that wouldn't pass now if they were checked
again?  What steps are you taking to ensure integrity not only when
a record is inserted, but also when the data the record depends on
is changed?

> I have a small (200 lines) dump file which shows
> the problem if that will help.
>
> Is there a way to ensure that the checked data
> gets listed last in the dump file? Some other
> way?

It sounds like you've introduced a dependency that PostgreSQL doesn't
know about, so pg_dump doesn't know that certain data needs to be
restored first.  Perhaps you could use the custom dump format and
create an automated mechanism to reorder objects at restore time.
Another possibility might be to muck around with naming in an attempt
to get certain objects dumped first, although that might be subject
to breaking again in the future.  Yet another possibility would be
to perform multiple dumps using options like --table or --schema
and make sure you restore them in the proper order.  Maybe somebody
else can propose other possibilities.

Perhaps you should rethink using a check constraint that depends
on other data -- what's that check do?  Is it something that a
foreign key constraint couldn't handle?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: plpgsql constraint checked data fails to restore

От
John Burger
Дата:
I don't have personal experience with this, but I thought it was fairly
common to DEFER constraint checking until after a restore completed,
for exactly this reason.

- John Burger
   MITRE



Re: plpgsql constraint checked data fails to restore

От
Michael Fuhr
Дата:
On Mon, Jun 20, 2005 at 08:18:00AM -0400, John Burger wrote:
>
> I don't have personal experience with this, but I thought it was fairly
> common to DEFER constraint checking until after a restore completed,
> for exactly this reason.

Regarding DEFERRABLE, according to the CREATE TABLE documentation,
"Only foreign key constraints currently accept this clause.  All
other constraint types are not deferrable."

http://www.postgresql.org/docs/8.0/static/sql-createtable.html

pg_dump could defer adding CHECK constraints until after data is
restored, but currently it doesn't behave that way -- it usually
emits CHECK constraints with the CREATE TABLE statements, and I
don't see an option in the documentation to change that behavior.
Search the list archives for past discussion; for example:

http://archives.postgresql.org/pgsql-hackers/2004-03/msg00035.php
http://archives.postgresql.org/pgsql-hackers/2004-03/msg00755.php

It should be possible to use a trigger instead of a CHECK constraint
because pg_dump adds triggers after data is restored.  But without
careful design you still have the same integrity problem as with a
CHECK constraint: data is checked when it's inserted or updated,
but not when the referred-to data is modified.  For that you'd need
another constraint or trigger that knows how to check all the
dependent data.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: plpgsql constraint checked data fails to restore

От
"Lee Harr"
Дата:
>>Some of the data required by the check function
>>is being restored after the data being checked
>>by the function and so it all fails the constraint.
>
>Are you saying that the check function perform queries against other
>data?  That might not be a good idea -- consider what happens if
>the data changes: would changes invalidate records that had previously
>passed the check but that wouldn't pass now if they were checked
>again?

You ask some great questions. Thanks.

I think maybe I just got a little constraint-happy. The way I have
it, there is definitely a possibility for the other data to change
out from under the constraint. That can't be good.

Right now, I don't really see another way to check what I wanted
to check, so I am just going to remove the constraint.

When I get a few minutes, I will post my simplified example and
maybe someone will have a good idea.

Thanks again.

_________________________________________________________________
Express yourself instantly with MSN Messenger! Download today it's FREE!
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/


Re: plpgsql constraint checked data fails to restore

От
Jan Wieck
Дата:
On 6/20/2005 1:23 PM, Lee Harr wrote:

>>>Some of the data required by the check function
>>>is being restored after the data being checked
>>>by the function and so it all fails the constraint.
>>
>>Are you saying that the check function perform queries against other
>>data?  That might not be a good idea -- consider what happens if
>>the data changes: would changes invalidate records that had previously
>>passed the check but that wouldn't pass now if they were checked
>>again?
>
> You ask some great questions. Thanks.

But not the really important one :-)

>
> I think maybe I just got a little constraint-happy. The way I have
> it, there is definitely a possibility for the other data to change
> out from under the constraint. That can't be good.
>
> Right now, I don't really see another way to check what I wanted
> to check, so I am just going to remove the constraint.
>
> When I get a few minutes, I will post my simplified example and
> maybe someone will have a good idea.

The question I have is how exactly you manage to get the trigger fired
when restoring the dump. By default, the dump created by pg_dump will
create the table, fill in the data and create the trigger(s) only after
that. From that I conclude that you are taking a data-only dump and
restore the schema first either from a text file or a separate pg_dump
schema only.

If you do keep your schema in external ascii files and do data-only
dumps, you have to split the schema into table creation (without
constraints, indexes, etc.) and a second part that adds all the
constraints and indexes after the data is loaded.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: plpgsql constraint checked data fails to restore

От
Michael Fuhr
Дата:
On Thu, Jun 23, 2005 at 11:47:37AM -0400, Jan Wieck wrote:
> On 6/20/2005 1:23 PM, Lee Harr wrote:
> >
> >You ask some great questions. Thanks.
>
> But not the really important one :-)

Maybe that's because it didn't need asking :-)

> The question I have is how exactly you manage to get the trigger fired
> when restoring the dump. By default, the dump created by pg_dump will
> create the table, fill in the data and create the trigger(s) only after
> that.

Not true for CHECK constraints -- pg_dump creates them with the
CREATE TABLE statement:

CREATE TABLE foo (
    id  integer PRIMARY KEY
);

CREATE TABLE bar (
    fooid  integer NOT NULL REFERENCES foo,
    x      integer CHECK (x > 0)
);

INSERT INTO foo (id) VALUES (1);
INSERT INTO foo (id) VALUES (2);

INSERT INTO bar (fooid, x) VALUES (1, 2);
INSERT INTO bar (fooid, x) VALUES (2, 3);

pg_dump testdb

[...]

CREATE TABLE bar (
    fooid integer NOT NULL,
    x integer,
    CONSTRAINT bar_x_check CHECK ((x > 0))
);

[...]

CREATE TABLE foo (
    id integer NOT NULL
);

[...]

COPY bar (fooid, x) FROM stdin;
1       2
2       3
\.

[...]

COPY foo (id) FROM stdin;
1
2
\.

[...]

ALTER TABLE ONLY foo
    ADD CONSTRAINT foo_pkey PRIMARY KEY (id);

[...]

ALTER TABLE ONLY bar
    ADD CONSTRAINT bar_fooid_fkey FOREIGN KEY (fooid) REFERENCES foo(id);

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: plpgsql constraint checked data fails to restore

От
Jan Wieck
Дата:
Added pgsql-hackers
Added Bruce Momjian


On 6/23/2005 12:19 PM, Michael Fuhr wrote:
>> The question I have is how exactly you manage to get the trigger fired
>> when restoring the dump. By default, the dump created by pg_dump will
>> create the table, fill in the data and create the trigger(s) only after
>> that.
>
> Not true for CHECK constraints -- pg_dump creates them with the
> CREATE TABLE statement:

This is still true in 8.1's pg_dump, even though check constraints can
be added later. Even though it is bad practice to have functions that
rely on or even manipulate other objects in a CHECK constraint, I think
pg_dump should add the check constraints in the same manner as it does
triggers.

Bruce, do we have a TODO item for this?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: plpgsql constraint checked data fails to restore

От
Tom Lane
Дата:
Jan Wieck <JanWieck@Yahoo.com> writes:
> I think pg_dump should add the check constraints in the same manner as
> it does triggers.

> Bruce, do we have a TODO item for this?

No, because that idea has been proposed and rejected before --- it adds
overhead (extra table scans) and reduces readability of the SQL dump,
in order to "support" a programming technique that will never really
work correctly anyway.  A CHECK constraint that depends on anything more
than the content of the row being checked is simply wrong.

Essentially what we implement is what SQL92 calls the "intermediate"
level of CHECK support:

         1) The following restrictions apply for Intermediate SQL:

            a) The <search condition> contained in a <check constraint defi-
              nition> shall not contain a <subquery>.

regression=# create table bbb(f2 int check (f2 in (select f1 from aaa)));
ERROR:  cannot use subquery in check constraint

Of course, a function call that executes a query internally is simply
a cheat to try to bypass this restriction; the fact that we don't catch
you cheating doesn't mean we promise it will work.  The function call
is a lot worse, in fact, because there is no way pg_dump can even detect
the data dependency, and thus no way to know when it is safe to add the
check constraint.

There is no point in changing the behavior of pg_dump until and unless
we improve the handling of CHECK constraints to support subqueries ---
which is more or less the same thing as supporting database-wide
ASSERTIONs, and I don't know of anybody even thinking of working on that.

            regards, tom lane