Обсуждение: plpgsql constraint checked data fails to restore
I have a database running 8.0.1 One of the tables uses a plpgsql function as a check constraint. There is data in the table that passed the constraint. The problem comes when trying to restore the database using a file created by pg_dump. Some of the data required by the check function is being restored after the data being checked by the function and so it all fails the constraint. I have a small (200 lines) dump file which shows the problem if that will help. Is there a way to ensure that the checked data gets listed last in the dump file? Some other way? I actually have this database deployed on 7.4.x so a fix that also works for that series would be ideal... Thanks for your time. _________________________________________________________________ Don't just search. Find. Check out the new MSN Search! http://search.msn.click-url.com/go/onm00200636ave/direct/01/
On Mon, Jun 20, 2005 at 05:49:05AM +0430, Lee Harr wrote: > > I have a database running 8.0.1 > > One of the tables uses a plpgsql function as a > check constraint. There is data in the table that > passed the constraint. > > The problem comes when trying to restore the > database using a file created by pg_dump. > > Some of the data required by the check function > is being restored after the data being checked > by the function and so it all fails the constraint. Are you saying that the check function perform queries against other data? That might not be a good idea -- consider what happens if the data changes: would changes invalidate records that had previously passed the check but that wouldn't pass now if they were checked again? What steps are you taking to ensure integrity not only when a record is inserted, but also when the data the record depends on is changed? > I have a small (200 lines) dump file which shows > the problem if that will help. > > Is there a way to ensure that the checked data > gets listed last in the dump file? Some other > way? It sounds like you've introduced a dependency that PostgreSQL doesn't know about, so pg_dump doesn't know that certain data needs to be restored first. Perhaps you could use the custom dump format and create an automated mechanism to reorder objects at restore time. Another possibility might be to muck around with naming in an attempt to get certain objects dumped first, although that might be subject to breaking again in the future. Yet another possibility would be to perform multiple dumps using options like --table or --schema and make sure you restore them in the proper order. Maybe somebody else can propose other possibilities. Perhaps you should rethink using a check constraint that depends on other data -- what's that check do? Is it something that a foreign key constraint couldn't handle? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
I don't have personal experience with this, but I thought it was fairly common to DEFER constraint checking until after a restore completed, for exactly this reason. - John Burger MITRE
On Mon, Jun 20, 2005 at 08:18:00AM -0400, John Burger wrote: > > I don't have personal experience with this, but I thought it was fairly > common to DEFER constraint checking until after a restore completed, > for exactly this reason. Regarding DEFERRABLE, according to the CREATE TABLE documentation, "Only foreign key constraints currently accept this clause. All other constraint types are not deferrable." http://www.postgresql.org/docs/8.0/static/sql-createtable.html pg_dump could defer adding CHECK constraints until after data is restored, but currently it doesn't behave that way -- it usually emits CHECK constraints with the CREATE TABLE statements, and I don't see an option in the documentation to change that behavior. Search the list archives for past discussion; for example: http://archives.postgresql.org/pgsql-hackers/2004-03/msg00035.php http://archives.postgresql.org/pgsql-hackers/2004-03/msg00755.php It should be possible to use a trigger instead of a CHECK constraint because pg_dump adds triggers after data is restored. But without careful design you still have the same integrity problem as with a CHECK constraint: data is checked when it's inserted or updated, but not when the referred-to data is modified. For that you'd need another constraint or trigger that knows how to check all the dependent data. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
>>Some of the data required by the check function >>is being restored after the data being checked >>by the function and so it all fails the constraint. > >Are you saying that the check function perform queries against other >data? That might not be a good idea -- consider what happens if >the data changes: would changes invalidate records that had previously >passed the check but that wouldn't pass now if they were checked >again? You ask some great questions. Thanks. I think maybe I just got a little constraint-happy. The way I have it, there is definitely a possibility for the other data to change out from under the constraint. That can't be good. Right now, I don't really see another way to check what I wanted to check, so I am just going to remove the constraint. When I get a few minutes, I will post my simplified example and maybe someone will have a good idea. Thanks again. _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/
On 6/20/2005 1:23 PM, Lee Harr wrote: >>>Some of the data required by the check function >>>is being restored after the data being checked >>>by the function and so it all fails the constraint. >> >>Are you saying that the check function perform queries against other >>data? That might not be a good idea -- consider what happens if >>the data changes: would changes invalidate records that had previously >>passed the check but that wouldn't pass now if they were checked >>again? > > You ask some great questions. Thanks. But not the really important one :-) > > I think maybe I just got a little constraint-happy. The way I have > it, there is definitely a possibility for the other data to change > out from under the constraint. That can't be good. > > Right now, I don't really see another way to check what I wanted > to check, so I am just going to remove the constraint. > > When I get a few minutes, I will post my simplified example and > maybe someone will have a good idea. The question I have is how exactly you manage to get the trigger fired when restoring the dump. By default, the dump created by pg_dump will create the table, fill in the data and create the trigger(s) only after that. From that I conclude that you are taking a data-only dump and restore the schema first either from a text file or a separate pg_dump schema only. If you do keep your schema in external ascii files and do data-only dumps, you have to split the schema into table creation (without constraints, indexes, etc.) and a second part that adds all the constraints and indexes after the data is loaded. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Thu, Jun 23, 2005 at 11:47:37AM -0400, Jan Wieck wrote: > On 6/20/2005 1:23 PM, Lee Harr wrote: > > > >You ask some great questions. Thanks. > > But not the really important one :-) Maybe that's because it didn't need asking :-) > The question I have is how exactly you manage to get the trigger fired > when restoring the dump. By default, the dump created by pg_dump will > create the table, fill in the data and create the trigger(s) only after > that. Not true for CHECK constraints -- pg_dump creates them with the CREATE TABLE statement: CREATE TABLE foo ( id integer PRIMARY KEY ); CREATE TABLE bar ( fooid integer NOT NULL REFERENCES foo, x integer CHECK (x > 0) ); INSERT INTO foo (id) VALUES (1); INSERT INTO foo (id) VALUES (2); INSERT INTO bar (fooid, x) VALUES (1, 2); INSERT INTO bar (fooid, x) VALUES (2, 3); pg_dump testdb [...] CREATE TABLE bar ( fooid integer NOT NULL, x integer, CONSTRAINT bar_x_check CHECK ((x > 0)) ); [...] CREATE TABLE foo ( id integer NOT NULL ); [...] COPY bar (fooid, x) FROM stdin; 1 2 2 3 \. [...] COPY foo (id) FROM stdin; 1 2 \. [...] ALTER TABLE ONLY foo ADD CONSTRAINT foo_pkey PRIMARY KEY (id); [...] ALTER TABLE ONLY bar ADD CONSTRAINT bar_fooid_fkey FOREIGN KEY (fooid) REFERENCES foo(id); -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Added pgsql-hackers Added Bruce Momjian On 6/23/2005 12:19 PM, Michael Fuhr wrote: >> The question I have is how exactly you manage to get the trigger fired >> when restoring the dump. By default, the dump created by pg_dump will >> create the table, fill in the data and create the trigger(s) only after >> that. > > Not true for CHECK constraints -- pg_dump creates them with the > CREATE TABLE statement: This is still true in 8.1's pg_dump, even though check constraints can be added later. Even though it is bad practice to have functions that rely on or even manipulate other objects in a CHECK constraint, I think pg_dump should add the check constraints in the same manner as it does triggers. Bruce, do we have a TODO item for this? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Jan Wieck <JanWieck@Yahoo.com> writes: > I think pg_dump should add the check constraints in the same manner as > it does triggers. > Bruce, do we have a TODO item for this? No, because that idea has been proposed and rejected before --- it adds overhead (extra table scans) and reduces readability of the SQL dump, in order to "support" a programming technique that will never really work correctly anyway. A CHECK constraint that depends on anything more than the content of the row being checked is simply wrong. Essentially what we implement is what SQL92 calls the "intermediate" level of CHECK support: 1) The following restrictions apply for Intermediate SQL: a) The <search condition> contained in a <check constraint defi- nition> shall not contain a <subquery>. regression=# create table bbb(f2 int check (f2 in (select f1 from aaa))); ERROR: cannot use subquery in check constraint Of course, a function call that executes a query internally is simply a cheat to try to bypass this restriction; the fact that we don't catch you cheating doesn't mean we promise it will work. The function call is a lot worse, in fact, because there is no way pg_dump can even detect the data dependency, and thus no way to know when it is safe to add the check constraint. There is no point in changing the behavior of pg_dump until and unless we improve the handling of CHECK constraints to support subqueries --- which is more or less the same thing as supporting database-wide ASSERTIONs, and I don't know of anybody even thinking of working on that. regards, tom lane