Обсуждение: pg_restore restores out of order

Поиск
Список
Период
Сортировка

pg_restore restores out of order

От
Kevin Brannen
Дата:
I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last things I'm verifying is backup/restore and
it'sfailing for no reason I can figure out. So I'm looking for pointers on this. 

If it matters, the code is from the version 12.2 from the Pg site, RPMs for Centos 6 (.10).

The backup is made like:

# $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms --schema=public

As far as I can tell, it worked well as the result is about the right size, exit code was at least 0. I then go change
acouple of things so I know if the restore works or not (i.e. they should disappear). 

Restore looks something like:

# cd $exp
# echo "
ALTER SCHEMA public RENAME TO savepublic;
CREATE SCHEMA public AUTHORIZATION nmsroot;
" | $PGPATH/psql -d nms
# /usr/pgsql-12/bin/pg_restore --jobs=2 --dbname=nms --schema=public .

So we move the current schema to the side just in case something goes wrong and we need to move it back, create an
emptyschema for it, then restore into that schema. Then it goes bad... 

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers nmsroot
pg_restore: error: COPY failed for table "subscribers": ERROR:  relation "su_profiles" does not exist
LINE 1: SELECT srvc_data           FROM su_profiles WHERE su_profile...
                                        ^
QUERY:  SELECT srvc_data           FROM su_profiles WHERE su_profile_pk = p_profile_fk
CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character varying) line 7 at SQL statement
COPY subscribers, line 1: "61   1002    \N      SU_1002 t       \N      4       \N      1       250     0
2015-06-2216:56:27.79333+00     nmsmgr  \N      \N      \N" 

And it goes further downhill with missing FK's, etc in other tables later because of the missing data. The exit code is
definitelynot 0. 

I've tried removing the "--jobs=2" so it'll go single threaded -- it still fails. I also did a new dump with the
defaultplain text dump, and the restore with that also failed in the same way. Checking out the backup itself, I get: 

# pg_restore -Fd . --list | egrep 'su_profile|subscriber' > k1

254; 1259 23653 TABLE public subscribers nmsroot
335; 1259 24222 TABLE public su_profiles nmsroot
5442; 2604 18650 DEFAULT public su_profiles su_profile_pk nmsroot
5303; 2604 18651 DEFAULT public subscribers subscriber_pk nmsroot
6437; 0 24222 TABLE DATA public su_profiles nmsroot
6356; 0 23653 TABLE DATA public subscribers nmsroot
....... constraints, index, triggers, FK -- down here

I've removed FUNCTIONS, VIEWS, & SEQ objects from the list. That list is in order that is in the file. I suppose
creatingthe tables in that order is fine, but there is a FK constraint in subscribers pointing to su_profiles, so that
orderdoes worry me a little, except all the FK stuff is after the data load so I really should be fine. The data looks
likeit'll be in the right order (at least for --jobs=1, yet that failed too). 

I don't see anything extra in the Pg error log either.

I don't think I'm running into any issues with "other DB objects being missing". Looking at the release notes, I see
therewere some changes for 12.1 with parallel restores, but I can't tell if that's affecting me or not. I know that the
publicschema & search_path changed for dump & restore in 10.3, but this doesn't look like that's a problem here. I've
rereadthe doc pages on pg_dump and pg_restore and don't see anything that look like it'd be bad for me. 

Can anyone see anything that looks wrong? Suggest something else to try or look at?

Kevin
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential
information.If you are not the intended recipient, or a person responsible for delivering it to the intended recipient,
youare hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or
attachedto this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately
notifyus by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them
todisk. Thank you. 



Re: pg_restore restores out of order

От
Adrian Klaver
Дата:
On 3/11/20 2:46 PM, Kevin Brannen wrote:
> I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last things I'm verifying is backup/restore and
it'sfailing for no reason I can figure out. So I'm looking for pointers on this.
 
> 
> If it matters, the code is from the version 12.2 from the Pg site, RPMs for Centos 6 (.10).
> 
> The backup is made like:
> 
> # $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms --schema=public

Which version of pg_dump 9.6.5 or 12.2?

More below.
> 
> As far as I can tell, it worked well as the result is about the right size, exit code was at least 0. I then go
changea couple of things so I know if the restore works or not (i.e. they should disappear).
 
> 
> Restore looks something like:
> 
> # cd $exp
> # echo "
> ALTER SCHEMA public RENAME TO savepublic;
> CREATE SCHEMA public AUTHORIZATION nmsroot;
> " | $PGPATH/psql -d nms
> # /usr/pgsql-12/bin/pg_restore --jobs=2 --dbname=nms --schema=public .
> 
> So we move the current schema to the side just in case something goes wrong and we need to move it back, create an
emptyschema for it, then restore into that schema. Then it goes bad...
 
> 
> pg_restore: while PROCESSING TOC:
> pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers nmsroot
> pg_restore: error: COPY failed for table "subscribers": ERROR:  relation "su_profiles" does not exist
> LINE 1: SELECT srvc_data           FROM su_profiles WHERE su_profile...
>                                          ^
> QUERY:  SELECT srvc_data           FROM su_profiles WHERE su_profile_pk = p_profile_fk
> CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character varying) line 7 at SQL statement
> COPY subscribers, line 1: "61   1002    \N      SU_1002 t       \N      4       \N      1       250     0
2015-06-2216:56:27.79333+00     nmsmgr  \N      \N      \N"
 

What is in public.check_data_ip(integer,integer,character varying)?

Is it a trigger function on subscribers?

> 
> And it goes further downhill with missing FK's, etc in other tables later because of the missing data. The exit code
isdefinitely not 0.
 
> 
> I've tried removing the "--jobs=2" so it'll go single threaded -- it still fails. I also did a new dump with the
defaultplain text dump, and the restore with that also failed in the same way. Checking out the backup itself, I get:
 
> 
> # pg_restore -Fd . --list | egrep 'su_profile|subscriber' > k1
> 
> 254; 1259 23653 TABLE public subscribers nmsroot
> 335; 1259 24222 TABLE public su_profiles nmsroot
> 5442; 2604 18650 DEFAULT public su_profiles su_profile_pk nmsroot
> 5303; 2604 18651 DEFAULT public subscribers subscriber_pk nmsroot
> 6437; 0 24222 TABLE DATA public su_profiles nmsroot
> 6356; 0 23653 TABLE DATA public subscribers nmsroot
> ....... constraints, index, triggers, FK -- down here
> 
> I've removed FUNCTIONS, VIEWS, & SEQ objects from the list. That list is in order that is in the file. I suppose
creatingthe tables in that order is fine, but there is a FK constraint in subscribers pointing to su_profiles, so that
orderdoes worry me a little, except all the FK stuff is after the data load so I really should be fine. The data looks
likeit'll be in the right order (at least for --jobs=1, yet that failed too).
 
> 
> I don't see anything extra in the Pg error log either.
> 
> I don't think I'm running into any issues with "other DB objects being missing". Looking at the release notes, I see
therewere some changes for 12.1 with parallel restores, but I can't tell if that's affecting me or not. I know that the
publicschema & search_path changed for dump & restore in 10.3, but this doesn't look like that's a problem here. I've
rereadthe doc pages on pg_dump and pg_restore and don't see anything that look like it'd be bad for me.
 
> 
> Can anyone see anything that looks wrong? Suggest something else to try or look at?
> 
> Kevin
> This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain
confidentialinformation. If you are not the intended recipient, or a person responsible for delivering it to the
intendedrecipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the
informationcontained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in
error,please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without
readingthem or saving them to disk. Thank you.
 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: pg_restore restores out of order

От
Kevin Brannen
Дата:
>Adrian Klaver wrote:
>On 3/11/20 2:46 PM, Kevin Brannen wrote:
>> I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last things I'm verifying is backup/restore
andit's failing for no reason I can figure out. So I'm looking for pointers on this.
 
>>
>> If it matters, the code is from the version 12.2 from the Pg site, RPMs for Centos 6 (.10).
>>
>> The backup is made like:
>>
>> # $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms
>> --schema=public
>
>Which version of pg_dump 9.6.5 or 12.2?

Both pg_dump and pg_restore are with 12.2, on an already converted 12.2 DB.
So I'm 12.2 all the way on my test system by this point. :)

>>...
>> So we move the current schema to the side just in case something goes wrong and we need to move it back, create an
emptyschema for it, then restore into that schema. Then it goes bad...
 
>>
>> pg_restore: while PROCESSING TOC:
>> pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers
>> nmsroot
>> pg_restore: error: COPY failed for table "subscribers": ERROR:  relation "su_profiles" does not exist
>> LINE 1: SELECT srvc_data           FROM su_profiles WHERE su_profile...
>>                                          ^
>> QUERY:  SELECT srvc_data           FROM su_profiles WHERE su_profile_pk = p_profile_fk
>> CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character varying) line 7 at SQL statement
>> COPY subscribers, line 1: "61   1002    \N      SU_1002 t       \N      4       \N      1       250     0
2015-06-2216:56:27.79333+00     nmsmgr  \N      \N      \N"
 
>
>What is in public.check_data_ip(integer,integer,character varying)?
>
>Is it a trigger function on subscribers?

Not quite...

nms=# \d subscribers
                       Table "public.subscribers"
...
Check constraints:
    "chk_su_data_ip" CHECK (check_data_ip(profile_fk, unit_id, data_ip))


The first line of the check_data_ip() function is the offender:

SELECT srvc_data INTO data
FROM su_profiles
WHERE su_profile_pk = p_profile_fk;

Hmm, why is this a problem now and not before?

(Probably the usual reason of code "tightening" as we go forward, or so I'd guess.)

I don't see any option that mentions "CHECK" in the pg_restore doc in regards to
possibly turning it off...more research...

And oh, this hurts. :( From the docs on CHECK constraints:

PostgreSQL does not support CHECK constraints that reference table data other
than the new or updated row being checked. While a CHECK constraint that
violates this rule may appear to work in simple tests, it cannot guarantee
that the database will not reach a state in which the constraint condition
is false (due to subsequent changes of the other row(s) involved). This
would cause a database dump and reload to fail. ...

It goes on to say a trigger is the right way to do this. So yeah, a "tightening
of the code" seems to be hitting me because of bad code from the past.

Problem identified and I'll solve it tomorrow.

Thank you so much Adrian for helping me to figure this out!

Kevin

.
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential
information.If you are not the intended recipient, or a person responsible for delivering it to the intended recipient,
youare hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or
attachedto this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately
notifyus by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them
todisk. Thank you.
 

Re: pg_restore restores out of order

От
Adrian Klaver
Дата:
On 3/11/20 4:11 PM, Kevin Brannen wrote:
>> Adrian Klaver wrote:
>> On 3/11/20 2:46 PM, Kevin Brannen wrote:
>>> I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last things I'm verifying is backup/restore
andit's failing for no reason I can figure out. So I'm looking for pointers on this.
 
>>>
>>> If it matters, the code is from the version 12.2 from the Pg site, RPMs for Centos 6 (.10).
>>>
>>> The backup is made like:
>>>
>>> # $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms
>>> --schema=public
>>
>> Which version of pg_dump 9.6.5 or 12.2?
> 
> Both pg_dump and pg_restore are with 12.2, on an already converted 12.2 DB.
> So I'm 12.2 all the way on my test system by this point. :)
> 
>>> ...
>>> So we move the current schema to the side just in case something goes wrong and we need to move it back, create an
emptyschema for it, then restore into that schema. Then it goes bad...
 
>>>
>>> pg_restore: while PROCESSING TOC:
>>> pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers
>>> nmsroot
>>> pg_restore: error: COPY failed for table "subscribers": ERROR:  relation "su_profiles" does not exist
>>> LINE 1: SELECT srvc_data           FROM su_profiles WHERE su_profile...
>>>                                           ^
>>> QUERY:  SELECT srvc_data           FROM su_profiles WHERE su_profile_pk = p_profile_fk
>>> CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character varying) line 7 at SQL statement
>>> COPY subscribers, line 1: "61   1002    \N      SU_1002 t       \N      4       \N      1       250     0
2015-06-2216:56:27.79333+00     nmsmgr  \N      \N      \N"
 
>>
>> What is in public.check_data_ip(integer,integer,character varying)?
>>
>> Is it a trigger function on subscribers?
> 
> Not quite...
> 
> nms=# \d subscribers
>                         Table "public.subscribers"
> ...
> Check constraints:
>      "chk_su_data_ip" CHECK (check_data_ip(profile_fk, unit_id, data_ip))
> 
> 
> The first line of the check_data_ip() function is the offender:
> 
> SELECT srvc_data INTO data
> FROM su_profiles
> WHERE su_profile_pk = p_profile_fk;
> 
> Hmm, why is this a problem now and not before?
> 
> (Probably the usual reason of code "tightening" as we go forward, or so I'd guess.)
> 
> I don't see any option that mentions "CHECK" in the pg_restore doc in regards to
> possibly turning it off...more research...
> 
> And oh, this hurts. :( From the docs on CHECK constraints:
> 
> PostgreSQL does not support CHECK constraints that reference table data other
> than the new or updated row being checked. While a CHECK constraint that
> violates this rule may appear to work in simple tests, it cannot guarantee
> that the database will not reach a state in which the constraint condition
> is false (due to subsequent changes of the other row(s) involved). This
> would cause a database dump and reload to fail. ...
> 
> It goes on to say a trigger is the right way to do this. So yeah, a "tightening
> of the code" seems to be hitting me because of bad code from the past.
> 
> Problem identified and I'll solve it tomorrow.

Glad it was solved.

However, looking a gift horse in the mouth, did this not error when 
moving from the 9.6.5 instance to the first 12.2 instance?

Or

Was the constraint added in the first 12.2 instance?

> 
> Thank you so much Adrian for helping me to figure this out!
> 
> Kevin


-- 
Adrian Klaver
adrian.klaver@aklaver.com



RE: pg_restore restores out of order

От
Kevin Brannen
Дата:
>Adrian Klaver wrote:
>> Kevin Brannen wrote:
>> It goes on to say a trigger is the right way to do this. So yeah, a
>> "tightening of the code" seems to be hitting me because of bad code from the past.
>>
>> Problem identified and I'll solve it tomorrow.
>
>Glad it was solved.
>
>However, looking a gift horse in the mouth, did this not error when moving from the 9.6.5 instance to the first 12.2
instance?
>
>Or
>
>Was the constraint added in the first 12.2 instance?

This is the first instance. :) I've now got our entire code base working
on the latest version of Perl and Postgresql (they have to go at the same
time because of DBI/DBD) and I'm going thru and making sure everything
works before committing to our integration branch for further testing. I've
solved a few other upgrade issues like some names changing in pg_catalog
and moving to the community RPMs instead of compiling our own. The problem
in this thread has been the only hard one and fortunately for me it's
the last (or I think it is).

Unless we have a significant reason, upgrading is hard enough we only
upgrade about every other year, or that's my plan going forward. Waiting
3 years like we did this time makes for a larger jump than I'd like.

Thanks again for the help!
Kevin

.
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential
information.If you are not the intended recipient, or a person responsible for delivering it to the intended recipient,
youare hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or
attachedto this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately
notifyus by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them
todisk. Thank you.