Обсуждение: Re: pg_upgrade from 9.3 to 9.4 fails

Поиск
Список
Период
Сортировка

Re: pg_upgrade from 9.3 to 9.4 fails

От
"Guo, Yun"
Дата:


From: Yun <yguo@cvent.com>
Date: Monday, October 5, 2015 at 12:19 PM
To: "pgsql-admin-owner@postgresql.org" <pgsql-admin-owner@postgresql.org>
Subject: pg_upgrade from 9.3 to 9.4 fails

Hi,

I’m trying to use pg_upgrade to upgrade 9.3 to 9.4. And I observed weird behavior, that if I have the old 9.3 server up and running the check would pass, but if I shut it down it would fail.  Any idea why would it behave this way? Is there a workaround for this issue?

With old 9.3 running:
-bash-4.1$ ps -ef | grep pg.old
postgres 16243     1  0 16:12 pts/0    00:00:00 /usr/pgsql-9.3/bin/postgres -D /data/pg.old
postgres 16261 15160  0 16:13 pts/0    00:00:00 grep pg.old
-bash-4.1$ pg_upgrade -b /usr/pgsql-9.3/bin -B /usr/pgsql-9.4/bin -d /data/pg.old -D /data/pg --check

*failure*
Consult the last few lines of "pg_upgrade_server.log" for
the probable cause of the failure.
Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
Checking database user is a superuser                       ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for invalid "line" user columns                    ok
Checking for presence of required libraries                 ok
Checking database user is a superuser                       ok
Checking for prepared transactions                          ok

*Clusters are compatible*

When 9.3 is shut down:

-bash-4.1$ /usr/pgsql-9.3/bin/pg_ctl stop -D /data/pg.old
waiting for server to shut down.... done
server stopped
-bash-4.1$ pg_upgrade -b /usr/pgsql-9.3/bin -B /usr/pgsql-9.4/bin -d /data/pg.old -D /data/pg --check
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
SQL command failed
CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON c.relnamespace = n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid WHERE relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM false AND  i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_' AND     n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog', 'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)   OR (n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject', 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata', 'pg_largeobject_metadata_oid_index') ));
ERROR:  cache lookup failed for type 1670699

Failure, exiting



Re: pg_upgrade from 9.3 to 9.4 fails

От
Bruce Momjian
Дата:
On Mon, Oct  5, 2015 at 05:05:46PM +0000, Guo, Yun wrote:
> -bash-4.1$ /usr/pgsql-9.3/bin/pg_ctl stop -D /data/pg.old
> waiting for server to shut down.... done
> server stopped
> -bash-4.1$ pg_upgrade -b /usr/pgsql-9.3/bin -B /usr/pgsql-9.4/bin -d /data/
> pg.old -D /data/pg --check
> Performing Consistency Checks
> -----------------------------
> Checking cluster versions                                   ok
> SQL command failed
> CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
> pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON c.relnamespace =
> n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid WHERE
> relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM false AND
>  i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_' AND
> n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
> 'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)   OR
> (n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
> 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
> 'pg_largeobject_metadata_oid_index') ));
> ERROR:  cache lookup failed for type 1670699

Wow, that is weird.  Can you run this query on the old cluster and show
us the output?

    SELECT * FROM pg_type WHERE oid = 1670699;

This query doesn't even query pg_type, so it must be some internal use
of pg_type.

The reason check doesn't show the failure is that only a non-check run
collects pg_class.oid values, but we never expect that to fail so we
don't test it in check mode.

My guess is that something is messed up in your system catalogs.  Can
you try running this query in each old database and see if it fails.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +


Re: pg_upgrade from 9.3 to 9.4 fails

От
"Guo, Yun"
Дата:

On 10/5/15, 1:24 PM, "Bruce Momjian" <bruce@momjian.us> wrote:

>On Mon, Oct  5, 2015 at 05:05:46PM +0000, Guo, Yun wrote:
>> -bash-4.1$ /usr/pgsql-9.3/bin/pg_ctl stop -D /data/pg.old
>> waiting for server to shut down.... done
>> server stopped
>> -bash-4.1$ pg_upgrade -b /usr/pgsql-9.3/bin -B /usr/pgsql-9.4/bin -d
>>/data/
>> pg.old -D /data/pg --check
>> Performing Consistency Checks
>> -----------------------------
>> Checking cluster versions                                   ok
>> SQL command failed
>> CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
>> pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON
>>c.relnamespace =
>> n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid
>>WHERE
>> relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM
>>false AND
>>  i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_'
>>AND
>> n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
>> 'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)
>>  OR
>> (n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
>> 'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
>> 'pg_largeobject_metadata_oid_index') ));
>> ERROR:  cache lookup failed for type 1670699
>
>Wow, that is weird.  Can you run this query on the old cluster and show
>us the output?
>
>    SELECT * FROM pg_type WHERE oid = 1670699;

This turns out to be empty in all of the databases:
postgres=# SELECT * FROM pg_type WHERE oid = 1670699;
 typname | typnamespace | typowner | typlen | typbyval | typtype |
typcategory | typispreferred | typisdefined | typdelim | typrelid | typ
elem | typarray | typinput | typoutput | typreceive | typsend | typmodin |
typmodout | typanalyze | typalign | typstorage | typnotnull | t
ypbasetype | typtypmod | typndims | typcollation | typdefaultbin |
typdefault | typacl
---------+--------------+----------+--------+----------+---------+---------
----+----------------+--------------+----------+----------+----
-----+----------+----------+-----------+------------+---------+----------+-
----------+------------+----------+------------+------------+--
-----------+-----------+----------+--------------+---------------+---------
---+--------
(0 rows)


>
>This query doesn't even query pg_type, so it must be some internal use
>of pg_type.
>
>The reason check doesn't show the failure is that only a non-check run
>collects pg_class.oid values, but we never expect that to fail so we
>don't test it in check mode.
>
>My guess is that something is messed up in your system catalogs.  Can
>you try running this query in each old database and see if it fails.

If my old server system catalog is messed up is there way to repair it?

>
>--
>  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
>+ As you are, so once was I. As I am, so you will be. +
>+ Roman grave inscription                             +



Re: pg_upgrade from 9.3 to 9.4 fails

От
Bruce Momjian
Дата:
On Mon, Oct  5, 2015 at 06:00:27PM +0000, Guo, Yun wrote:
> >Wow, that is weird.  Can you run this query on the old cluster and show
> >us the output?
> >
> >    SELECT * FROM pg_type WHERE oid = 1670699;
>
> This turns out to be empty in all of the databases:
> postgres=# SELECT * FROM pg_type WHERE oid = 1670699;
>  typname | typnamespace | typowner | typlen | typbyval | typtype |
> typcategory | typispreferred | typisdefined | typdelim | typrelid | typ
> elem | typarray | typinput | typoutput | typreceive | typsend | typmodin |
> typmodout | typanalyze | typalign | typstorage | typnotnull | t
> ypbasetype | typtypmod | typndims | typcollation | typdefaultbin |
> typdefault | typacl
> ---------+--------------+----------+--------+----------+---------+---------
> ----+----------------+--------------+----------+----------+----
> -----+----------+----------+-----------+------------+---------+----------+-
> ----------+------------+----------+------------+------------+--
> -----------+-----------+----------+--------------+---------------+---------
> ---+--------
> (0 rows)

OK, try running the error query in all the databases then.

> >This query doesn't even query pg_type, so it must be some internal use
> >of pg_type.
> >
> >The reason check doesn't show the failure is that only a non-check run
> >collects pg_class.oid values, but we never expect that to fail so we
> >don't test it in check mode.
> >
> >My guess is that something is messed up in your system catalogs.  Can
> >you try running this query in each old database and see if it fails.
>
> If my old server system catalog is messed up is there way to repair it?

Usually, once we find the cause.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +


Re: pg_upgrade from 9.3 to 9.4 fails

От
"Guo, Yun"
Дата:

On 10/5/15, 2:02 PM, "Bruce Momjian" <bruce@momjian.us> wrote:

>On Mon, Oct  5, 2015 at 06:00:27PM +0000, Guo, Yun wrote:
>> >Wow, that is weird.  Can you run this query on the old cluster and show
>> >us the output?
>> >
>> >    SELECT * FROM pg_type WHERE oid = 1670699;
>>
>> This turns out to be empty in all of the databases:
>> postgres=# SELECT * FROM pg_type WHERE oid = 1670699;
>>  typname | typnamespace | typowner | typlen | typbyval | typtype |
>> typcategory | typispreferred | typisdefined | typdelim | typrelid | typ
>> elem | typarray | typinput | typoutput | typreceive | typsend |
>>typmodin |
>> typmodout | typanalyze | typalign | typstorage | typnotnull | t
>> ypbasetype | typtypmod | typndims | typcollation | typdefaultbin |
>> typdefault | typacl
>>
>>---------+--------------+----------+--------+----------+---------+-------
>>--
>> ----+----------------+--------------+----------+----------+----
>>
>>-----+----------+----------+-----------+------------+---------+----------
>>+-
>> ----------+------------+----------+------------+------------+--
>>
>>-----------+-----------+----------+--------------+---------------+-------
>>--
>> ---+--------
>> (0 rows)
>
>OK, try running the error query in all the databases then.

I tried it all databases. None of them has oid 1670699 in pg_type.

>
>> >This query doesn't even query pg_type, so it must be some internal use
>> >of pg_type.
>> >
>> >The reason check doesn't show the failure is that only a non-check run
>> >collects pg_class.oid values, but we never expect that to fail so we
>> >don't test it in check mode.
>> >
>> >My guess is that something is messed up in your system catalogs.  Can
>> >you try running this query in each old database and see if it fails.
>>
>> If my old server system catalog is messed up is there way to repair it?
>
>Usually, once we find the cause.
>
>--
>  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
>+ As you are, so once was I. As I am, so you will be. +
>+ Roman grave inscription                             +



Re: pg_upgrade from 9.3 to 9.4 fails

От
Bruce Momjian
Дата:
On Mon, Oct  5, 2015 at 06:04:13PM +0000, Guo, Yun wrote:
>
>
> On 10/5/15, 2:02 PM, "Bruce Momjian" <bruce@momjian.us> wrote:
>
> >On Mon, Oct  5, 2015 at 06:00:27PM +0000, Guo, Yun wrote:
> >> >Wow, that is weird.  Can you run this query on the old cluster and show
> >> >us the output?
> >> >
> >> >    SELECT * FROM pg_type WHERE oid = 1670699;
> >OK, try running the error query in all the databases then.
>
> I tried it all databases. None of them has oid 1670699 in pg_type.

Yes, I know that.  I want you to run the query that generated the error
in each database and figure out which database has the problem, see
below.  My guess is that database references a datatype that doesn't
exist anymore, for some reason.

---------------------------------------------------------------------------

CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON c.relnamespace =
n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid WHERE
relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM false AND
 i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_' AND
n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)   OR
(n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
'pg_largeobject_metadata_oid_index') ));

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +


Re: pg_upgrade from 9.3 to 9.4 fails

От
"Guo, Yun"
Дата:

On 10/5/15, 2:16 PM, "Bruce Momjian" <bruce@momjian.us> wrote:

>On Mon, Oct  5, 2015 at 06:04:13PM +0000, Guo, Yun wrote:
>>
>>
>> On 10/5/15, 2:02 PM, "Bruce Momjian" <bruce@momjian.us> wrote:
>>
>> >On Mon, Oct  5, 2015 at 06:00:27PM +0000, Guo, Yun wrote:
>> >> >Wow, that is weird.  Can you run this query on the old cluster and
>>show
>> >> >us the output?
>> >> >
>> >> >    SELECT * FROM pg_type WHERE oid = 1670699;
>> >OK, try running the error query in all the databases then.
>>
>> I tried it all databases. None of them has oid 1670699 in pg_type.
>
>Yes, I know that.  I want you to run the query that generated the error
>in each database and figure out which database has the problem, see
>below.  My guess is that database references a datatype that doesn't
>exist anymore, for some reason.
>
>--------------------------------------------------------------------------
>-
>
>CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
>pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON c.relnamespace =
>n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid
>WHERE
>relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM false
>AND
> i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_' AND
>n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
>'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)
>OR
>(n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
>'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
>'pg_largeobject_metadata_oid_index') ));

Sorry I misunderstood you instruction. I tried this query in all databases
and it executes without any error.
I guess that¹s the reason when I have the old server running the check
could pass.
However, somehow when the old server is hut down it would fail.

>
>--
>  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
>+ As you are, so once was I. As I am, so you will be. +
>+ Roman grave inscription                             +



Re: pg_upgrade from 9.3 to 9.4 fails

От
Bruce Momjian
Дата:
On Mon, Oct  5, 2015 at 06:23:43PM +0000, Guo, Yun wrote:
> >CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
> >pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON c.relnamespace =
> >n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid
> >WHERE
> >relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM false
> >AND
> > i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_' AND
> >n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
> >'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)
> >OR
> >(n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
> >'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
> >'pg_largeobject_metadata_oid_index') ));
>
> Sorry I misunderstood you instruction. I tried this query in all databases
> and it executes without any error.
> I guess that¹s the reason when I have the old server running the check
> could pass.
> However, somehow when the old server is hut down it would fail.

Wow, that is really odd.  Can you post the contents of
"pg_upgrade_server.log" after the failure?  This file is mentioned in
the error message.  You can email it to me privately if it is large.

My only guess is that when the server is running, you have started it
with some special options that pg_upgrade doesn't know about.  If the
server is down, pg_upgrade has to start it to check it, and that might
be causing the problem.  It would also be good to know if the failure is
with the old or new cluster.

I have never seen a failure like this.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +


Re: pg_upgrade from 9.3 to 9.4 fails

От
"Guo, Yun"
Дата:

On 10/5/15, 3:02 PM, "Bruce Momjian" <bruce@momjian.us> wrote:

>On Mon, Oct  5, 2015 at 06:23:43PM +0000, Guo, Yun wrote:
>> >CREATE TEMPORARY TABLE info_rels (reloid) AS SELECT c.oid FROM
>> >pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n   ON
>>c.relnamespace =
>> >n.oid LEFT OUTER JOIN pg_catalog.pg_index i   ON c.oid = i.indexrelid
>> >WHERE
>> >relkind IN ('r', 'm', 'i', 'S') AND  i.indisvalid IS DISTINCT FROM
>>false
>> >AND
>> > i.indisready IS DISTINCT FROM false AND   ((n.nspname !~ '^pg_temp_'
>>AND
>> >n.nspname !~ '^pg_toast_temp_' AND     n.nspname NOT IN ('pg_catalog',
>> >'information_schema', 'binary_upgrade', 'pg_toast') AND c.oid >= 16384)
>> >OR
>> >(n.nspname = 'pg_catalog' AND     relname IN ('pg_largeobject',
>> >'pg_largeobject_loid_pn_index', 'pg_largeobject_metadata',
>> >'pg_largeobject_metadata_oid_index') ));
>> 
>> Sorry I misunderstood you instruction. I tried this query in all
>>databases
>> and it executes without any error.
>> I guess that¹s the reason when I have the old server running the check
>> could pass. 
>> However, somehow when the old server is hut down it would fail.
>
>Wow, that is really odd.  Can you post the contents of
>"pg_upgrade_server.log" after the failure?  This file is mentioned in
>the error message.  You can email it to me privately if it is large.
>
>My only guess is that when the server is running, you have started it
>with some special options that pg_upgrade doesn't know about.  If the
>server is down, pg_upgrade has to start it to check it, and that might
>be causing the problem.  It would also be good to know if the failure is
>with the old or new cluster.

pg_upgrade_server.log outputs below message:

-----------------------------------------------------------------
  pg_upgrade run on Mon Oct  5 19:17:17 2015
-----------------------------------------------------------------

command: "/usr/pgsql-9.3/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
"/data/pg.old" -o "-p 50432 -b  -c listen_addresses='' -c
unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'"
start >> "pg_upgrade_server.log" 2>&1
waiting for server to start....<2015-10-05 15:17:17.802 EDT>
user=,db=,host=,processid=18761 LOG:  redirecting log output to logging
collector process
<2015-10-05 15:17:17.802 EDT> user=,db=,host=,processid=18761 HINT:
Future log output will appear in directory "pg_log".
 done
server started


command: "/usr/pgsql-9.3/bin/pg_ctl" -w -D "/data/pg.old" -o "" -m fast
stop >> "pg_upgrade_server.log" 2>&1
waiting for server to shut down.... done
server stopped

I tried to execute the start command on my own, it didn’t throw error but
the server is not accessible. I’m guessing it’s because of the
listen_addresses=‘’ parameter.


Below are the log entries during that start command:

<2015-10-05 15:42:50.506 EDT> user=,db=,host=,processid=18917 LOG:
database system was shut down at 2015-10-05 15:42:30 EDT
<2015-10-05 15:42:50.510 EDT> user=,db=,host=,processid=18915 LOG:
database system is ready to accept connections


>
>I have never seen a failure like this.
>
>-- 
>  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
>+ As you are, so once was I. As I am, so you will be. +
>+ Roman grave inscription                             +


Re: pg_upgrade from 9.3 to 9.4 fails

От
Bruce Momjian
Дата:
On Mon, Oct  5, 2015 at 07:44:51PM +0000, Guo, Yun wrote:
> >My only guess is that when the server is running, you have started it
> >with some special options that pg_upgrade doesn't know about.  If the
> >server is down, pg_upgrade has to start it to check it, and that might
> >be causing the problem.  It would also be good to know if the failure is
> >with the old or new cluster.
>
> pg_upgrade_server.log outputs below message:

So you get the error:

    ERROR:  cache lookup failed for type 1670699

from pg_upgrade starting the server, but not when the server is already
running?  I can't even guess what is wrong.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +