Обсуждение: Issue upgrading from 9.5 to 13 with pg_upgrade: "connection to database failed: FATAL: database "template1" does not exist"
I’m running CentOS 8 on an EC2 instance and attempting to upgrade a 9.5 database to 13 using pg_upgrade. Both are runningon the same box and pass initial tests but it fails during the later part of the process. --- bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir /var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432 --socketdir=/var/run/postgresql/ Performing Consistency Checks ----------------------------- Checking cluster versions ok Checking database user is the install user ok Checking database connection settings ok Checking for prepared transactions ok Checking for reg* data types in user tables ok Checking for contrib/isn with bigint-passing mismatch ok Checking for tables WITH OIDS ok Checking for invalid "sql_identifier" user columns ok Checking for invalid "unknown" user columns ok Checking for roles starting with "pg_" ok Creating dump of global objects ok Creating dump of database schemas ok connection to database failed: FATAL: database "template1" does not exist could not connect to target postmaster started with the command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start Failure, exiting --- When I manually run that command it works, although when I attempt to connect to the ‘postgres’ database on it, it complainsit doesn’t exist. I can’t use the dump/restore method to upgrade this server as the data in question is fairly massive and in my testing ittook 45 hours to complete. Any help appreciated.
On 11/13/20 6:32 AM, Jeremy Wilson wrote: > I’m running CentOS 8 on an EC2 instance and attempting to upgrade a 9.5 database to 13 using pg_upgrade. Both are runningon the same box and pass initial tests but it fails during the later part of the process. > > --- > > bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir /var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432 --socketdir=/var/run/postgresql/ > Performing Consistency Checks > ----------------------------- > Checking cluster versions ok > Checking database user is the install user ok > Checking database connection settings ok > Checking for prepared transactions ok > Checking for reg* data types in user tables ok > Checking for contrib/isn with bigint-passing mismatch ok > Checking for tables WITH OIDS ok > Checking for invalid "sql_identifier" user columns ok > Checking for invalid "unknown" user columns ok > Checking for roles starting with "pg_" ok > Creating dump of global objects ok > Creating dump of database schemas > ok > > connection to database failed: FATAL: database "template1" does not exist > > could not connect to target postmaster started with the command: > "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start > Failure, exiting > > --- > > When I manually run that command it works, although when I attempt to connect to the ‘postgres’ database on it, it complainsit doesn’t exist. To me it seems the initdb for the 13 instance did not complete successfully. Have you tried clearing /var/lib/pgsql/13/data and doing the init over again? If you do try it monitor the output carefully. > > I can’t use the dump/restore method to upgrade this server as the data in question is fairly massive and in my testingit took 45 hours to complete. Any help appreciated. > > > > -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 9:58 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > To me it seems the initdb for the 13 instance did not complete successfully. Have you tried clearing /var/lib/pgsql/13/dataand doing the init over again? If you do try it monitor the output carefully. here’s the complete process: bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8 The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /var/lib/pgsql/13/data ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... America/Toronto creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start waiting for server to start.... done server started bash-4.4$ psql postgres psql (13.0) Type "help" for help. postgres=# \c template1 You are now connected to database "template1" as user "postgres". template1=# \q bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop waiting for server to shut down.... done server stopped bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir /var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432 Performing Consistency Checks ----------------------------- Checking cluster versions ok Checking database user is the install user ok Checking database connection settings ok Checking for prepared transactions ok Checking for reg* data types in user tables ok Checking for contrib/isn with bigint-passing mismatch ok Checking for tables WITH OIDS ok Checking for invalid "sql_identifier" user columns ok Checking for invalid "unknown" user columns ok Checking for roles starting with "pg_" ok Creating dump of global objects ok Creating dump of database schemas ok Checking for presence of required libraries ok Checking database user is the install user ok Checking for prepared transactions ok If pg_upgrade fails after this point, you must re-initdb the new cluster before continuing. Performing Upgrade ------------------ Analyzing all rows in the new cluster ok Freezing all rows in the new cluster ok Deleting files from new pg_xact ok Copying old pg_clog to new server ok Setting next transaction ID and epoch for new cluster ok Deleting files from new pg_multixact/offsets ok Copying old pg_multixact/offsets to new server ok Deleting files from new pg_multixact/members ok Copying old pg_multixact/members to new server ok Setting next multixact ID and offset for new cluster ok Resetting WAL archives ok connection to database failed: FATAL: database "template1" does not exist could not connect to target postmaster started with the command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/lib/pgsql'" start Failure, exiting
On 11/13/20 7:02 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 9:58 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> To me it seems the initdb for the 13 instance did not complete successfully. Have you tried clearing /var/lib/pgsql/13/dataand doing the init over again? If you do try it monitor the output carefully. > > here’s the complete process: > > bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8 > The files belonging to this database system will be owned by user "postgres". > This user must also own the server process. > > The database cluster will be initialized with locale "en_US.UTF-8". > The default database encoding has accordingly been set to "UTF8". > The default text search configuration will be set to "english". > > Data page checksums are disabled. > > fixing permissions on existing directory /var/lib/pgsql/13/data ... ok > creating subdirectories ... ok > selecting dynamic shared memory implementation ... posix > selecting default max_connections ... 100 > selecting default shared_buffers ... 128MB > selecting default time zone ... America/Toronto > creating configuration files ... ok > running bootstrap script ... ok > performing post-bootstrap initialization ... ok > syncing data to disk ... ok > > initdb: warning: enabling "trust" authentication for local connections > You can change this by editing pg_hba.conf or using the option -A, or > --auth-local and --auth-host, the next time you run initdb. > > Success. You can now start the database server using: > > /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start > > bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start > waiting for server to start.... done > server started > bash-4.4$ psql postgres > psql (13.0) > Type "help" for help. > > postgres=# \c template1 > You are now connected to database "template1" as user "postgres". > template1=# \q Well that blows the initdb theory out of the water. > bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop > waiting for server to shut down.... done > server stopped > bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir /var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port=54320 --new-port=5432 In your previous post you had --socketdir=/var/run/postgresql/. Did you change that or is it missing? > > connection to database failed: FATAL: database "template1" does not exist > > could not connect to target postmaster started with the command: > "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/lib/pgsql'" start Where is Postgres putting its sockets? > Failure, exiting > -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 10:09 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > In your previous post you had --socketdir=/var/run/postgresql/. Did you change that or is it missing? Sorry, here it is with the socket directory specified. bash-4.4$ rm -r 13/data/* bash-4.4$ bash-4.4$ /usr/pgsql-13/bin/initdb -D /var/lib/pgsql/13/data/ --locale=en_US.UTF-8 The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /var/lib/pgsql/13/data ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... America/Toronto creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start bash-4.4$ bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start waiting for server to start.... done server started bash-4.4$ psql template1 psql (13.0) Type "help" for help. template1=# \q bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile stop waiting for server to shut down.... done server stopped bash-4.4$ /usr/pgsql-13/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin --new-bindir /usr/pgsql-13/bin --old-datadir /var/lib/pgsql/9.5/data--new-datadir /var/lib/pgsql/13/data --old-port 54320 --new-port 5432 --socketdir /var/run/postgresql Performing Consistency Checks ----------------------------- Checking cluster versions ok Checking database user is the install user ok Checking database connection settings ok Checking for prepared transactions ok Checking for reg* data types in user tables ok Checking for contrib/isn with bigint-passing mismatch ok Checking for tables WITH OIDS ok Checking for invalid "sql_identifier" user columns ok Checking for invalid "unknown" user columns ok Checking for roles starting with "pg_" ok Creating dump of global objects ok Creating dump of database schemas ok Checking for presence of required libraries ok Checking database user is the install user ok Checking for prepared transactions ok If pg_upgrade fails after this point, you must re-initdb the new cluster before continuing. Performing Upgrade ------------------ Analyzing all rows in the new cluster ok Freezing all rows in the new cluster ok Deleting files from new pg_xact ok Copying old pg_clog to new server ok Setting next transaction ID and epoch for new cluster ok Deleting files from new pg_multixact/offsets ok Copying old pg_multixact/offsets to new server ok Deleting files from new pg_multixact/members ok Copying old pg_multixact/members to new server ok Setting next multixact ID and offset for new cluster ok Resetting WAL archives ok connection to database failed: FATAL: database "template1" does not exist could not connect to target postmaster started with the command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start Failure, exiting
Jeremy Wilson <jwilson@clover.co> writes: >> On Nov 13, 2020, at 10:09 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> In your previous post you had --socketdir=/var/run/postgresql/. Did you change that or is it missing? > Sorry, here it is with the socket directory specified. An incorrect socket-directory setting would lead to something like "could not connect to server: No such file or directory", not to a specific complaint about a database not being present. Unless ... could it be that there is another PG server active on the machine, whose cluster lacks a "template1" database? Seems unlikely, but you might try confirming with "ps auxww | grep post" or the like. I also wonder if the cluster you're trying to upgrade from has a "template1" database. Not sure if anyone has ever tested whether pg_upgrade can cope with the lack of one. regards, tom lane
> On Nov 13, 2020, at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Unless ... could it be that there is another PG server active on the > machine, whose cluster lacks a "template1" database? Seems unlikely, > but you might try confirming with "ps auxww | grep post" or the like. This is a test environment so only the two databases are on it and neither is running. pg_upgrade checks if either is running. > I also wonder if the cluster you're trying to upgrade from has a > "template1" database. Not sure if anyone has ever tested whether > pg_upgrade can cope with the lack of one. It’s there: $ psql -p 54320 template1 psql (13.0, server 9.5.23) Type "help" for help. template1=#
Jeremy Wilson <jwilson@clover.co> writes: > [ no soap on either of my theories ] Hmph. We know that 9.5 -> 13 pg_upgrade works in simple scenarios, because the buildfarm tests that every day. So there has to be something out of the ordinary about your setup. Any unusual extensions, pg_hba.conf configuration, etc? regards, tom lane
On 11/13/20 7:28 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 10:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Unless ... could it be that there is another PG server active on the >> machine, whose cluster lacks a "template1" database? Seems unlikely, >> but you might try confirming with "ps auxww | grep post" or the like. > > This is a test environment so only the two databases are on it and neither is running. pg_upgrade checks if either isrunning. When you manually run the pg_upgrade pg_ctl script the server starts but you cannot connect to any database in it correct? What does pg_upgrade_server.log show when you do above? > >> I also wonder if the cluster you're trying to upgrade from has a >> "template1" database. Not sure if anyone has ever tested whether >> pg_upgrade can cope with the lack of one. > > > It’s there: > > $ psql -p 54320 template1 > psql (13.0, server 9.5.23) > Type "help" for help. > > template1=# > > > -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 10:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Hmph. We know that 9.5 -> 13 pg_upgrade works in simple scenarios, > because the buildfarm tests that every day. So there has to be > something out of the ordinary about your setup. Any unusual > extensions, pg_hba.conf configuration, etc? We have postgis installed, here’s a list of most of the extensions: master=# \dx List of installed extensions Name | Version | Schema | Description --------------------+---------+------------+--------------------------------------------------------------------- btree_gin | 1.0 | public | support for indexing common datatypes in GIN cube | 1.0 | public | data type for multidimensional cubes dblink | 1.1 | public | connect to other PostgreSQL databases from within a database earthdistance | 1.0 | public | calculate great-circle distances on the surface of the Earth pageinspect | 1.3 | public | inspect the contents of database pages at a low level pg_buffercache | 1.1 | public | examine the shared buffer cache pg_stat_statements | 1.3 | public | track execution statistics of all SQL statements executed pg_trgm | 1.1 | public | text similarity measurement and index searching based on trigrams plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language postgis | 3.0.2 | public | PostGIS geometry, geography, and raster spatial types and functions postgis_raster | 3.0.2 | public | PostGIS raster types and functions postgres_fdw | 1.0 | public | foreign-data wrapper for remote PostgreSQL servers unaccent | 1.0 | public | text search dictionary that removes accents (13 rows) I did have to upgrade all the postgis from 2.2 to 3.0 which went smoothly.
> On Nov 13, 2020, at 11:06 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > When you manually run the pg_upgrade pg_ctl script the server starts but you cannot connect to any database in it correct? Yes. > What does pg_upgrade_server.log show when you do above? ----------------------------------------------------------------- pg_upgrade run on Fri Nov 13 10:11:45 2020 ----------------------------------------------------------------- command: "/usr/pgsql-9.5/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/9.5/data" -o "-p 54320 -b -c listen_addresses=''-c unix_socket_permissions=0700 -c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log"2>&1 waiting for server to start....2020-11-13 10:11:46 EST [204620]: [1-1] user=,db=,app=,client= LOG: ending log output tostderr 2020-11-13 10:11:46 EST [204620]: [2-1] user=,db=,app=,client= HINT: Future log output will go to log destination "syslog". done server started command: "/usr/pgsql-9.5/bin/pg_ctl" -w -D "/var/lib/pgsql/9.5/data" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1 waiting for server to shut down.... done server stopped command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1 waiting for server to start....2020-11-13 10:11:54.944 EST [204724] LOG: redirecting log output to logging collector process 2020-11-13 10:11:54.944 EST [204724] HINT: Future log output will appear in directory "log". done server started command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m smart stop >> "pg_upgrade_server.log" 2>&1 waiting for server to shut down....................... done server stopped command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1 waiting for server to start....2020-11-13 10:12:15.902 EST [204782] LOG: redirecting log output to logging collector process 2020-11-13 10:12:15.902 EST [204782] HINT: Future log output will appear in directory "log". done server started command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m fast stop >> "pg_upgrade_server.log" 2>&1 waiting for server to shut down.... done server stopped
On 11/13/20 8:19 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 11:06 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> When you manually run the pg_upgrade pg_ctl script the server starts but you cannot connect to any database in it correct? > > Yes. > >> What does pg_upgrade_server.log show when you do above? > > ----------------------------------------------------------------- > pg_upgrade run on Fri Nov 13 10:11:45 2020 > ----------------------------------------------------------------- > > > command: "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start >> "pg_upgrade_server.log" 2>&1 > waiting for server to start....2020-11-13 10:12:15.902 EST [204782] LOG: redirecting log output to logging collector process > 2020-11-13 10:12:15.902 EST [204782] HINT: Future log output will appear in directory "log". > done > server started What shows up in the log file in "log" directory when you try to connect to a database? > > > command: "/usr/pgsql-13/bin/pg_ctl" -w -D "/var/lib/pgsql/13/data" -o "" -m fast stop >> "pg_upgrade_server.log" 2>&1 > waiting for server to shut down.... done > server stopped > -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > What shows up in the log file in "log" directory when you try to connect to a database? 2020-11-13 10:14:35.821 EST [204797] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121(Red Hat 8.3.1-5), 64-bit 2020-11-13 10:14:35.822 EST [204797] LOG: listening on IPv4 address "127.0.0.1", port 5432 2020-11-13 10:14:35.824 EST [204797] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 10:14:35.828 EST [204797] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2020-11-13 10:14:35.833 EST [204799] LOG: database system was shut down at 2020-11-13 10:12:15 EST 2020-11-13 10:14:35.845 EST [204797] LOG: database system is ready to accept connections 2020-11-13 10:14:45.416 EST [204797] LOG: received fast shutdown request 2020-11-13 10:14:45.418 EST [204797] LOG: aborting any active transactions 2020-11-13 10:14:45.421 EST [204797] LOG: background worker "logical replication launcher" (PID 204805) exited with exitcode 1 2020-11-13 10:14:45.421 EST [204800] LOG: shutting down 2020-11-13 10:14:45.442 EST [204797] LOG: database system is shut down
> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > What shows up in the log file in "log" directory when you try to connect to a database? I did a completely fresh initdb to get clean logs: bash-4.4$ cat 13/data/log/postgresql-Fri.log 2020-11-13 11:29:44.744 EST [205647] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121(Red Hat 8.3.1-5), 64-bit 2020-11-13 11:29:44.744 EST [205647] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 11:29:44.748 EST [205649] LOG: database system was shut down at 2020-11-13 11:29:19 EST 2020-11-13 11:29:44.755 EST [205647] LOG: database system is ready to accept connections free(): invalid pointer 2020-11-13 11:29:45.771 EST [205647] LOG: received smart shutdown request 2020-11-13 11:30:05.292 EST [205647] LOG: server process (PID 205659) was terminated by signal 6: Aborted 2020-11-13 11:30:05.292 EST [205647] LOG: terminating any other active server processes 2020-11-13 11:30:05.295 EST [205647] LOG: abnormal database system shutdown 2020-11-13 11:30:05.306 EST [205647] LOG: database system is shut down 2020-11-13 11:30:06.136 EST [205704] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121(Red Hat 8.3.1-5), 64-bit 2020-11-13 11:30:06.136 EST [205704] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 11:30:06.141 EST [205706] LOG: database system was shut down at 2020-11-13 11:30:05 EST 2020-11-13 11:30:06.148 EST [205704] LOG: database system is ready to accept connections 2020-11-13 11:30:06.195 EST [205712] FATAL: database "template1" does not exist 2020-11-13 11:30:06.200 EST [205704] LOG: received fast shutdown request 2020-11-13 11:30:06.201 EST [205704] LOG: aborting any active transactions 2020-11-13 11:30:06.204 EST [205704] LOG: background worker "logical replication launcher" (PID 205711) exited with exitcode 1 2020-11-13 11:30:06.204 EST [205707] LOG: shutting down 2020-11-13 11:30:06.219 EST [205704] LOG: database system is shut down
On 11/13/20 8:27 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> What shows up in the log file in "log" directory when you try to connect to a database? > > 2020-11-13 10:14:35.821 EST [204797] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.120191121 (Red Hat 8.3.1-5), 64-bit > 2020-11-13 10:14:35.822 EST [204797] LOG: listening on IPv4 address "127.0.0.1", port 5432 > 2020-11-13 10:14:35.824 EST [204797] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" > 2020-11-13 10:14:35.828 EST [204797] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" I'm pretty sure the above, two sockets, should not happen. Have to think on how that could come to pass. > 2020-11-13 10:14:35.833 EST [204799] LOG: database system was shut down at 2020-11-13 10:12:15 EST > 2020-11-13 10:14:35.845 EST [204797] LOG: database system is ready to accept connections > 2020-11-13 10:14:45.416 EST [204797] LOG: received fast shutdown request > 2020-11-13 10:14:45.418 EST [204797] LOG: aborting any active transactions > 2020-11-13 10:14:45.421 EST [204797] LOG: background worker "logical replication launcher" (PID 204805) exited with exitcode 1 > 2020-11-13 10:14:45.421 EST [204800] LOG: shutting down > 2020-11-13 10:14:45.442 EST [204797] LOG: database system is shut down > -- Adrian Klaver adrian.klaver@aklaver.com
On 11/13/20 8:31 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> What shows up in the log file in "log" directory when you try to connect to a database? > > I did a completely fresh initdb to get clean logs: Was the below from starting using the pg_upgrade version of pg_ctl start? > > bash-4.4$ cat 13/data/log/postgresql-Fri.log > 2020-11-13 11:29:44.744 EST [205647] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.120191121 (Red Hat 8.3.1-5), 64-bit > 2020-11-13 11:29:44.744 EST [205647] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" > 2020-11-13 11:29:44.748 EST [205649] LOG: database system was shut down at 2020-11-13 11:29:19 EST > 2020-11-13 11:29:44.755 EST [205647] LOG: database system is ready to accept connections > free(): invalid pointer > 2020-11-13 11:29:45.771 EST [205647] LOG: received smart shutdown request > 2020-11-13 11:30:05.292 EST [205647] LOG: server process (PID 205659) was terminated by signal 6: Aborted > 2020-11-13 11:30:05.292 EST [205647] LOG: terminating any other active server processes > 2020-11-13 11:30:05.295 EST [205647] LOG: abnormal database system shutdown > 2020-11-13 11:30:05.306 EST [205647] LOG: database system is shut down > 2020-11-13 11:30:06.136 EST [205704] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.120191121 (Red Hat 8.3.1-5), 64-bit > 2020-11-13 11:30:06.136 EST [205704] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" > 2020-11-13 11:30:06.141 EST [205706] LOG: database system was shut down at 2020-11-13 11:30:05 EST > 2020-11-13 11:30:06.148 EST [205704] LOG: database system is ready to accept connections > 2020-11-13 11:30:06.195 EST [205712] FATAL: database "template1" does not exist > 2020-11-13 11:30:06.200 EST [205704] LOG: received fast shutdown request > 2020-11-13 11:30:06.201 EST [205704] LOG: aborting any active transactions > 2020-11-13 11:30:06.204 EST [205704] LOG: background worker "logical replication launcher" (PID 205711) exited with exitcode 1 > 2020-11-13 11:30:06.204 EST [205707] LOG: shutting down > 2020-11-13 11:30:06.219 EST [205704] LOG: database system is shut down > -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 11:35 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > On 11/13/20 8:31 AM, Jeremy Wilson wrote: >>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >>> >>> What shows up in the log file in "log" directory when you try to connect to a database? >> I did a completely fresh initdb to get clean logs: > > Was the below from starting using the pg_upgrade version of pg_ctl start? No, strictly from running pg_upgrade. Here’s the additional logs from manually running it: 2020-11-13 11:36:35.976 EST [205733] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121(Red Hat 8.3.1-5), 64-bit 2020-11-13 11:36:35.976 EST [205733] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 11:36:35.981 EST [205735] LOG: database system was shut down at 2020-11-13 11:30:06 EST 2020-11-13 11:36:35.989 EST [205733] LOG: database system is ready to accept connections
On 11/13/20 8:37 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 11:35 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> On 11/13/20 8:31 AM, Jeremy Wilson wrote: >>>> On Nov 13, 2020, at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >>>> >>>> What shows up in the log file in "log" directory when you try to connect to a database? >>> I did a completely fresh initdb to get clean logs: >> >> Was the below from starting using the pg_upgrade version of pg_ctl start? > > No, strictly from running pg_upgrade. > > Here’s the additional logs from manually running it: > > 2020-11-13 11:36:35.976 EST [205733] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.120191121 (Red Hat 8.3.1-5), 64-bit > 2020-11-13 11:36:35.976 EST [205733] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" > 2020-11-13 11:36:35.981 EST [205735] LOG: database system was shut down at 2020-11-13 11:30:06 EST > 2020-11-13 11:36:35.989 EST [205733] LOG: database system is ready to accept connections > This does not show trying to connect to a database. It would help to list the commands run and then the corresponding log portions. -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 11:39 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > This does not show trying to connect to a database. It would help to list the commands run and then the corresponding logportions. bash-4.4$ "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start waiting for server to start.... done server started bash-4.4$ psql postgres psql: error: could not connect to server: FATAL: database "postgres" does not exist --- 2020-11-13 11:39:38.378 EST [205747] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.1 20191121(Red Hat 8.3.1-5), 64-bit 2020-11-13 11:39:38.378 EST [205747] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 11:39:38.382 EST [205749] LOG: database system was shut down at 2020-11-13 11:37:34 EST 2020-11-13 11:39:38.388 EST [205747] LOG: database system is ready to accept connections 2020-11-13 11:39:45.750 EST [205756] FATAL: database "postgres" does not exist
Jeremy Wilson <jwilson@clover.co> writes: > I did a completely fresh initdb to get clean logs: > ... > free(): invalid pointer > 2020-11-13 11:30:05.292 EST [205647] LOG: server process (PID 205659) was terminated by signal 6: Aborted This is highly significant. It suggests that you're getting bit by the postgis crash-in-atexit problem that a couple of people have reported (and which, unfortunately, we don't yet know the exact cause of). I now suspect that something similar is happening earlier in the process and preventing template1 from getting created, or maybe it's created but then we can't access it. Are you by any chance trying to preload any of the postgis-related extensions? If so, try not doing that. regards, tom lane
> On Nov 13, 2020, at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Are you by any chance trying to preload any of the postgis-related > extensions? If so, try not doing that. Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed andworking in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
On 11/13/20 8:40 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 11:39 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> This does not show trying to connect to a database. It would help to list the commands run and then the correspondinglog portions. > > bash-4.4$ "/usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off-c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700-c unix_socket_directories='/var/run/postgresql'" start > waiting for server to start.... done > server started > bash-4.4$ psql postgres > psql: error: could not connect to server: FATAL: database "postgres" does not exist > > --- > > 2020-11-13 11:39:38.378 EST [205747] LOG: starting PostgreSQL 13.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.3.120191121 (Red Hat 8.3.1-5), 64-bit > 2020-11-13 11:39:38.378 EST [205747] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" > 2020-11-13 11:39:38.382 EST [205749] LOG: database system was shut down at 2020-11-13 11:37:34 EST > 2020-11-13 11:39:38.388 EST [205747] LOG: database system is ready to accept connections > 2020-11-13 11:39:45.750 EST [205756] FATAL: database "postgres" does not exist > Hmm. You can still connect if you use?: /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start If so that would to me point to one of the options in the -o string. I would try dropping/changing them one at a time to see what happens. I would say start with the items in the "-c synchronous_commit=off -c fsync=off -c full_page_writes=off" section. -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 12:06 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > Hmm. You can still connect if you use?: > > /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start Same result. bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start waiting for server to start.... done server started bash-4.4$ psql postgres psql: error: could not connect to server: FATAL: database "postgres" does not exist
Bruce Momjian <bruce@momjian.us> writes: > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote: >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed andworking in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13. > I think he is asking about shared_preload_libraries, > local_preload_libraries, and session_preload_libraries. Yeah, but if Jeremy isn't touching the new cluster's config between initdb and pg_upgrade, those won't be set. I'm kind of baffled at this point. It seems pretty likely that this is related to the v13 postgis problems we've heard a few reports of, but the symptoms are a lot different. Best advice I can give is to go inquire on the postgis mailing lists as to whether they've figured out the "free(): invalid pointer" issue. (I assume that dropping postgis from the source DB is not an option...) regards, tom lane
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12 postgis30_12-devel postgis30_12-utils postgis30_12-client postgis30_12-docs
On Fri, Nov 13, 2020 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
> >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed and working in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
>
> > I think he is asking about shared_preload_libraries,
> > local_preload_libraries, and session_preload_libraries.
>
> Yeah, but if Jeremy isn't touching the new cluster's config between
> initdb and pg_upgrade, those won't be set.
>
> I'm kind of baffled at this point. It seems pretty likely that this
> is related to the v13 postgis problems we've heard a few reports of,
> but the symptoms are a lot different.
>
> Best advice I can give is to go inquire on the postgis mailing lists
> as to whether they've figured out the "free(): invalid pointer"
> issue. (I assume that dropping postgis from the source DB is not
> an option...)
This is not actually a PostGIS problem, it's a problem with our yum repository packaging.
The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.
You can check it by doing:
ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj
If that shows up two different "proj" libraries, then you have that same problem.
In this case, uninstall the OS supplied "proj" library. If that removes postgis through dependency, let it and then install it with:
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12
as a workaround.
*If* the root cause is the same one, that is...
On Fri, Nov 13, 2020 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
> >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed and working in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
>
> > I think he is asking about shared_preload_libraries,
> > local_preload_libraries, and session_preload_libraries.
>
> Yeah, but if Jeremy isn't touching the new cluster's config between
> initdb and pg_upgrade, those won't be set.
>
> I'm kind of baffled at this point. It seems pretty likely that this
> is related to the v13 postgis problems we've heard a few reports of,
> but the symptoms are a lot different.
>
> Best advice I can give is to go inquire on the postgis mailing lists
> as to whether they've figured out the "free(): invalid pointer"
> issue. (I assume that dropping postgis from the source DB is not
> an option...)
This is not actually a PostGIS problem, it's a problem with our yum repository packaging.
The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.
You can check it by doing:
ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj
If that shows up two different "proj" libraries, then you have that same problem.
In this case, uninstall the OS supplied "proj" library. If that removes postgis through dependency, let it and then install it with:
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12
as a workaround.
*If* the root cause is the same one, that is...
On Fri, Nov 13, 2020 at 7:10 PM Magnus Hagander <magnus@hagander.net> wrote:
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12 postgis30_12-devel postgis30_12-utils postgis30_12-client postgis30_12-docs
On Fri, Nov 13, 2020 at 7:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Nov 13, 2020 at 12:06:34PM -0500, Jeremy Wilson wrote:
> >> Not sure what you mean by this - I’ve installed the postgis packages for 9.5 and 13 and the extensions are installed and working in 9.5, but I’m not doing anything but initdb and then pg_upgrade for 13.
>
> > I think he is asking about shared_preload_libraries,
> > local_preload_libraries, and session_preload_libraries.
>
> Yeah, but if Jeremy isn't touching the new cluster's config between
> initdb and pg_upgrade, those won't be set.
>
> I'm kind of baffled at this point. It seems pretty likely that this
> is related to the v13 postgis problems we've heard a few reports of,
> but the symptoms are a lot different.
>
> Best advice I can give is to go inquire on the postgis mailing lists
> as to whether they've figured out the "free(): invalid pointer"
> issue. (I assume that dropping postgis from the source DB is not
> an option...)
This is not actually a PostGIS problem, it's a problem with our yum repository packaging.
The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time.
You can check it by doing:
ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj
If that shows up two different "proj" libraries, then you have that same problem.
In this case, uninstall the OS supplied "proj" library. If that removes postgis through dependency, let it and then install it with:
dnf install --excludepkg proj --excludepkg proj-datumgrid postgis30_12
... and that should of course be postgis30_13 if you're on PostgreSQL 13...
On 11/13/20 9:12 AM, Jeremy Wilson wrote: > > >> On Nov 13, 2020, at 12:06 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote: >> >> Hmm. You can still connect if you use?: >> >> /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start > > Same result. > > bash-4.4$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start > waiting for server to start.... done > server started > bash-4.4$ psql postgres > psql: error: could not connect to server: FATAL: database "postgres" does not exist > In this post: https://www.postgresql.org/message-id/793E4164-90E9-41F0-B74C-129E1DB76408%40clover.co it worked. Though in that case pg_upgrade was not run before you ran: /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data/ -l logfile start Can you start with clean initdb and then run above command to see if you can connect. Then run: /usr/pgsql-13/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D "/var/lib/pgsql/13/data" -o "-p 5432 -b -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0 -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/pgsql'" start and see if you can connect. Leave pg_upgrade out of the mix to see if it is indeed the issue. -- Adrian Klaver adrian.klaver@aklaver.com
> On Nov 13, 2020, at 1:10 PM, Magnus Hagander <magnus@hagander.net> wrote: > > The problem is that postgis, through gdal, ended up being linked to two different versions of proj at the same time. > > You can check it by doing: > ldd /usr/pgsql-13/lib/postgis_raster-3.so | grep proj > > If that shows up two different "proj" libraries, then you have that same problem. Just wanted to reply that this was indeed an issue. After removing the proj RPMs and reinstalling, the upgrade seems tobe running now. On a side note, you need to enable the PowerTools repo on CentOS to install PostGIS, otherwise it complains about the gdallibraries being missing. If I run into another failure I’ll update the list - thanks!
Hi, On Fri, 2020-11-13 at 14:19 -0500, Jeremy Wilson wrote: > > If that shows up two different "proj" libraries, then you have that > > same problem. > > Just wanted to reply that this was indeed an issue. After removing > the proj RPMs and reinstalling, the upgrade seems to be running now. (rpm packager speaking) Ouch. I partially fixed this problem (clean installations should not have this issue, as no other package requires PROJ, and we already have our own PROJ. Let me see what I can do to fix this completely. Regards, -- Devrim Gündüz Open Source Solution Architect, Red Hat Certified Engineer Twitter: @DevrimGunduz , @DevrimGunduzTR