After upgrade to 9.3, streaming replication fails to start

Поиск
Список
Период
Сортировка
От Jeff Ross
Тема After upgrade to 9.3, streaming replication fails to start
Дата
Msg-id 527A7EF7.80106@wykids.org
обсуждение исходный текст
Ответы Re: After upgrade to 9.3, streaming replication fails to start
Список pgsql-general
Hi all,

I recently installed 9.3 into my work servers.

With 9.2 I had streaming replication working.  Primary server is on
nirvana, standby server is on dukkha.

Upgrade on nirvana went just fine--no problems there.  Initial
installation on dukkha went fine as well, but streaming replication
fails to start with the error "PANIC:  could not locate a valid
checkpoint record".

Here's what I'm doing to start streaming replication:

On the standby:

_postgresql@dukkha:/var/postgresql $ initdb -D /var/postgresql/data.93.5432
The files belonging to this database system will be owned by user
"_postgresql".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/postgresql/data.93.5432 ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
creating configuration files ... ok
creating template1 database in /var/postgresql/data.93.5432/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... not supported on this platform
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

     postgres -D /var/postgresql/data.93.5432
or
     pg_ctl -D /var/postgresql/data.93.5432 -l logfile start

_postgresql@dukkha:/var/postgresql $ cp pg_hba.conf data.93.5432/
_postgresql@dukkha:/var/postgresql $ echo "include
'/var/postgresql/my_postgresql.conf'" >> data.93.5432/postgresql.conf
_postgresql@dukkha:/var/postgresql $ tail -1 data.93.5432/postgresql.conf
include '/var/postgresql/my_postgresql.conf'
_postgresql@dukkha:/var/postgresql $ cat my_postgresql.conf
#my settings
listen_addresses = 'localhost,10.31.16.3'               # what IP
address(es) to listen on;
unix_socket_directories = '/var/postgresql/'            # (change
requires restart)
log_line_prefix = '<%u%%%d> '                   # special values:
log_timezone = 'America/Denver'
timezone = 'America/Denver'
log_statement = 'all'                   # none, ddl, mod, all
archive_mode = on
wal_level = hot_standby                 # minimal, archive, or hot_standby
archive_command = 'cd .'
default_statistics_target = 50 # pgtune wizard 2009-12-19
maintenance_work_mem = 120MB # pgtune wizard 2009-12-19
constraint_exclusion = on # pgtune wizard 2009-12-19
checkpoint_completion_target = 0.9 # pgtune wizard 2009-12-19
effective_cache_size = 1408MB # pgtune wizard 2009-12-19
work_mem = 12MB # pgtune wizard 2009-12-19
wal_buffers = 8MB # pgtune wizard 2009-12-19
checkpoint_segments = 16 # pgtune wizard 2009-12-19
shared_buffers = 480MB # pgtune wizard 2009-12-19
max_connections = 100 # pgtune wizard 2009-12-19
hot_standby = on

_postgresql@dukkha:/var/postgresql $ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=nirvana.internal port=5432 user=wykids_replication'
restore_command = 'cp /wal/%f %p'
archive_cleanup_command = 'pg_archivecleanup /wal %r'

Switch to the primary:

_postgresql@nirvana:/var/postgresql $ cat start_hot_standby.sh
#!/bin/sh
backup_label=wykids_`date +%Y-%m-%d`
#remove any existing wal files on the standby
ssh dukkha.internal rm -rf /wal/*
#stop the standby server if it is running
ssh dukkha.internal sudo /usr/local/bin/svc -d /service/postgresql.5432
psql -c "select pg_start_backup('$backup_label');" template1
rsync \
         --copy-links \
         --delete \
         --exclude=backup_label \
         --exclude=postgresql.conf \
         --exclude=recovery.done \
         -e ssh -avz /var/postgresql/data.93.5432/ \
         dukkha.internal:/var/postgresql/data.93.5432/
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/*
ssh dukkha.internal rm -f
/var/postgresql/data.93.5432/pg_xlog/archive_status/*
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_log/*
ssh dukkha.internal rm -f /var/postgresql/data.93.5432/postmaster.pid
ssh dukkha.internal ln -s /var/postgresql/recovery.conf
/var/postgresql/data.93.5432/recovery.conf
psql -c "select pg_stop_backup();" template1
ssh dukkha.internal sudo /usr/local/bin/svc -u /service/postgresql.5432


_postgresql@nirvana:/var/postgresql $ sh -x start_hot_standby.sh
+ date +%Y-%m-%d
+ backup_label=wykids_2013-11-06
+ ssh dukkha.internal rm -rf /wal/*
+ ssh dukkha.internal sudo /usr/local/bin/svc -d /service/postgresql.5432
+ rsync -e ssh /wal/ dukkha.internal:/wal/
skipping directory .
+ psql -c select pg_start_backup('wykids_2013-11-06'); template1
  pg_start_backup
-----------------
  0/3D000024
(1 row)

+ rsync --copy-links --delete --exclude=backup_label
--exclude=postgresql.conf --exclude=recovery.done -e ssh -avz
/var/postgresql/data.93.5432/ dukkha.internal:/var/postgresql/data.93.5432/
sending incremental file list
./
(snip rsync output)

sent 239,677,033 bytes  received 225,961 bytes  1,139,681.68 bytes/sec
total size is 1,426,541,526  speedup is 5.95
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_xlog/*
rm: /var/postgresql/data.93.5432/pg_xlog/archive_status: is a directory
+ ssh dukkha.internal rm -f
/var/postgresql/data.93.5432/pg_xlog/archive_status/*
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/pg_log/*
+ ssh dukkha.internal rm -f /var/postgresql/data.93.5432/postmaster.pid
+ ssh dukkha.internal ln -s /var/postgresql/recovery.conf
/var/postgresql/data.93.5432/recovery.conf
+ psql -c select pg_stop_backup(); template1
NOTICE:  pg_stop_backup complete, all required WAL segments have been
archived
  pg_stop_backup
----------------
  0/3D0000DC
(1 row)

+ ssh dukkha.internal sudo /usr/local/bin/svc -u /service/postgresql.5432

The standby server fails to start with the following:

2013-11-06 07:50:23.436056500 <%> LOG:  database system was interrupted;
last known up at 2013-11-06 07:47:07 MST
2013-11-06 07:50:23.495849500 <%> LOG:  entering standby mode
2013-11-06 07:50:23.495854500 <%> LOG:  invalid primary checkpoint record
2013-11-06 07:50:23.495858500 <%> LOG:  invalid secondary checkpoint record
2013-11-06 07:50:23.495862500 <%> PANIC:  could not locate a valid
checkpoint record

I'm sure I'm missing something simple here, but I'm not seeing what it is.

Thanks!

Jeff Ross


В списке pgsql-general по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: Is it advisable to pg_upgrade directly from 9.0 to 9.3?
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: After upgrade to 9.3, streaming replication fails to start