base backup/restore + streaming replication => weirdness

From: domehead100
Subject: base backup/restore + streaming replication => weirdness
Message-ID: 1361571063448-5746342.post@n5.nabble.com
List: pgsql-admin

I have a smallish Postgres 9.0 database with Primary and Standby instances.

These instances are set up with streaming replication from the Primary to
the Standby, so the Standby receives WAL records over TCP.  The Primary also
archives WAL files to a shared directory that is accessible from the
Standby.  The Standby runs as a hot standby and accepts read-only queries.

We had an issue this week where the shared directory where WAL files were
being archived (/pgsql_wal) ran out of space.

To restart replication, I performed a base backup on Primary (tar $PGDATA to
/pgsql_wal) and then performed a base restore (untar) on Standby.

After this, the Standby stays in recovery mode (recovery.conf never
gets renamed to recovery.done), and my check_replication.sh script shows
strange results.  The WAL location reported by the Primary (first item
below) is totally different from both the received and the applied WAL
locations on the Standby.

Primary:
 pg_current_xlog_location
--------------------------
 1E/D5C40A40           <= this looks strange
(1 row)

Standby, last received:
 pg_last_xlog_receive_location
-------------------------------
 E/BF68BD08
(1 row)

Standby, last applied:
 pg_last_xlog_replay_location
------------------------------
 E/BF68BD08
(1 row)
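
For reference, the two halves of a WAL location are hexadecimal numbers (log id, then offset within that log), so two locations can be compared numerically rather than by eye. A minimal sketch in POSIX sh follows; the function names lsn_parts and lsn_compare are made up for illustration and are not part of check_replication.sh or of PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helpers for comparing WAL locations such as "1E/D5C40A40".

# Print the two hex halves of a location as decimal numbers: "<log id> <offset>".
lsn_parts() {
    lsn_hi=${1%%/*}   # text before the slash, e.g. 1E
    lsn_lo=${1##*/}   # text after the slash, e.g. D5C40A40
    echo "$((0x$lsn_hi)) $((0x$lsn_lo))"
}

# Print "ahead", "equal" or "behind" for the first location relative to the second.
lsn_compare() {
    # Word splitting is intentional here: each call yields two numbers.
    set -- $(lsn_parts "$1") $(lsn_parts "$2")
    if [ "$1" -eq "$3" ] && [ "$2" -eq "$4" ]; then
        echo equal
    elif [ "$1" -gt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -gt "$4" ]; }; then
        echo ahead
    else
        echo behind
    fi
}
```

With the values shown above, `lsn_compare 1E/D5C40A40 E/BF68BD08` prints `ahead`: the Primary reports log id 0x1E (30) while the Standby reports 0xE (14). In a real check the arguments would presumably come from something like `psql -At -c "SELECT pg_current_xlog_location()"` on each side.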


I can connect to the Standby, and a select query seems to indicate that the
databases are in sync (they return the same value for max(<primary_key>) on
a table that is constantly receiving inserts).

One concern is that my tar command apparently did not exclude the files in
$PGDATA/pg_xlog, so those got untarred on the Standby.  Could that be a
problem?

Here's my basebackup.sh:
#! /bin/sh
# Base Backup script for streaming replication

BACKUP_FILE=/pgsql_wal/backup/pg_base_backup.tgz

psql -c "SELECT pg_start_backup('$BACKUP_FILE', true)" postgres

rm -rf $BACKUP_FILE

nice -n 10 tar czvpf $BACKUP_FILE --exclude={"$PGDATA/pg_xlog/*"} $PGDATA

psql -c "SELECT pg_stop_backup()" postgres
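
A likely reason the exclude had no effect: the braces in the tar command above contain a single item, so there is nothing for brace expansion to expand (and plain sh does not do brace expansion at all). tar then receives a pattern with a literal "{" and "}" in it, which matches no file. A minimal sketch of the same step with the pattern quoted and the braces dropped, assuming GNU tar; make_base_tar is a hypothetical name, and the pg_start_backup/pg_stop_backup calls around it stay as in the script.

```shell
#!/bin/sh
# Hypothetical helper: archive a data directory while leaving out the files
# inside its pg_xlog subdirectory (assumes GNU tar exclude semantics).
make_base_tar() {
    data_dir=$1    # e.g. "$PGDATA"
    out_file=$2    # e.g. /pgsql_wal/backup/pg_base_backup.tgz
    # The pattern is a single quoted word with no braces, so tar sees
    # <data_dir>/pg_xlog/* exactly as written.
    tar czpf "$out_file" --exclude="$data_dir/pg_xlog/*" "$data_dir"
}
```

It would be called as `make_base_tar "$PGDATA" "$BACKUP_FILE"`. Running `tar tzf "$BACKUP_FILE" | grep pg_xlog` afterwards is a quick way to confirm that no WAL segment files ended up in the archive.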

And here's my baserestore.sh:
#! /bin/sh
# Base Recovery script for streaming replication (run on Standby)
# Run as postgres user
# Postgres should be stopped

DATE=`date +%Y_%m_%d`
CONF_BACKUP_DIR=/tmp/pgsql_conf_backup_$DATE
BASE_BACKUP_FILE=/pgsql_wal/backup/pg_base_backup.tgz

#backup config files
mkdir $CONF_BACKUP_DIR
cp $PGDATA/*.conf $CONF_BACKUP_DIR
cp $PGDATA/recovery.done $CONF_BACKUP_DIR

#blow away existing data directory
rm -rf $PGDATA

#untar base backup file
cd /
tar xzvf $BASE_BACKUP_FILE

#copy configs back
cp $CONF_BACKUP_DIR/*.conf $PGDATA
cp $CONF_BACKUP_DIR/recovery.done $PGDATA/recovery.conf
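
On the pg_xlog concern: the PostgreSQL documentation on recovering from a file system level base backup advises removing any files found in pg_xlog after the restore, since they came from the backup and are probably obsolete. A sketch of such a step, which would run after the untar and before Postgres is started on the Standby; clear_restored_xlog is a hypothetical name, and whether this explains the behaviour described here is untested.

```shell
#!/bin/sh
# Hypothetical helper: delete WAL files that came along with the base backup.
# Directories (pg_xlog itself, archive_status) are kept; only regular files go.
clear_restored_xlog() {
    xlog_dir=$1/pg_xlog
    if [ ! -d "$xlog_dir" ]; then
        echo "no pg_xlog directory under $1" >&2
        return 1
    fi
    find "$xlog_dir" -type f -exec rm -f {} +
}
```

In the restore script this would be `clear_restored_xlog "$PGDATA"`, placed right after the tar xzvf line.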





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/base-backup-restore-streaming-replication-weirdness-tp5746342.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.

