Thread: BUG #6094: Streaming replication does not catch up when writing enough data
BUG #6094: Streaming replication does not catch up when writing enough data
From
"David Hartveld"
Date:
The following bug has been logged online:
Bug reference: 6094
Logged by: David Hartveld
Email address: david.hartveld@mendix.com
PostgreSQL version: 9.1-beta2
Operating system: Debian GNU/Linux 6.0.2 "Squeeze"
Description: Streaming replication does not catch up when writing enough data
Details:
After creation of two new clusters, and setting them up as master and slave
(in async mode, according to the current 9.1 docs), the execution of a large
SQL script (creating a db, tables, sequences, etc., filling them with data
through COPY) runs properly on the master, but does not stream to the slave,
i.e. the slave does not catch up. In the master log, the following line is
printed many times:
2011-07-07 13:48:27 CEST LOG: could not send data to client: Connection reset by peer
In the slave log, the following corresponding lines are printed, for each
log line on the master:
2011-07-07 13:48:27 CEST LOG: streaming replication successfully connected to primary
2011-07-07 13:48:27 CEST LOG: record with zero length at 0/51E0010
2011-07-07 13:48:27 CEST FATAL: terminating walreceiver process due to administrator command
cp: cannot stat `/walshipping/9.1/test/000000010000000000000005': No such file or directory
2011-07-07 13:48:27 CEST LOG: record with zero length at 0/51E4010
cp: cannot stat `/walshipping/9.1/test/000000010000000000000005': No such file or directory
The 'record with zero length' line is printed many times.
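The file name in the failing cp can be cross-checked against the WAL positions in the log. The following is only a minimal sketch, assuming the default 16 MiB segment size and timeline 1; the function name wal_segment_name is hypothetical and not a PostgreSQL tool:

```shell
# Derive the WAL segment file name for a location such as 0/51E0010.
# Assumes 16 MiB segments (the default); the timeline is passed in explicitly.
wal_segment_name() {
    tli=$1
    logid=${2%%/*}      # part before the slash (hex)
    offset=${2##*/}     # part after the slash (hex)
    # File name = timeline, log id, segment number, each as 8 hex digits.
    printf '%08X%08X%08X\n' "$tli" "$((0x$logid))" "$(( 0x$offset / 16777216 ))"
}

wal_segment_name 1 0/51E0010   # prints 000000010000000000000005
```

Under these assumptions both reported positions (0/51E0010 and 0/51E4010) fall in the segment that restore_command fails to find in the archive directory.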
I have configured the clusters with the following 'script':
EDITOR=/usr/bin/vim
MASTER=pg-db-01
SLAVE=pg-db-02
PORT=3000
VERSION=9.1
CLUSTERNAME=test
BOTH
- Create 9.1 cluster on port 3000
# pg_createcluster -p $PORT $VERSION $CLUSTERNAME
- Add line 'host all all samenet trust' to pg_hba.conf.
# $EDITOR /etc/postgresql/$VERSION/$CLUSTERNAME/pg_hba.conf
- Listen on all IPs: Change 'listen_addresses' to '*' in postgresql.conf.
# $EDITOR /etc/postgresql/$VERSION/$CLUSTERNAME/postgresql.conf
MASTER
- Enable wal archiving. Set the following configuration parameters in
postgresql.conf
(and create directory /walshipping/9.1/test, owned by postgres):
wal_level = hot_standby
archive_mode = on
archive_command = 'cp -i %p /walshipping/9.1/test/%f < /dev/null'
# $EDITOR /etc/postgresql/$VERSION/$CLUSTERNAME/postgresql.conf
- To enable streaming replication, set the following configuration
parameters in postgresql.conf:
wal_keep_segments = 64 # * 16 MiB, 1 GiB disk space needed.
max_wal_senders = 1 # Or some other number at least equal to the number of standby servers.
# $EDITOR /etc/postgresql/$VERSION/$CLUSTERNAME/postgresql.conf
- Also add line 'host replication postgres samenet trust' to pg_hba.conf
# $EDITOR /etc/postgresql/$VERSION/$CLUSTERNAME/pg_hba.conf
- Start the cluster.
# pg_ctlcluster $VERSION $CLUSTERNAME start
- Create a base backup for the slave.
# psql -U postgres -h localhost -p $PORT \
-c "SELECT pg_start_backup('base', true)"
# rsync -a /var/lib/postgresql/$VERSION/$CLUSTERNAME/* \
/pgbackup/$VERSION/$CLUSTERNAME/
# psql -U postgres -h localhost -p $PORT \
-c "SELECT pg_stop_backup()"
# rm -rf /pgbackup/$VERSION/$CLUSTERNAME/{postmaster.pid,pg_xlog/*}
# cd /pgbackup/$VERSION
# tar jcvf $CLUSTERNAME.tar.bz2 ./$CLUSTERNAME/
SLAVE
- 'Restore' the created backup from the master.
# cd /var/lib/postgresql/$VERSION
# rm -rf $CLUSTERNAME.orig
# mv -f $CLUSTERNAME $CLUSTERNAME.orig
# tar jxvf /$CLUSTERNAME.tar.bz2
- Create recovery.conf with the following configuration parameters:
standby_mode = 'on'
primary_conninfo = 'host=$MASTER port=$PORT user=postgres'
restore_command = 'cp /walshipping/$VERSION/$CLUSTERNAME/%f %p'
# $EDITOR /var/lib/postgresql/$VERSION/$CLUSTERNAME/recovery.conf
- Start the cluster.
# chown -R postgres.postgres $CLUSTERNAME
# chmod 0700 $CLUSTERNAME
# pg_ctlcluster $VERSION $CLUSTERNAME start
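Once both clusters are running, whether the slave is catching up can be judged by comparing the output of SELECT pg_current_xlog_location() on the master with SELECT pg_last_xlog_receive_location() on the slave. The helper below is a hedged sketch for turning two such locations into a byte difference; the function names are hypothetical, and the factor 0xFF000000 bytes per log id is an assumption about the WAL layout of this PostgreSQL generation (255 usable 16 MiB segments per log id):

```shell
# Convert a WAL location "logid/offset" (both hex) to an absolute byte count.
# Assumption: each log id spans 0xFF000000 bytes.
xlog_to_bytes() {
    echo $(( 0x${1%%/*} * 0xFF000000 + 0x${1##*/} ))
}

# Print how many bytes the second location is behind the first.
xlog_lag_bytes() {
    echo $(( $(xlog_to_bytes "$1") - $(xlog_to_bytes "$2") ))
}

# Sample inputs only: the two positions that appear in the slave log above.
xlog_lag_bytes 0/51E4010 0/51E0010   # prints 16384
```

A difference that keeps growing while the load script runs would indicate that the slave is not receiving WAL, rather than merely replaying it slowly.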
Re: BUG #6094: Streaming replication does not catch up when writing enough data
From
Simon Riggs
Date:
On Thu, Jul 7, 2011 at 1:05 PM, David Hartveld <david.hartveld@mendix.com> wrote:
>
> The following bug has been logged online:
>
> Bug reference: 6094
> Logged by: David Hartveld
> Email address: david.hartveld@mendix.com
> PostgreSQL version: 9.1-beta2
> Operating system: Debian GNU/Linux 6.0.2 "Squeeze"
> Description: Streaming replication does not catch up when writing
> enough data
> Details:
>
> After creation of two new clusters, and setting them up as master and slave
> (in async mode, according to the current 9.1 docs), the execution of a large
> SQL script (creating a db, tables, sequences, etc., filling them with data
> through COPY) runs properly on the master, but does not stream to the slave,
> i.e. the slave does not catch up. In the master log, the following line is
> printed many times:

Your output indicates that there is a problem in your replication setup,
and this is why the slave does not catch up. This is not a performance issue.

It is either a bug in replication or a user configuration issue. Since few
things have changed in 9.1 in this area, at the moment the balance of
probability is user error. If you can provide a more isolated bug report
we may be able to investigate.

This is being discussed in a thread on the General list and there is no
reason to post twice.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services