Обсуждение: 9.1 Streaming Replication

Поиск
Список
Период
Сортировка

9.1 Streaming Replication

От
Tony Nelson
Дата:
I believe I have successfully setup streaming replication but I have a couple of nagging questions.

If I reboot one of the streaming targets, I see that it syncs right back up with the primary and I see new data appear
inthe read only copy. 

I distribute wal segments to the targets using rsync.  One target is in the same rack as the primary, the other is a
coupleof states aways.  That seems to be working just fine.  What I can't seem to figure out is what they are used for?
The target servers seem to connect to the primary and sync from that, not from the wal logs.  Are they there in case of
anextended disconnect? 

Also, when my target server comes up it always seems to complain that it can't find a wal archive.  It's always the one
thatis currently be written on the primary, that hasn't been archived yet.  Is that normal? 

For example, I cleared out the log file on the remote target and rebooted it.  This is what it logged at startup:

tnelson@nj-dr1:~$ cat /var/log/postgresql/postgresql-9.1-main.log
2012-12-13 00:14:19 EST LOG:  database system was shut down in recovery at 2012-12-13 00:11:51 EST
2012-12-13 00:14:20 EST LOG:  entering standby mode
cp: cannot stat `/var/lib/postgresql/wal_archive/000000010000000100000098': No such file or directory
2012-12-13 00:14:20 EST LOG:  redo starts at 1/98524320
2012-12-13 00:14:20 EST LOG:  consistent recovery state reached at 1/9870DA38
2012-12-13 00:14:20 EST LOG:  invalid magic number 0000 in log file 1, segment 152, offset 7397376
2012-12-13 00:14:20 EST LOG:  database system is ready to accept read only connections
cp: cannot stat `/var/lib/postgresql/wal_archive/000000010000000100000098': No such file or directory
2012-12-13 00:14:22 EST FATAL:  could not connect to the primary server: could not connect to server: No route to host
                Is the server running on host "192.168.39.41" and accepting
                TCP/IP connections on port 5432?

cp: cannot stat `/var/lib/postgresql/wal_archive/000000010000000100000098': No such file or directory
cp: cannot stat `/var/lib/postgresql/wal_archive/000000010000000100000098': No such file or directory
2012-12-13 00:14:25 EST LOG:  streaming replication successfully connected to primary


The log file 0098 is currently being written on the primary server.  When it finishes, my archive command will copy it
toa local directory on the primary and the use rsync to distribute it to my 2 target servers. 

The TCP connection error I hadn't seen before, but it seems like that was transient:

On the target..

tnelson@nj-dr1:~$ ps -ef | grep receive
postgres  1158   915  0 00:14 ?        00:00:00 postgres: wal receiver process   streaming 1/98FE4000

On the primary..

root@ihdb1:/var/lib/postgresql# ps -ef | grep send
postgres  9742  9480  0 Dec12 ?        00:00:09 postgres: wal sender process postgres 192.168.39.42(37235) streaming
1/98FFA560
postgres 11126  9480  0 00:14 ?        00:00:01 postgres: wal sender process postgres 192.168.6.5(50685) streaming
1/98FFA560

The id's don't match now because the web app is running it's nightly jobs which will keep the database busy for a few
hours,but everything certainly seems to be working fine. 

Has anyone implemented a nagios monitor for these processes?

Thanks in advance for any help.

Tony Nelson



Since 1982, Starpoint Solutions has been a trusted source of human capital and solutions. We are committed to our
clients,employees, environment, community and social concerns.  We foster an inclusive culture based on trust, respect,
honestyand solid performance. Learn more about Starpoint and our social responsibility at
http://www.starpoint.com/social_responsibility

This email message from Starpoint Solutions LLC is for the sole use of  the intended recipient(s) and may contain
confidentialand privileged  information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If
youare not the intended recipient, please contact the sender by reply email and destroy all copies of the original
message. Opinions, conclusions and other information in this message that do not relate to the official business of
StarpointSolutions shall be understood as neither given nor endorsed by it. 


Re: 9.1 Streaming Replication

От
Devrim GÜNDÜZ
Дата:
Hi,

On Thu, 2012-12-13 at 00:21 -0500, Tony Nelson wrote:

> I distribute wal segments to the targets using rsync.  One target is
> in the same rack as the primary, the other is a couple of states
> aways.  That seems to be working just fine.  What I can't seem to
> figure out is what they are used for?  The target servers seem to
> connect to the primary and sync from that, not from the wal logs.  Are
> they there in case of an extended disconnect?

When standby goes offline for a while (like you restart it), SR slave
tries to catch master first by replaying xlogs that have been
transferred to it, and then connects to walsender to get the new data.
That's why you need to transfer WAL segments.

> Also, when my target server comes up it always seems to complain that
> it can't find a wal archive.  It's always the one that is currently be
> written on the primary, that hasn't been archived yet.  Is that
> normal?

When standby comes up, it tries to read every not-applied WAL archive
first (as I wrote above), while looking for the next one to apply. If
next one is not available, then it tries to connect to walsender on
master node, and hopefully comes in sync with master.

That said, you may want to take a look at wal_keep_segments parameter,
if you haven't done already:

http://www.postgresql.org/docs/devel/static/runtime-config-replication.html

Regards,

<snip>

> Has anyone implemented a nagios monitor for these processes?

IIRC check_postgres (http://bucardo.org/wiki/Check_postgres) has a
monitoring stuff for this.

-HTH.

Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz

Вложения