Обсуждение: hot standby PSQL 9.1 Windows 2008 Servers
Hi All, I have implemented hot standby for PostgreSQL with a group of two Primary and Standby in windows 2008 servers. Currently below are the settings: 1. Archiving is enabled on primary, stored on network storage. 2. Asynchronous Streaming replication from primary to standby. wal-senders=5, wal_keep_segments=32. 3. I am checking heartbeat of primary from standby, when it is down i am restarting service by changing postgresql.conf and recovery.conf settings. I do base backup only first time on standby when it is going to be replicated. when ever primary goes down, standby becomes primary and primary becomes standby when primary comes up. When primary becomes standby I am restoring data from WAL archive and start postgres service streaming replication to connect to primary. This setup is working. I have tested for few days in my network with huge data input to the primary and restarting the servers multiple times. I have observed these below specific errors in standby server and standby server is not replicating in anymore. *1. Invalid primary and secondary checkpoints. Very rare But happend 2. contrecord is requested by 0/DF000020 -- Major one 3. record with zero length at 0/DF000078, invalid record length at 0/DF000078* These errors are shown up when primary switching to standby by recovering data from archive and standby stops replicating. *I am not able to start the service for that I should kill all the process of postgres and then do base backup and then start service on stnadby.* I don't have any prior experience working with postgreSQL. Please tell me what are the most reliable settings for Hot standby with streaming replication to work ? two 2008 servers and 1 network storage available. How to avoid above errors ?? -- View this message in context: http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On Mon, May 14, 2012 at 8:18 AM, chinnaobi <chinnaobi@gmail.com> wrote: > I do base backup only first time on standby when it is going to be > replicated. when ever primary goes down, standby becomes primary and > primary becomes standby when primary comes up. When primary becomes standby > I am restoring data from WAL archive and start postgres service streaming > replication to connect to primary. > > This setup is working. I don't think this is safe. The primary might have WAL that never made it to the standby, in which case the two machines will be out of sync with each other and all sorts of bad stuff could happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Dear Robert, Thank you very much for the reply. You mean when the primary which is going to switch its role to standby might not have sent all the WAL records to the standby and If it is switched to standby it has more WAL records than the standby which is now serves as primary. Is it ?? It is actually the standby server which has to be restored from archive when it is switching to primary right .. Not the primary which is switching to standby ?? Regards, Reddy. -- View this message in context: http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637p5709495.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
chinnaobi wrote: > You mean when the primary which is going to switch its role to > standby might not have sent all the WAL records to the standby and > If it is switched to standby it has more WAL records than the > standby which is now serves as primary. Is it ?? What happens when there is a network fault between the primary and the standby, but not between the primary and some of the clients updating it? Similarly, if this is asynchronous replication, what if there have been commits on the primary which were still in the network buffer when the primary crashed? Clean automated failover is not a trivial task. If you are writing your own, it would be best to follow the steps recommended in the documentation rather closely. -Kevin
Dear Kevin, Thank you for your reply. Yeah I am writing an application using powershell, it's true it is not trivial and especially a guy like me who has no idea on database. You raised all the cases which I am muddling with, But currently I am testing this setup: Always standby server is configured from base backup and restore from storage server then start streaming replication(asynchronous). Base backups are taken frequently. I am sure there is some data loss during switching. Still researching how to do it clean. suggest me if you have any good papers on this .. Reddy. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Re-hot-standby-PSQL-9-1-Windows-2008-Servers-tp5710824p5710830.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Sorry to mention, In my setup the primary and standby servers receive same traffic, so no issue with the "network fault between the primary and the standby, but not between the primary and some of the clients updating it" -- View this message in context: http://postgresql.1045698.n5.nabble.com/Re-hot-standby-PSQL-9-1-Windows-2008-Servers-tp5710824p5710832.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On Tue, May 22, 2012 at 12:15 PM, chinnaobi <chinnaobi@gmail.com> wrote: > You mean when the primary which is going to switch its role to standby might > not have sent all the WAL records to the standby and If it is switched to > standby it has more WAL records than the standby which is now serves as > primary. Is it ?? Yes, that is possible. Or the standby might have received all the WAL records but not be caught up in terms of replaying them. > It is actually the standby server which has to be restored from archive when > it is switching to primary right .. Not the primary which is switching to > standby ?? If you want to promote a standby, you can just do it (pg_ctl promote).If you have a master that you want to demote to a standby,you've got to resync it to whatever the current master is. I understand repmgr has some tooling to help automate that, although I have not played with it myself. In any event rsync can be a big help in reducing the resync time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company