Discussion: Streaming Replication (Master Delta Sync)
Hi,

We are using streaming replication for c++ Postgres-9.1 server.

We have set up 1 master and 1 slave, and streaming replication is working fine.

On checking the xlog status on both master and slave, we observed that the master is quite far ahead of the slave all the time. Now we have 2 questions:

Que 1) When the master goes down, there is some delta which has not been applied to the standby server (which is now my new master), and the trigger has happened. How should I sync that delta to the new master (the old standby server)?

Que 2) How can I further reduce the gap between the master and the standby? I tried increasing max_wal_senders to 5, but I can see only 1 sender process running.

Regards,
Parkirat Singh Bagga.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Streaming-Replication-Master-Delta-Sync-tp5728704.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.
Also, does the smart or fast failover mode, chosen by the content of the trigger file, apply to streaming replication as well?
On Wed, Oct 17, 2012 at 1:34 PM, Parkirat Bagga <parkiratbagga@gmail.com> wrote:
> Hi
>
> We are using streaming replication for c++ Postgres-9.1 server.
>
> We have setup 1 master and 1 slave and streaming replication is working
> fine.
>
> On checking the x-log status for both master and slave, we observed that the
> master is quite ahead of the slave all the time. Now we have 2 questions
>
> Que 1) When Master goes down there is some delta which is not applied to the
> standby server (which is now, my new master) and the trigger has happened.
> How should I sync that delta part to the new Master (older standby server)?

rsync of the WAL files?

> Que 2) How can I further reduce the gap between the Master and Standby? I
> tried increasing the max_wal_sender=5 number but I can see only 1 process
> running for sender.

As far as I'm aware, a standby can only connect to and use one WAL sender process at any time, so increasing max_wal_senders isn't going to speed anything up; it only makes it possible to attach additional standby servers.

You need to determine why there is lag, as normally there should not be any at all. Are both master & standby identical HW? Is the standby running anything other than postgresql?
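To put a number on the lag discussed above, one option is to compare the output of pg_current_xlog_location() on the master with pg_last_xlog_receive_location() or pg_last_xlog_replay_location() on the standby. The Python sketch below converts two such "X/Y" location strings into a byte distance. The helper names are hypothetical, and the 0xFF000000 bytes-per-log-id factor is an assumption about the pre-9.3 WAL layout (255 segments of 16 MB per log id), so treat the result as an estimate rather than an exact figure.

```python
# Assumption: on PostgreSQL releases before 9.3, each xlog id covers
# 255 segments of 16 MB, i.e. 0xFF000000 bytes.
PRE_93_BYTES_PER_LOGID = 0xFF000000


def lsn_to_bytes(lsn, bytes_per_logid=PRE_93_BYTES_PER_LOGID):
    """Convert an 'X/Y' xlog location string to an absolute byte position."""
    logid_hex, offset_hex = lsn.strip().split("/")
    return int(logid_hex, 16) * bytes_per_logid + int(offset_hex, 16)


def lag_bytes(master_lsn, standby_lsn, bytes_per_logid=PRE_93_BYTES_PER_LOGID):
    """Bytes of WAL the standby still has to receive (or replay)."""
    return (lsn_to_bytes(master_lsn, bytes_per_logid)
            - lsn_to_bytes(standby_lsn, bytes_per_logid))


if __name__ == "__main__":
    # Hypothetical values, as they might be copied from psql on each server.
    master = "1/0"
    standby = "0/FE000000"
    print("standby is behind by", lag_bytes(master, standby), "bytes")
```

Sampling this every few seconds shows whether the gap is steady (a constant network delay) or growing (the link or the standby cannot keep up with the change rate).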
Thanks Netllama for the reply.

Does rsync sync the partial logs as well? As I would be doing the rsync from the old master to the new master (when the old master recovers), there might be some partial logs present on the old master.

The delay is there because both machines are in the cloud, in different regions. I was thinking of using multiple connections so that the transfer between the 2 regions might become faster.

Also, I have another question: does the trigger file content being smart or fast matter in streaming replication?

Regards,
Parkirat Singh Bagga
On Wed, Oct 17, 2012 at 2:01 PM, Parkirat Bagga <parkiratbagga@gmail.com> wrote:
> Thanks Netllama for the reply.
>
> Does rsync sync the partial logs as well. As I would be doing the rsync
> from old master to new master (when the old master recovers), there might be
> some partial logs present in the old master?

I don't think that's how WAL works. The log is either complete, or it doesn't exist. Anyway, rsync will sync whatever exists when invoked. I'm sure that it's got some crazy options to exclude files based on certain criteria if you wanted to investigate them.

> Delay is there because both the machines are on cloud and they are in
> different regions. I was thinking in terms of having multiple connections so
> that the transfer between 2 regions may become faster.

So if you knew what was causing the delay, why are you asking how to fix it? Regardless, are you saying that you've got a slow link between the master & standby, and the latency is what is causing the delay?
Thanks for the clarifications.

For the slow link, we don't have much choice, as the data transfer happens over the internet. I was trying to figure out ways to reduce the gap as much as possible. I had read somewhere that max_wal_senders should be set to a sufficient value, and I thought I could increase the data transfer rate by increasing the number of WAL senders. That is why I asked that question.

Thanks again.

Regards,
Parkirat Singh Bagga.
Lonni J Friedman wrote:
>> Does rsync sync the partial logs as well. As I would be doing the rsync
>> from old master to new master (when the old master recovers), there might be
>> some partial logs present in the old master?
>
> I don't think that's how WAL works. The log is either complete, or it
> doesn't exist. Anyway, rsync will sync whatever exists when invoked.
> I'm sure that its got some crazy options to exclude files based on
> certain criteria if you wanted to investigate them.

It's not true that a WAL is either complete or nonexistent.

We should distinguish between active WALs (in pg_xlog) and archived WALs. Active WALs are not necessarily complete - there is always one that is being written to and hence incomplete. Archived WALs are always complete, but unless the postmaster on the primary is down, it could be that rsync copies a partial WAL archive that is just being written.

Still the advice to rsync WALs is right. That will allow the standby to catch up as much as possible. An incomplete active WAL is no problem, it will be recovered as far as possible.

>> Delay is there because both the machines are on cloud and they are in
>> different regions. I was thinking in terms of having multiple connections so
>> that the transfer between 2 regions may become faster.

There is always only one connection.

Is the change rate on the database high?

A silly question: are the clocks on primary and standby fairly in sync, or may it be time skew between those machines that you observe?

Yours,
Laurenz Albe
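As a rough illustration of the point about partially copied archive files: after an rsync of archived WALs, a file that is shorter than a full segment was probably caught mid-write. The Python sketch below sorts the files in a directory into full-size and short segments. It assumes the default 16 MB segment size and the usual 24-hex-digit segment file names; the function name and directory argument are hypothetical. It is only meaningful for an archive copy, since files in pg_xlog are preallocated at full size whether or not they are completely written.

```python
import os
import re

# Assumption: WAL segment files are named with 24 uppercase hex digits.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")
# Assumption: the server uses the default 16 MB WAL segment size.
DEFAULT_SEGMENT_SIZE = 16 * 1024 * 1024


def split_wal_files(directory, segment_size=DEFAULT_SEGMENT_SIZE):
    """Return (complete, partial) lists of WAL segment file names.

    A file counts as complete when its size equals segment_size;
    anything else is treated as a partially copied segment.
    """
    complete, partial = [], []
    for name in sorted(os.listdir(directory)):
        if not WAL_NAME.match(name):
            continue  # skip .history, .backup and unrelated files
        size = os.path.getsize(os.path.join(directory, name))
        if size == segment_size:
            complete.append(name)
        else:
            partial.append(name)
    return complete, partial
```

Files reported as partial can simply be copied again once the primary has finished archiving them, or once the postmaster on the primary is down.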
Thanks Albe.

Regarding the delay between the 2 machines: data is being inserted at 8K tps and each record is around 1KB, so a total of about 8 MB/s of data is getting inserted into the master PG.

The slave is not in a different region but in a different availability zone (same region) on EC2. Sorry for the typo above. We are using x1.large machines and PG-9.1 for both master and slave.

What I have observed is that there is always only one sender process sending the data. Is there any configuration with which I can optimize this system so that the standby is more current, with less overhead on the master? We are not concerned with long-running queries.

Regards,
Parkirat Singh Bagga.
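The figures above translate into a network requirement with a little arithmetic. The Python sketch below is only an estimate: the function name and the wal_overhead parameter are hypothetical, and the real WAL volume is normally larger than the raw row data (index updates, full-page writes), so 1.0 is a lower bound.

```python
def required_mbit_per_s(tps, bytes_per_row, wal_overhead=1.0):
    """Estimate the sustained link bandwidth needed to ship the changes.

    wal_overhead is a hypothetical multiplier for how much larger the
    generated WAL is than the raw row data.
    """
    bytes_per_s = tps * bytes_per_row * wal_overhead
    return bytes_per_s * 8 / 1_000_000  # bytes/s -> megabits/s


if __name__ == "__main__":
    # 8K inserts per second at roughly 1 KB per record, as stated above.
    print(required_mbit_per_s(8000, 1024), "Mbit/s at minimum")
```

If the sustained throughput between the two availability zones is below this figure, the standby falls behind no matter how the senders are configured.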
Parkirat Bagga wrote:
> What, I have observed that there is always only one sender process sending
> the data. Is it possible with any configuration that I can optimize this
> system for more current and less overhead on master. We are not thinking
> in-terms of long running queries.

Replication cannot be parallelized.

If the network is the bottleneck, do you think that two connections would be faster than one?

Yours,
Laurenz Albe