Обсуждение: Postgresql Replication Performance
I am doing POC on Posgtresql replication. I am using latest version of postgresql i.e. 9.1. There are multiple replication solutions avaliable in the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built replication solutions (Streaming replication, Warm Standby and hot standby). I am confused which solution is best for the financial application for which I am doing POC. The application will write around 160 million records with row size of 2.5 KB in database. My questions is for following scenarios which replication solution will be suitable: If I would require replication for backup purpose only If I would require to scale the reads If I would require High Avaliability and Consistency Also It will be very helpful if you can share the perfomance or experience with postgresql replication solutions. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
On 12/29/2011 11:33 AM, sgupta wrote:
Which replication solution will be suitable depends on your needs and database architecture.
Starting with PGCluster, I can say only, that PGCluster last released in 2005 year, so you can not use it with Postgres 9.0 =)
Slony-I is a good solution if you want to have cascade replication from Slave to Slave or you want to replicate only several parts of your database (because Slony performs table level like replication)
PGPool-II is an advanced load balancer and pooling solution. Which also has replication support. Pgpool-II is query based replication utility, which performs queries on several database servers. If you are looking for performance and stability I do not recommend using PGPool as replication software.
Postgres Streaming replication is WAL based replication, so using this type of replication you will have absolutely identical database servers, what is best choice for HA and scaling reads. Also this choice is not practically affecting performance, because it is not adding any latency to database layer.
Also you could read about difference between Slony and Streaming replications here http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/
I am doing POC on Posgtresql replication. I am using latest version of postgresql i.e. 9.1. There are multiple replication solutions avaliable in the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built replication solutions (Streaming replication, Warm Standby and hot standby). I am confused which solution is best for the financial application for which I am doing POC. The application will write around 160 million records with row size of 2.5 KB in database. My questions is for following scenarios which replication solution will be suitable: If I would require replication for backup purpose only If I would require to scale the reads If I would require High Avaliability and Consistency Also It will be very helpful if you can share the perfomance or experience with postgresql replication solutions. Thanks -- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
Which replication solution will be suitable depends on your needs and database architecture.
Starting with PGCluster, I can say only, that PGCluster last released in 2005 year, so you can not use it with Postgres 9.0 =)
Slony-I is a good solution if you want to have cascade replication from Slave to Slave or you want to replicate only several parts of your database (because Slony performs table level like replication)
PGPool-II is an advanced load balancer and pooling solution. Which also has replication support. Pgpool-II is query based replication utility, which performs queries on several database servers. If you are looking for performance and stability I do not recommend using PGPool as replication software.
Postgres Streaming replication is WAL based replication, so using this type of replication you will have absolutely identical database servers, what is best choice for HA and scaling reads. Also this choice is not practically affecting performance, because it is not adding any latency to database layer.
Also you could read about difference between Slony and Streaming replications here http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/
-- Best regards Aleksej Trofimov
On Thu, Dec 29, 2011 at 11:33 AM, Aleksej Trofimov <aleksej.trofimov@ruptela.lt> wrote: > Postgres Streaming replication is WAL based replication, so using this type > of replication you will have absolutely identical database servers, what is > best choice for HA and scaling reads. Also this choice is not practically > affecting performance, because it is not adding any latency to database > layer. Let me chime in, because I'm in a similar situation. I'm preparing a POC WAL-replicated environment, and testing up until now has been inconclusive since we lack the kind of hardware in our test environment. I know I should require it, testing on similar hardware is the only way to get reliable results, but getting the budget approved would take way too long, and right now we're in a hurry to scale reads. So getting the hardware is not an option, my option is asking those who have the experience :-) I gather WAL replication introduces only a few possible bottlenecks. First, network bandwidth between master and slaves, and my app does write a lot - our monitoring tools show, today, an average of 1MB/s writes on the WAL array, with peaks exceeding 8MB/s, which can easily saturate our lowly 100Mb/s links. No worries, we can upgrade to 1Gb/s links. Second, is that WAL activity on streaming replication or WAL shipping is documented to contain more data than on non-replicated setups. What is not clear is how much more data. This not only affects our network bandwidth estimations, but also I/O load on the master server, slowing writes (and some reads that cannot happen on the slave). So, my question is, in your experience, how much of an increase in WAL activity can be expected?
On Thu, Dec 29, 2011 at 3:33 AM, sgupta <saurabh.b85@gmail.com> wrote: > I am doing POC on Posgtresql replication. I am using latest version of > postgresql i.e. 9.1. There are multiple replication solutions avaliable in > the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built > replication solutions (Streaming replication, Warm Standby and hot standby). > I am confused which solution is best for the financial application for which > I am doing POC. The application will write around 160 million records with > row size of 2.5 KB in database. My questions is for following scenarios > which replication solution will be suitable: > > If I would require replication for backup purpose only > If I would require to scale the reads > If I would require High Avaliability and Consistency > Also It will be very helpful if you can share the perfomance or experience > with postgresql replication solutions. The built in HS/SR integrates with the postgres engine (over the WAL system) at a very low level and is going to be generally faster and more robust. More importantly, it has a very low administrative overhead -- the underlying mechanism of log shipping has been tweaked and refined continually since PITR was released in 8.0. Once you've done it a few times, it's a five minute procedure to replicate a database (not counting, heh, the base database copy). The main disadvantage of HS/SR is inflexibility: you get an exact replica of a database cluster. Slony (which is a trigger based system) and pgpool (which is statement replication) can do a lot of funky things that hs/sr can't do -- so they definitely fill a niche depending on what your requirements are. merlin
On 12/29/2011 05:00 PM, Claudio Freire wrote: > Second, is that WAL activity on streaming replication or WAL shipping > is documented to contain more data than on non-replicated setups. What > is not clear is how much more data. This not only affects our network > bandwidth estimations, but also I/O load on the master server, slowing > writes (and some reads that cannot happen on the slave). Our database has about 2MB/s writes on the WAL array, we had about 160 IOPS in average when replications was switched off, and 165-170 IOPS in replication. This I think could be explained with statistical error, so we have not experienced any I/O load on our master server since replication was configured. -- Best regards Aleksej Trofimov
Thank you all for the valuable information. Now we have decide to go with streaming replication. I did the setup on machine and it is working good. Now I have to implement the automatic failover. Please share a solution for the same. Saurabh Gupta
On 01/04/2012 12:41 PM, Saurabh wrote: > Thank you all for the valuable information. Now we have decide to go > with streaming replication. I did the setup on machine and it is > working good. Now I have to implement the automatic failover. Please > share a solution for the same. > > Saurabh Gupta > You ca use pgpool-II for automatic failover and connection cache. This article is good enough http://pgpool.projects.postgresql.org/contrib_docs/simple_sr_setting/index.html Also do not forget to configure Postgres max_connections >= (pgpool) num_init_children*max_pool if you'll use connections cache. -- Best regards Aleksej Trofimov