Обсуждение: Postgresql Replication Performance

Поиск
Список
Период
Сортировка

Postgresql Replication Performance

От
sgupta
Дата:
I am doing POC on Posgtresql replication. I am using latest version of
postgresql i.e. 9.1. There are multiple replication solutions avaliable in
the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
replication solutions (Streaming replication, Warm Standby and hot standby).
I am confused which solution is best for the financial application for which
I am doing POC. The application will write around 160 million records with
row size of 2.5 KB in database. My questions is for following scenarios
which replication solution will be suitable:

If I would require replication for backup purpose only
If I would require to scale the reads
If I would require High Avaliability and Consistency
Also It will be very helpful if you can share the perfomance or experience
with postgresql replication solutions.

Thanks

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: Postgresql Replication Performance

От
Aleksej Trofimov
Дата:
On 12/29/2011 11:33 AM, sgupta wrote:
I am doing POC on Posgtresql replication. I am using latest version of
postgresql i.e. 9.1. There are multiple replication solutions avaliable in
the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
replication solutions (Streaming replication, Warm Standby and hot standby).
I am confused which solution is best for the financial application for which
I am doing POC. The application will write around 160 million records with
row size of 2.5 KB in database. My questions is for following scenarios
which replication solution will be suitable:

If I would require replication for backup purpose only
If I would require to scale the reads
If I would require High Avaliability and Consistency
Also It will be very helpful if you can share the perfomance or experience
with postgresql replication solutions.

Thanks

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-Replication-Performance-tp5107278p5107278.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Which replication solution will be suitable depends on your needs and database architecture.
Starting with PGCluster, I can say only, that PGCluster last released in 2005 year, so you can not use it with Postgres 9.0 =)
Slony-I is a good solution if you want to have cascade replication from Slave to Slave or you want to replicate only several parts of your database (because Slony performs table level like replication)
PGPool-II is an advanced load balancer and pooling solution. Which also has replication support. Pgpool-II is query based replication utility, which performs queries on several database servers. If you are looking for performance and stability I do not recommend using PGPool as replication software.
Postgres Streaming replication is WAL based replication, so using this type of replication you will have absolutely identical database servers, what is best choice for HA and scaling reads. Also this choice is not practically affecting performance, because it is not adding any latency to database layer.

Also you could read about difference between Slony and Streaming replications here http://scanningpages.wordpress.com/2010/10/09/9-0-streaming-replication-vs-slony/


-- 
Best regards

Aleksej Trofimov

Re: Postgresql Replication Performance

От
Claudio Freire
Дата:
On Thu, Dec 29, 2011 at 11:33 AM, Aleksej Trofimov
<aleksej.trofimov@ruptela.lt> wrote:
> Postgres Streaming replication is WAL based replication, so using this type
> of replication you will have absolutely identical database servers, what is
> best choice for HA and scaling reads. Also this choice is not practically
> affecting performance, because it is not adding any latency to database
> layer.

Let me chime in, because I'm in a similar situation. I'm preparing a
POC WAL-replicated environment, and testing up until now has been
inconclusive since we lack the kind of hardware in our test
environment. I know I should require it, testing on similar hardware
is the only way to get reliable results, but getting the budget
approved would take way too long, and right now we're in a hurry to
scale reads.

So getting the hardware is not an option, my option is asking those
who have the experience :-)

I gather WAL replication introduces only a few possible bottlenecks.

First, network bandwidth between master and slaves, and my app does
write a lot - our monitoring tools show, today, an average of 1MB/s
writes on the WAL array, with peaks exceeding 8MB/s, which can easily
saturate our lowly 100Mb/s links. No worries, we can upgrade to 1Gb/s
links.

Second, is that WAL activity on streaming replication or WAL shipping
is documented to contain more data than on non-replicated setups. What
is not clear is how much more data. This not only affects our network
bandwidth estimations, but also I/O load on the master server, slowing
writes (and some reads that cannot happen on the slave).

So, my question is, in your experience, how much of an increase in WAL
activity can be expected?

Re: Postgresql Replication Performance

От
Merlin Moncure
Дата:
On Thu, Dec 29, 2011 at 3:33 AM, sgupta <saurabh.b85@gmail.com> wrote:
> I am doing POC on Posgtresql replication. I am using latest version of
> postgresql i.e. 9.1. There are multiple replication solutions avaliable in
> the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built
> replication solutions (Streaming replication, Warm Standby and hot standby).
> I am confused which solution is best for the financial application for which
> I am doing POC. The application will write around 160 million records with
> row size of 2.5 KB in database. My questions is for following scenarios
> which replication solution will be suitable:
>
> If I would require replication for backup purpose only
> If I would require to scale the reads
> If I would require High Avaliability and Consistency
> Also It will be very helpful if you can share the perfomance or experience
> with postgresql replication solutions.

The built in HS/SR integrates with the postgres engine (over the WAL
system) at a very low level and is going to be generally faster and
more robust.  More importantly, it has a very low administrative
overhead -- the underlying mechanism of log shipping has been tweaked
and refined continually since PITR was released in 8.0.  Once you've
done it a few times, it's a five minute procedure to replicate a
database (not counting, heh, the base database copy).

The main disadvantage of HS/SR is inflexibility: you get an exact
replica of a database cluster.  Slony (which is a trigger based
system) and pgpool (which is statement replication) can do a lot of
funky things that hs/sr can't do -- so they definitely fill a niche
depending on what your requirements are.

merlin

Re: Postgresql Replication Performance

От
Aleksej Trofimov
Дата:
On 12/29/2011 05:00 PM, Claudio Freire wrote:
> Second, is that WAL activity on streaming replication or WAL shipping
> is documented to contain more data than on non-replicated setups. What
> is not clear is how much more data. This not only affects our network
> bandwidth estimations, but also I/O load on the master server, slowing
> writes (and some reads that cannot happen on the slave).
Our database has about 2MB/s writes on the WAL array, we had about 160
IOPS in average when replications was switched off, and 165-170 IOPS in
replication. This I think could be explained with statistical error, so
we have not experienced any I/O load on our master server since
replication was configured.


--
Best regards

Aleksej Trofimov


Re: Postgresql Replication Performance

От
Saurabh
Дата:
Thank you all for the valuable information. Now we have decide to go
with streaming replication. I did the setup on machine and it is
working good. Now I have to implement the automatic failover. Please
share a solution for the same.

Saurabh Gupta

Re: Postgresql Replication Performance

От
Aleksej Trofimov
Дата:
On 01/04/2012 12:41 PM, Saurabh wrote:
> Thank you all for the valuable information. Now we have decide to go
> with streaming replication. I did the setup on machine and it is
> working good. Now I have to implement the automatic failover. Please
> share a solution for the same.
>
> Saurabh Gupta
>
You ca use pgpool-II for automatic failover and connection cache. This
article is good enough
http://pgpool.projects.postgresql.org/contrib_docs/simple_sr_setting/index.html

Also do not forget to configure Postgres max_connections >= (pgpool)
num_init_children*max_pool if you'll use connections cache.

--
Best regards

Aleksej Trofimov