Discussion: Performance impact of log streaming replication


Performance impact of log streaming replication

From: Andy
Date:
What is the expected performance impact of the log streaming replication in 9.0?

In the past I've used the log shipping replication of MySQL and it caused the performance of the master to drop by
almost 50%. Just wondered if PostgreSQL's replication is expected to behave similarly.




Re: Performance impact of log streaming replication

From: Craig Ringer
Date:
On 21/04/2010 5:15 AM, Andy wrote:
> What is the expected performance impact of the log streaming replication in 9.0?
>
> In the past I've used the log shipping replication of MySQL and it caused the performance of the master to drop by
> almost 50%. Just wondered if PostgreSQL's replication is expected to behave similarly.

Pg already writes the logs required as part of normal operation. Most
likely that performance drop with MySQL was due to the extra disk I/O of
logging activity, which you shouldn't see with Pg.

All streaming replication should add, in performance terms, is some
network I/O and the need to keep those logs around until the slave has
received them.
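For reference, a minimal master-side configuration sketch for the
9.0-era streaming replication being discussed; the parameter values
and the standby's address are illustrative assumptions, not tuning
advice:

```ini
# postgresql.conf on the master (PostgreSQL 9.0-era settings;
# values here are illustrative, not recommendations)
wal_level = hot_standby       # log enough detail for a standby to replay
max_wal_senders = 3           # allow a few streaming connections
wal_keep_segments = 32        # retain WAL until the slave has received it

# pg_hba.conf: permit the standby to connect for replication
# (user name and address are hypothetical)
# host  replication  repuser  192.168.1.10/32  md5
```

The `wal_keep_segments` line is the "keep those logs around" cost
mentioned above: it trades some disk space on the master for the
slave's ability to catch up after a brief disconnect.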

--
Craig Ringer

Re: Performance impact of log streaming replication

From: Greg Smith
Date:
Andy wrote:
> What is the expected performance impact of the log streaming replication in 9.0?
> In the past I've used the log shipping replication of MySQL and it caused the performance of the master to drop by
> almost 50%. Just wondered if PostgreSQL's replication is expected to behave similarly.
>

It should only be in the low single-digit percentages, except in some
unusual cases--for example, there's an optimization for creating a new
table and populating it all in one transaction that has to be disabled
when replication is turned on, so that particular operation can be much
slower.
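To make that concrete, here is a sketch of the load pattern in
question (the table name and file path are hypothetical):

```sql
BEGIN;
-- The table is created in the same transaction that fills it.
CREATE TABLE bulk_load (id integer, payload text);
-- Without replication (wal_level = minimal), PostgreSQL can skip
-- WAL-logging each row and just sync the table file at commit,
-- because a crash mid-load simply drops the whole new table.
COPY bulk_load FROM '/tmp/data.csv';
COMMIT;
-- With replication enabled, every row must be written to WAL so the
-- standby can replay it, which is why this particular pattern gets
-- noticeably slower.
```

The rest of a typical workload doesn't change this way, which is why
the overall overhead stays small.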

MySQL replication works by shipping a binary log of statements around,
and that's claimed to have a similar overhead of about 1%:
http://dev.mysql.com/doc/refman/5.0/en/binary-log.html  However, its
performance is sensitive to whether that logging is going to a
high-performance disk or not.  It's common for people to throw those
logs onto network shares and the like, which can cripple the master if
done badly.

The built-in replication in PostgreSQL saves log files of disk block
changes instead, ones that are already being created by the database
anyway for crash recovery.  The only additional overhead beyond standard
operation is copying those files somewhere else--you're always paying
most of the logging overhead all the time in standard, unreplicated
PostgreSQL.  The whole thing is quite fast and robust, without any weird
limitations like those listed at
http://dev.mysql.com/doc/refman/5.0/en/replication-features.html

The main downside of the approach taken in PostgreSQL compared to what
MySQL does is that the slaves are not as decoupled from the master in
Postgres, which still makes it harder to get scale-out replication
going.  You can certainly do it right now; it's just harder to set up
than most people would like.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us