Discussion: pg_archivecleanup with multiple slaves


pg_archivecleanup with multiple slaves

From: Ben Lancaster
Date:
Hi,

First post, forgive me if this is better suited to pgsql-general.

I've got streaming replication set up with two slave servers (PostgreSQL 9.0 on Ubuntu 10.04 LTS). The master pushes the WAL to an NFS export, which is in turn mounted on and picked up by the two slaves.
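
For illustration, the master's push to the NFS export is done with an archive_command of roughly this shape (the /nfs/wal path is a placeholder, not the actual mount):

# postgresql.conf on the master -- /nfs/wal is a placeholder path
archive_mode = on
archive_command = 'test ! -f /nfs/wal/%f && cp %p /nfs/wal/%f'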

The problem I have is that pg_archivecleanup (running on one of the slaves) was removing WAL logs before the other slave had picked up the changes, thus breaking replication for the second slave. As an interim fix, I simply disabled the automatic cleanup and figured I'd worry about it later.

Well, later is now and I'm running out of HDD space. So, what's the best (or perhaps, correct) way to handle cleaning up WAL archives when there's more than one slave? My first thought was prefixing the pg_archivecleanup call in recovery.conf's archive_cleanup_command with a "sleep" of a few seconds to allow both slaves to pick up changes before WAL files are cleaned up, but I'm afraid I'll end up with some weird race conditions, with loads of sleeping processes waiting to clean up WAL files that have already been cleaned up by a recently woken process.
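
For reference, the documented single-standby form of that setting looks like this (the archive path is a placeholder; %r expands to the file name of the last restartpoint):

# recovery.conf on a standby -- /nfs/wal is a placeholder path
archive_cleanup_command = 'pg_archivecleanup /nfs/wal %r'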

Thanks in advance,

Ben


Re: pg_archivecleanup with multiple slaves

From: Tim
Date:
I think you are using Log-Shipping, not Streaming-Replication:
http://www.postgresql.org/docs/9.0/static/different-replication-solutions.html

I would just make two copies of each WAL file, one for each slave, in different folders.
That way, if one slave is offline for a period of time, it can catch up when it comes back online.
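
Sketched as a hypothetical archive_command wrapper (the script name, paths, and slave directory names are all made up):

#!/bin/sh
# archive-two-copies.sh -- hypothetical wrapper, called as:
#   archive_command = '/usr/local/bin/archive-two-copies.sh %p %f'
set -e
src="$1"    # %p: path of the finished WAL segment on the master
name="$2"   # %f: bare segment file name
# One directory per slave, so each slave can run pg_archivecleanup
# against its own copy without affecting the other.
cp "$src" "/nfs/wal/slave1/$name"
cp "$src" "/nfs/wal/slave2/$name"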


Re: pg_archivecleanup with multiple slaves

From: "Kevin Grittner"
Date:
Tim wrote:

> I would just make two copies of each WAL file, one for each slave,
> in different folders. That way, if one slave is offline for a period
> of time, it can catch up when it comes back online.

If you used hard links, that wouldn't even take any extra disk
space (beyond the directory space).  I would copy to a staging
directory, hard link (cp -l) to each of the target directories,
and then delete from the staging area.

-Kevin
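
A sketch of that staging-plus-hard-links idea (all paths hypothetical; note that hard links require the staging and target directories to sit on the same filesystem):

#!/bin/sh
# archive-hardlinks.sh -- hypothetical wrapper, called as:
#   archive_command = '/usr/local/bin/archive-hardlinks.sh %p %f'
set -e
src="$1"; name="$2"
stage="/nfs/wal/staging"
cp "$src" "$stage/$name"
# Each slave directory gets its own link to the same inode, so one
# slave's pg_archivecleanup never deletes the data under the other.
ln "$stage/$name" "/nfs/wal/slave1/$name"
ln "$stage/$name" "/nfs/wal/slave2/$name"
rm "$stage/$name"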



Re: pg_archivecleanup with multiple slaves

From: Ben Lancaster
Date:
On 20 May 2011, at 12:53, Tim wrote:

> I think you are using Log-Shipping not Streaming-Replication
> http://www.postgresql.org/docs/9.0/static/different-replication-solutions.html

I'm using streaming replication with log shipping as a fallback, as per the walkthrough here:
http://brandonkonkle.com/blog/2010/oct/20/postgres-9-streaming-replication-and-django-balanc/

> I would just make two copies of each WAL file, one for each slave, in different folders.
> That way, if one slave is offline for a period of time, it can catch up when it comes back online.

...hence using a combination of the two.

For now, I've resorted to doing the following on an hourly basis:

for f in `find /srv/pg_backups -ctime +0.5`; do pg_archivecleanup /srv/pg_backups `basename $f`; done;

...that is, find anything last changed more than half a day ago and hand it to pg_archivecleanup as the cutoff, so everything older gets removed. Not particularly elegant, but it gives me plenty of time for a slave to go down and replay the logs.
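
A more defensive variant of the same idea might look like this (GNU find assumed, since this is Ubuntu; -mmin +720 matches the half-day window of -ctime +0.5 above, and only the single newest cutoff file is passed to pg_archivecleanup):

#!/bin/sh
# hourly-wal-cleanup.sh -- hypothetical cron job; run on one slave only
archive=/srv/pg_backups
# Newest segment last changed more than 12 hours ago; everything older
# than it can go. WAL file names sort lexicographically.
cutoff=$(find "$archive" -maxdepth 1 -type f -mmin +720 -printf '%f\n' | sort | tail -n 1)
if [ -n "$cutoff" ]; then
    pg_archivecleanup "$archive" "$cutoff"
fi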



Re: pg_archivecleanup with multiple slaves

From: Fujii Masao
Date:
On Fri, May 20, 2011 at 7:59 PM, Ben Lancaster
<benlancaster@holler.co.uk> wrote:
> The problem I have is that pg_archivecleanup (running on one of the slaves) was removing WAL logs before the other
> slave had picked up the changes, thus breaking replication for the second slave. As an interim fix, I simply disabled
> the automatic cleanup and figured I'd worry about it later.

I'm afraid there is no clean solution. In order to address this problem, we probably should change the master for 9.2 so that it collects the information about the cutoff point from each standby and calls pg_archivecleanup.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: pg_archivecleanup with multiple slaves

From: Tim Lewis
Date:
I don't actually use streaming replication, but what exactly is the problem with the hard-link-for-each-slave solution, with the slaves handling their own pg_archivecleanup?





Re: pg_archivecleanup with multiple slaves

From: Simon Riggs
Date:
On Fri, May 20, 2011 at 4:58 PM, Tim Lewis <Tim.Lewis@vialect.com> wrote:

> I don't actually use streaming replication, but what exactly is the problem
> with the hard-link-for-each-slave solution, with the slaves handling their
> own pg_archivecleanup?

Doing so creates a single point of failure.

We had problems with links in earlier versions, so we learned the hard
way to stay clear of them.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services