Discussion: Feature request: pg_basebackup --force
Magnus, all: It seems a bit annoying to have to do an rm -rf * $PGDATA/ before resynching a standby using pg_basebackup. This means that I still need to wrap basebackup in a shell script, instead of having it do everything for me ... especially if I have multiple tablespaces. Couldn't we have a --force option which would clear all data and tablespace directories before resynching? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
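A minimal sketch of the kind of wrapper the message above describes, assuming Python rather than a shell script. The function names, the `run` callback, and the host and paths in the usage note are hypothetical; only the `-h` and `-D` options of pg_basebackup are used. It empties the data directory and each tablespace directory, then builds the pg_basebackup command line. This is not an existing --force option.

```python
import os
import shutil


def clear_directory(path):
    """Remove everything inside `path` but keep the directory itself."""
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)   # real subdirectory: remove recursively
        else:
            os.remove(full)       # plain file or symlink: remove the entry only


def basebackup_command(host, data_dir):
    """Build the pg_basebackup invocation (only -h and -D are used here)."""
    return ["pg_basebackup", "-h", host, "-D", data_dir]


def resync(host, data_dir, tablespace_dirs, run=None):
    """Empty the data and tablespace directories, then run the base backup.

    `run` is a hypothetical hook such as subprocess.check_call; when it is
    None the command is only returned, which keeps the sketch testable.
    """
    for directory in [data_dir, *tablespace_dirs]:
        clear_directory(directory)
    command = basebackup_command(host, data_dir)
    if run is not None:
        run(command)
    return command
```

A call might look like `resync("primary.example.com", "/var/lib/pgsql/data", ["/mnt/ts1"], run=subprocess.check_call)`, where the host and paths are placeholders.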
On Sat, Apr 9, 2011 at 20:26, Joshua Berkus <josh@agliodbs.com> wrote: > Magnus, all: > > It seems a bit annoying to have to do an rm -rf * $PGDATA/ before resynching a standby using pg_basebackup. This means that I still need to wrap basebackup in a shell script, instead of having it do everything for me ... especially if I have multiple tablespaces. > > Couldn't we have a --force option which would clear all data and tablespace directories before resynching? That could certainly be useful, yes. But I have a feeling whoever tries to get that into 9.1 will be killed - but it's certainly good to put on the list of things for 9.2. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Magnus, > That could certainly be useful, yes. But I have a feeling whoever > tries to get that into 9.1 will be killed - but it's certainly good to > put on the list of things for 9.2. Oh, no question. At some point in 9.2 we should also discuss how basebackup considers "empty" directories. Because the other thing I find myself constantly scripting is replacing the conf files on the replica after the base backup sync. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
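The conf-file step mentioned above could be scripted roughly as follows. This is a sketch under assumptions: the replica's own configuration files were copied to a separate directory before the sync, and the file list and function name are illustrative, not part of pg_basebackup.

```python
import os
import shutil

# Files a replica typically keeps its own copy of (an assumed list).
CONF_FILES = ("postgresql.conf", "pg_hba.conf", "recovery.conf")


def restore_conf_files(saved_dir, data_dir, names=CONF_FILES):
    """Copy saved replica-specific conf files back over the synced ones.

    Returns the names that were actually restored; files missing from
    `saved_dir` are skipped rather than treated as an error.
    """
    restored = []
    for name in names:
        source = os.path.join(saved_dir, name)
        if os.path.isfile(source):
            shutil.copy2(source, os.path.join(data_dir, name))
            restored.append(name)
    return restored
```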
On Sat, Apr 9, 2011 at 2:26 PM, Joshua Berkus <josh@agliodbs.com> wrote: > It seems a bit annoying to have to do an rm -rf * $PGDATA/ before resynching a standby using pg_basebackup. This means that I still need to wrap basebackup in a shell script, instead of having it do everything for me ... especially if I have multiple tablespaces. > > Couldn't we have a --force option which would clear all data and tablespace directories before resynching? What would be even more useful is some kind of support for differential copy, a la rsync. (Now I'm waiting for someone to tell me this is a pipe dream.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Apr 9, 2011 at 2:26 PM, Joshua Berkus <josh@agliodbs.com> wrote: >> Couldn't we have a --force option which would clear all data and tablespace directories before resynching? > What would be even more useful is some kind of support for > differential copy, a la rsync. > (Now I'm waiting for someone to tell me this is a pipe dream.) Not so much a pipe dream as reinventing the wheel. Why not use rsync? regards, tom lane
On Sun, Apr 10, 2011 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Apr 9, 2011 at 2:26 PM, Joshua Berkus <josh@agliodbs.com> wrote: >>> Couldn't we have a --force option which would clear all data and tablespace directories before resynching? > >> What would be even more useful is some kind of support for >> differential copy, a la rsync. > >> (Now I'm waiting for someone to tell me this is a pipe dream.) > > Not so much a pipe dream as reinventing the wheel. Why not use rsync? It's not integrated and I doubt it's conveniently available on Windows. One of the biggest problems with our replication functionality right now is that it's hard to set up. We've actually done a good job making the very simplest case (one slave, no archive) reasonably simple, but how many PostgreSQL users do you think can manage to set up SR + HS + archiving, with two slaves that can use the archive if they fall too far behind the master, but with the archive regularly trimmed to the farthest-back segment that is still needed? We have pg_archivecleanup, but AIUI that's only smart enough to handle the one-standby case. Admittedly, the above is a slightly different problem, but I think it all points in the direction of needing more automation and more ease of use. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
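The multi-standby trimming problem described above comes down to keeping everything from the oldest segment that any standby still needs. The sketch below illustrates that idea under stated assumptions: all segments are on a single timeline, segment file names are the usual 24 uppercase hex characters (so string order matches WAL order), and the mapping from standby to its oldest needed segment is supplied by the caller. How to obtain that mapping is not covered, and the function names are hypothetical. The resulting name is what would be passed to pg_archivecleanup as the oldest file to keep.

```python
import re

# A plain WAL segment file name: 24 uppercase hexadecimal characters.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def oldest_needed_segment(needed_by_standby):
    """Return the farthest-back segment that any standby still needs.

    `needed_by_standby` maps a standby label to the oldest segment file
    name that standby requires (a hypothetical input for this sketch).
    """
    return min(needed_by_standby.values())


def removable_segments(archive_files, oldest_kept):
    """List archived segments strictly older than `oldest_kept`.

    Names that are not plain segment files (for example .backup history
    files) are left alone.
    """
    return sorted(
        name for name in archive_files
        if WAL_NAME.match(name) and name < oldest_kept
    )


def cleanup_command(archive_dir, oldest_kept):
    """Build a pg_archivecleanup call: archive location, oldest kept file."""
    return ["pg_archivecleanup", archive_dir, oldest_kept]
```

With two standbys that need segments ending in `12` and `0A`, the archive would be trimmed only up to the one ending in `0A`, so the standby that is further behind can still catch up from the archive.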
On Sun, Apr 10, 2011 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: > It's not integrated and I doubt it's conveniently available on Windows. > > One of the biggest problems with our replication functionality right > now is that it's hard to set up. We've actually done a good job > making the very simplest case (one slave, no archive) reasonably > simple, but how many PostgreSQL users do you think can manage to set > up SR + HS + archiving, with two slaves that can use the archive if > they fall too far behind the master, but with the archive regularly > trimmed to the farthest-back segment that is still needed? > > We have pg_archivecleanup, but AIUI that's only smart enough to handle > the one-standby case. > > Admittedly, the above is a slightly different problem, but I think it > all points in the direction of needing more automation and more ease > of use. And let me also note that the difficulty of getting this all exactly right is one of the things that causes people to come up with creative solutions like this: http://archives.postgresql.org/pgsql-hackers/2010-12/msg02514.php That's why we need to put it in a box, tie a bow around it, and put up a big sign that says "do not look into laser with remaining eye". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10.04.2011 20:06, Robert Haas wrote: > On Sun, Apr 10, 2011 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Admittedly, the above is a slightly different problem, but I think it >> all points in the direction of needing more automation and more ease >> of use. > > And let me also note that the difficulty of getting this all exactly > right is one of the things that causes people to come up with creative > solutions like this: > > http://archives.postgresql.org/pgsql-hackers/2010-12/msg02514.php > > That's why we need to put it in a box, tie a bow around it, and put up > a big sign that says "do not look into laser with remaining eye". That's exactly what pg_basebackup does. Once you move into more complicated scenarios with multiple standbys and WAL archiving, it's inevitably going to be more complicated to set up. That doesn't mean that we can't make it easier - we can and we should - but I don't think the common complaint that replication is hard to set up is true anymore. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > That's exactly what pg_basebackup does. Once you move into more > complicated scenarios with multiple standbys and WAL archiving, > it's inevitably going to be more complicated to set up. > > That doesn't mean that we can't make it easier - we can and we > should - but I don't think the common complaint that replication > is hard to set up is true anymore. Getting back to the rsync-like behavior, which is what led the conversation in this direction, I think -- the point of that seemed to be to allow similar ease of use for those activating a replicated node as the master, without requiring that the entire data directory be sent over a slow WAN or Internet path when the delta needed to modify what was already at the remote end to match the new master might be orders of magnitude less data than that. The intelligence to support that would be a fraction of what is in rsync. In fact, since we might want to ignore hint bit differences where possible, rsync might not work nearly as well as a home-grown solution. -Kevin
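To make the delta idea above concrete, here is a rough sketch of a fixed-block comparison, assuming both sides can hash a file in 8 kB blocks and exchange the digests. The function names and the choice of SHA-256 are illustrative only. It does not attempt to ignore hint bit differences, and it is far simpler than rsync's rolling-checksum algorithm; nothing here describes how pg_basebackup works.

```python
import hashlib


def block_digests(data, block_size=8192):
    """Hash `data` in fixed-size blocks and return the list of digests."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).digest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(target_digests, source_digests):
    """Return indexes of source blocks the target lacks or holds differently.

    Only these blocks would need to cross the slow link; blocks whose
    digests already match are skipped.
    """
    changed = []
    for index, digest in enumerate(source_digests):
        if index >= len(target_digests) or target_digests[index] != digest:
            changed.append(index)
    return changed
```

For a file that differs in one block and has grown by one block on the new master, only those two block indexes are reported, which is the saving over resending the whole data directory that the message argues for.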