Thread: Clarification on start/stop backup


Clarification on start/stop backup

From:
Scott Whitney
Date:
I'll be moving to PG9 (hopefully soon...probably 6 weeks).

At that time, I'll be setting up hot-standby with streaming replication to 2 sites. Off-siting my pgdumps nightly is no
longer going to be possible in the very near future, due to the size of the dumps.

So...what I had planned to do was setup my production 9.x, setup my streaming standby (same network) 9.x and setup my
disaster off-site (here at the office) also 9.x. Each one will do pg_dump at some point (nightly, probably) to ensure
that I've got actual backup files available at each location. Yes, they'll be possibly-inconsistent, but only with one
another, and that's a very minor issue for the dump files.

Now, when I do the directory rsync/tar (in this case tar), I can bring it up pretty quickly on the standby that is
there at the data center. However, of course, I need to also set it up here at my office. Which amounts to me driving
back to the office, copying it over, and starting up PG (assuming I don't get interrupted 20 times walking in the door).

So...something like this:

SELECT pg_start_backup('label')
tar off my pg directory
SELECT pg_stop_backup()

My question is this:

Can I do stop_backup after I've tgzed to an external hard drive or do I have to wait to do stop_backup until both
slaves are actually online?

I _think_ that I'm merely telling the source db server that "I have my possibly-inconsistent file system backup, you
can go back to what you were doing," and then when the slave(s) come up, they start replaying the WAL files until they
catch up, then use network communication to stay in sync.

Is that a correct understanding of the process?


Re: Clarification on start/stop backup

From:
"Albe Laurenz"
Date:
Scott Whitney wrote:
> I'll be moving to PG9 (hopefully soon...probably 6 weeks).
> 
> At that time, I'll be setting up hot-standby with streaming replication to 2 sites. Off-siting my
> pgdumps nightly is no longer going to be possible in the very near future, due to the size of the
> dumps.
> 
> So...what I had planned to do was setup my production 9.x, setup my streaming standby (same network)
> 9.x and setup my disaster off-site (here at the office) also 9.x. Each one will do pg_dump at some
> point (nightly, probably) to ensure that I've got actual backup files available at each location. Yes,
> they'll be possibly-inconsistent, but only with one another, and that's a very minor issue for the
> dump files.
> 
> Now, when I do the directory rsync/tar (in this case tar), I can bring it up pretty quickly on the
> standby that is there at the data center. However, of course, I need to also set it up here at my
> office. Which amounts to me driving back to the office, copying it over, and starting up PG (assuming
> I don't get interrupted 20 times walking in the door).
> 
> So...something like this:
> 
> SELECT pg_start_backup('label')
> tar off my pg directory
> SELECT pg_stop_backup()
> 
> My question is this:
> 
> Can I do stop_backup after I've tgzed to an external hard drive or do I have to wait to do stop_backup
> until both slaves are actually online?
> 
> I _think_ that I'm merely telling the source db server that "I have my possibly-inconsistent file
> system backup, you can go back to what you were doing," and then when the slave(s) come up, they start
> replaying the WAL files until they catch up then use network communication to stay in sync.
> 
> Is that a correct understanding of the process?

Roughly, yes.

You can run pg_stop_backup() as soon as your "tar" command is done.

You will need all archived WAL files to be copied over to the standby
machine as soon as they are written.  They are necessary for the standby
to catch up to the master.
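
One common way to do that is an archive_command on the master paired
with a restore_command on the standby. A rough sketch (the "standby"
hostname and the paths are placeholders, not anything from your setup):

# postgresql.conf on the master
archive_mode    = on
archive_command = 'rsync -a %p standby:/path_to/archive/%f'

# recovery.conf on the standby
restore_command = 'cp /path_to/archive/%f "%p"'

rsync returns nonzero on failure, so a segment only counts as archived
once it has actually reached the standby.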

Yours,
Laurenz Albe

Migrating to PG 9.2 (LONG, SORRY)

From:
Scott Whitney
Date:
Hello, everyone. I've brought this up before, but it takes us quite a bit of time to orchestrate this on our back end
for a variety of very good reasons.

Now, we're going to be migrating to PG 9.2 in the near future, and I've been asked to vet my plan to ensure that I'm
not talking crazy talk (or, you know, to find out that I am, and ensure that I truly understand what I think I
understand before I screw over our infrastructure...).

Background:
Single PG server hosts several hundred databases on a single cluster. Performance was hideous until I realized that
autovacuum naptime was too low. Since I turned that up (from 3 seconds, 3 workers on a 16-proc box hosting hundreds of
databases), performance has increased drastically, to say the least. We're running 8.4.4.

There's a 2nd PG server (for demo and training purposes) which has nothing to do with the above, logically speaking.

Hopeful Future:
Single production PG server on v9.2.x (latest 9.2) with replication enabled. That 2nd PG server I was talking about has
2 separate postmasters on it. The one on 5432 will be replicating from the production server. The one on the other port
is serving up the demo/training data, and I don't care to replicate that. My reasoning on port 5432 for the replication
stuff is that if the production server goes down, I merely point to the promoted replicant, and all my configurations
looking for 5432 do not need to be changed.
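
(For what it's worth, the promotion itself is simple: either create the
trigger file named by trigger_file in recovery.conf, or on 9.1+ use
pg_ctl. The path below is just the placeholder from my steps:

$ touch /path_to/trigger
# or, equivalently, on 9.1+:
$ pg_ctl promote -D $PGDATA

After that, the old standby starts accepting writes on 5432.)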

I do this via steps listed below in "Initial replication."

Ok. I now have 2 PG servers replicating my production data at my data center. For one further step, I need to replicate
offsite for disaster purposes. Basically, repeat the replication steps for the first set. Specifically, I do this via
steps listed in "Offsite replication".

Now I have a happy replicated environment which allows me to do individual pg_dump on each server without having to
move many gigabytes of PG dump files offsite for disaster purposes.


QUESTIONS:
Do the steps below (specifically the cascading part) look accurate?
Anyone have recommendations for companies you would use to pay for additional vetting?

Thanks in advance.

Scott Whitney


PS: I have written a multi-proc script (in Python, Linux-specific at the moment) that you can use to pg_dump and
restore said dumps. If anyone's interested, contact me directly. It drastically cuts down the time it takes pg to back
up my cluster.
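
The script itself isn't attached here, but the basic idea can be
sketched in shell: dump each database in its own process, a few at a
time. This is only an illustration of the approach, not the actual
script (the output path and worker count are made up):

# dump every non-template database, at most 4 dumps in parallel
$ psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate" \
    | xargs -P4 -I{} pg_dump -Fc -f /backups/{}.dump {}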



Initial Replication:

1) Add to pg_hba.conf on master:
host  replication  postgres  my IP/32  trust

2) Configure in postgresql.conf on master:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = SOME NUMBER I HAVE TO DETERMINE BASED ON HOW LONG THE BACKUP ETC TAKES.

# Maybe do this if the above is not high enough
#archive_mode    = on
#archive_command = 'cp %p /path_to/archive/%f'

3) Add to postgresql.conf on standby:
hot_standby = on

4) Create a recovery.conf on the standby (in $PGDATA):
standby_mode          = 'on'
primary_conninfo      = 'host=master IP port=5432 user=postgres'

# Set this up if I want auto-failover (and, of course, setup something to create that file)
#trigger_file = '/path_to/trigger'

# Set this up if I setup the archive stuff above.
#restore_command = 'cp /path_to/archive/%f "%p"'

5) Do my backup from the master:
$ psql -c "SELECT pg_start_backup('label', true)"
$ rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid
$ psql -c "SELECT pg_stop_backup()"

6) Start pg on the standby and watch replication goodness.
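
To actually see that goodness, a couple of standard checks (these are
stock 9.1+ functions/views, nothing specific to my setup):

-- on the master: one row per connected standby
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;

-- on the standby: should return true while it's replicating
SELECT pg_is_in_recovery();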




Offsite replication:

1) Add to pg_hba.conf on cascading standby at data center:
host  replication  postgres  IP of offsite server/32  trust

2) Add to postgresql.conf on cascading standby at data center:
wal_level = hot_standby
max_wal_senders = 5
wal_keep_segments = SOME NUMBER I HAVE TO DETERMINE BASED ON HOW LONG THE BACKUP ETC TAKES.

# Maybe do this if the above is not high enough
#archive_mode    = on
#archive_command = 'cp %p /path_to/archive/%f'

3) Add to postgresql.conf on offsite standby:
hot_standby = on

4) Create a recovery.conf on the offsite standby (in $PGDATA):
standby_mode          = 'on'
primary_conninfo      = 'host=data center STANDBY IP port=5432 user=postgres'

# Set this up if I setup the archive stuff above.
#restore_command = 'cp /path_to/archive/%f "%p"'

5) Do my backup from the STANDBY AT THE DATA CENTER:
$ psql -c "SELECT pg_start_backup('label', true)"
$ rsync -a ${PGDATA}/ /dev/myUSBstick/ --exclude postmaster.pid
$ psql -c "SELECT pg_stop_backup()"

6) Take the USB stick from the data center to my office and load the data.

7) Start pg on the offsite standby and taste the sinfully chocolatey replication goodness!



Re: Migrating to PG 9.2 (LONG, SORRY)

From:
Stephen Frost
Date:
* Scott Whitney (scott@journyx.com) wrote:
> 1)Add to pg_hba.conf on master:
> host  replication  postgres  my IP/32  trust

Never use 'trust'.  At least use 'md5'; all that requires is putting a
password into your replication config.
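
Concretely, that amounts to something along these lines (the
"replicator" role and password are just placeholders; a dedicated
replication role is nicer than using postgres):

-- on the master, once (9.1+ syntax):
CREATE ROLE replicator REPLICATION LOGIN PASSWORD 'whatever';

# pg_hba.conf on the master
host  replication  replicator  standby IP/32  md5

# recovery.conf on the standby
primary_conninfo = 'host=master IP port=5432 user=replicator password=whatever'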

> 2) Configure in postgresql.conf on master:
> wal_level = hot_standby
> max_wal_senders = 5
> wal_keep_segments = SOME NUMBER I HAVE TO DETERMINE BASED ON HOW LONG THE BACKUP ETC TAKES.
>
> # Maybe do this if the above is not high enough
> #archive_mode    = on
> #archive_command = 'cp %p /path_to/archive/%f'

You should definitely have a mechanism for the WALs to be transferred to
the replica(s) on a regular basis.  Don't just rely on the replication
connection.  With this, you don't have to worry as much about the
wal_keep_segments option above...
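
If you do end up sizing wal_keep_segments anyway, one rough way to
measure WAL generation on 9.2 (the sampled location below is
illustrative) is:

-- on the master: note the current WAL position...
SELECT pg_current_xlog_location();   -- say it returns 0/3000000
-- ...wait roughly as long as the base backup/copy takes, then:
SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/3000000')
       / (16 * 1024 * 1024) AS segments_generated;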

> 3) Add to postgresql.conf on standby:
> hot_standby = on
>
> 4) Create a recovery.conf on the standby (in $PGDATA):
> standby_mode          = 'on'
> primary_conninfo      = 'host=master IP port=5432 user=postgres'

In primary_conninfo is where you'd put the password for md5 auth.

> 5) Do my backup from the STANDBY AT THE DATA CENTER:

I think the typical approach is to use the same copy of the master that
you did for the initial slave when setting up the cascaded slave.
Regardless, and with all of this: test, test, test.

    Thanks,

        Stephen
