Обсуждение: Missing wal_files Postgres 9.2

Поиск
Список
Период
Сортировка

Missing wal_files Postgres 9.2

От
Patrick B
Дата:
Hi guys,



I'm currently using PostgreSQL 9.2.

One of my backup servers went down, and I had to re-sync the all the DB.
I used pg_basebackup and, of course, at the same time wal_archive.

Steps I did:

1 - Confirm the wal_files are being copied into the new server.
2 - Delete /var/lib/pgsql/9.2/data/*
3 - Double check the wal_files are ok into the server
4 - Start pg_basebackup
5 - pg_basebackup took 10 days, as our DB is 2.0TB.
6 - Start postgres with the recovery.conf.restore_comand
7 - Wait for the wal_files being processed

I got an error here: 00000002000013B40000001A` not found

How is that possible? If I double checked!! :(
I can't find that file anywhere.. and now I'll have to re-do all the work =(


Questions: 

1 - Is there any way to restore the DB without that file?

2 - If I'll have to re-do all the steps above, is there any way to check if the wal_files exists once the pg_basebackup has started?
Once the pg_basebackup command is running, I could see if the next wal_file would be available... it would be easier to cancel the command or to restart it if I need to.

Can I check the next wal_file by catting the backup_label files?

cat backup_label: 

START WAL LOCATION: 13B4/EBEECA8 (file 00000002000013B40000000E)
CHECKPOINT LOCATION: 13B4/1AACC868
BACKUP METHOD: streamed
BACKUP FROM: standby
START TIME: 2016-06-24 04:04:48 UTC


Thanks

Re: Missing wal_files Postgres 9.2

От
Keith
Дата:


On Sun, Jul 3, 2016 at 7:30 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,



I'm currently using PostgreSQL 9.2.

One of my backup servers went down, and I had to re-sync the all the DB.
I used pg_basebackup and, of course, at the same time wal_archive.

Steps I did:

1 - Confirm the wal_files are being copied into the new server.
2 - Delete /var/lib/pgsql/9.2/data/*
3 - Double check the wal_files are ok into the server
4 - Start pg_basebackup
5 - pg_basebackup took 10 days, as our DB is 2.0TB.
6 - Start postgres with the recovery.conf.restore_comand
7 - Wait for the wal_files being processed

I got an error here: 00000002000013B40000001A` not found

How is that possible? If I double checked!! :(
I can't find that file anywhere.. and now I'll have to re-do all the work =(


Questions: 

1 - Is there any way to restore the DB without that file?

2 - If I'll have to re-do all the steps above, is there any way to check if the wal_files exists once the pg_basebackup has started?
Once the pg_basebackup command is running, I could see if the next wal_file would be available... it would be easier to cancel the command or to restart it if I need to.

Can I check the next wal_file by catting the backup_label files?

cat backup_label: 

START WAL LOCATION: 13B4/EBEECA8 (file 00000002000013B40000000E)
CHECKPOINT LOCATION: 13B4/1AACC868
BACKUP METHOD: streamed
BACKUP FROM: standby
START TIME: 2016-06-24 04:04:48 UTC


Thanks


Look into using the --xlog-method=stream (-Xs) option to pg_basebackup. That opens up a second replication stream and keeps all the WAL files generated during the basebackup so that you can bring up your basebackup to a consistent state at the point when the backup finishes. You'll have to be sure --max_wal_senders is set high enough to allow 2 replication connections during the backup.

You then also have to bring that slave up before the master removes all the WAL files from the pg_xlogs folder that that slave may need for replay to sync itself with the master. If you're not going to be bringing that slave up rather quickly after the base-backup, then you're going to have to archive your WAL files off somewhere else that the slave will have access to until its ready.

If you can get upgraded to 9.4, you can use replication slots and not have to worry about the master removing WALs from pg_xlog when the associated slave is down.

Keith

Re: Missing wal_files Postgres 9.2

От
Patrick B
Дата:

Look into using the --xlog-method=stream (-Xs) option to pg_basebackup. That opens up a second replication stream and keeps all the WAL files generated during the basebackup so that you can bring up your basebackup to a consistent state at the point when the backup finishes. You'll have to be sure --max_wal_senders is set high enough to allow 2 replication connections during the backup.

hmm.. that's a better idea! I was using with the fletch mode.
 

You then also have to bring that slave up before the master removes all the WAL files from the pg_xlogs folder that that slave may need for replay to sync itself with the master. If you're not going to be bringing that slave up rather quickly after the base-backup, then you're going to have to archive your WAL files off somewhere else that the slave will have access to until its ready.

I store the wal_files into each server (3 slaves).
So each slave has the wal_files for 24h and them delete them.

 

If you can get upgraded to 9.4, you can use replication slots and not have to worry about the master removing WALs from pg_xlog when the associated slave is down.


I can't now.. it's a plan but only for the future (migration to AWS).
 
Keith


Thanks for you reply Keith. 

Re: Missing wal_files Postgres 9.2

От
Patrick B
Дата:
slave_new: server that needs a new copy of the DB
slave01: streaming replication slave

My steps are:

1. ssh slave_new
2. Stop postgres
3. rm -rf /var/lib/pgsql/9.2/data/*
4. ssh postgres@slave01 'pg_basebackup --pgdata=- --format=tar --label=slave_new --progress --host=localhost --port=5432 --username=replicator --xlog-method=stream' | tar -x --no-same-owner
5. Once the STEP 4 is done, copy the original postgresql.conf pg_hba.conf recovery.conf into /var/lib/pgsql/9.2/data
6. Set recovery,conf with restored command.
7. Start Postgres
I should be able to see the DB recovering itself from the wal_files via LOGS

Is that right?