pg_basebackup and pg_receivewal timing, missing WAL files

Поиск
Список
Период
Сортировка
От Tim
Тема pg_basebackup and pg_receivewal timing, missing WAL files
Дата
Msg-id CAKhLO5itE_xp7KX7Vxnho8Uzczgwt8iUD9bVtKNPamBOst=hPA@mail.gmail.com
обсуждение исходный текст
Ответы Re: pg_basebackup and pg_receivewal timing, missing WAL files  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-admin
Hello all,

I'm attempting to create a long term, stand alone backup archive script for a pretty active and large database (3TB) which will store backups going back a few months. 
Using the combination of pg_basebackup in TAR format, with the `Xn` flag, and running pg_receivewal in the background, I'm ending up with a backup that is missing quite a few WAL files.

Trying to figure out where my understanding of the process is incorrect, and what I need to do to correct this. 

Backup process:
    * Start pg_receivewal to archive location
    * Then run: `pg_basebackup -h {host} -p {port} -D - -Ft -Xn --checkpoint=fast | ...`
    * This backup is taken from a standby node 
    * Once pg_basebackup finishes, send SIGINT to pg_receivewal

Recovery Process:
    * Untar basebackup to data directory 
    * Copy archived WAL files to a separate directory on the same drive
    * settings: restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'
                     recovery_target = immediate

Once I start the DB, I immediately get 
cp: cannot stat '/var/lib/pgsql/wal_archive/0000001200003CF80000003F': No such file or directory
    
WAL FILES: 
0000001200003CF80000003F << First WAL file the DB looks for 
0000001200003CF700000075 << The next WAL file 
----------------------------------------  << Missing WAL files from pg_receivewal archive
...
0000001200003CF8000000C8 << First WAL file in pg_receivewal archive 

backup_label file:
START WAL LOCATION: 3CF7/75974400 (file 0000001200003CF700000075)
CHECKPOINT LOCATION: 3CF8/3F1139C0
BACKUP METHOD: streamed
BACKUP FROM: standby
START TIME: 2022-06-08 12:51:52 EDT
LABEL: pg_basebackup base backup
START TIMELINE: 18

I can see that START WAL LOCATION & CHECKPOINT LOCATION are where the basebackup starts to recover from, but how come these are not in my pg_receivewal archive, if I'm starting it before starting the pg_basebackup process? How can I ensure they are part of this archive? 

Since this is a recent backup, I have a separate WAL archive from where I can just recover those using a different restore_command, and once that missing gap of WAL files is recovered from this archive, stopping the server 
and switching back to `restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'` and recovering the rest of the WAL files from the pg_receivewal archive works fine. The DB recovers and accepts connections shortly after. 

The whole point of this is to have a several months old, stand alone backup, I need to have all the WAL files available in that original WAL archive created with pg_receivewal.  


Appreciate the help, thank you in advance!

В списке pgsql-admin по дате отправления:

Предыдущее
От: Arun Maddukuri
Дата:
Сообщение: pg_upgrade and missing loadable libraries
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: pg_upgrade and missing loadable libraries