Re: pg_basebackup and pg_receivewal timing, missing WAL files

Поиск
Список
Период
Сортировка
От Tim
Тема Re: pg_basebackup and pg_receivewal timing, missing WAL files
Дата
Msg-id CAKhLO5gY4rUS65RUD_OGyHQTxoOa1-akLBWjh-CDQiVDJknecw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_basebackup and pg_receivewal timing, missing WAL files  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: pg_basebackup and pg_receivewal timing, missing WAL files  (Stephen Frost <sfrost@snowman.net>)
Список pgsql-admin
We currently use pgbackrest as our primary backup tool actually, and it works great! The issue I ran into with trying to set it up that way with pgbackrest, is when configuring the second repo, pgbackrest will have to archive all WAL files to it, going back to the oldest backup. And we have this (admittedly arbitrary) legal policy requiring a single instance of a 6 month old backup for auditing purposes. So pgbackrest has to keep WALs that far back, the main repo only needs 4 weeks of WALs for PITR, its a bit overkill archiving WALs twice and also storing them an extra 5 months. So I opted to write a custom backup script.

Though I was not aware of the `--archive-copy` option, I wonder if I can just turn that option on for the main repo, and then extract the particular backup I need to another location. We use Azure Blobs for backups.

 Unfortunately `-Xs` cannot be used with pg_basebackup in TAR format, I'm using the AzCopy utility, by piping pg_basebackup into it, to stream the basebackup directly into Azure Blob storage, without storing it anywhere first (Which pgbackrest does natively as well!). So there is no other way without using the TAR format option.

Nevertheless, it seems like I need the contents of the pg_wal directory at the start of pg_basebackup, since pg_receivewal only streams WAL generated after it starts, the restored DB is missing all the files contained within, I've adjusted my script to copy them as well and testing currently. 

On Thu, Jun 9, 2022 at 2:36 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Tim (timfosho@gmail.com) wrote:
> I'm attempting to create a long term, stand alone backup archive script for
> a pretty active and large database (3TB) which will store backups going
> back a few months.
> Using the combination of pg_basebackup in TAR format, with the `Xn` flag,
> and running pg_receivewal in the background, I'm ending up with a backup
> that is missing quite a few WAL files.
>
> Trying to figure out where my understanding of the process is incorrect,
> and what I need to do to correct this.
>
> *Backup *process:
>
>     * Start pg_receivewal to archive location
>     * Then run: `pg_basebackup -h {host} -p {port} -D - -Ft -Xn
> --checkpoint=fast | ...`

What's wrong with using -Xs instead..?  Basically does exactly what you
seem to be trying to do here.

>     * This backup is taken from a *standby node*

Where is the pg_receivewal run?  Against the primary or the replica?

>     * Once pg_basebackup finishes, send SIGINT to pg_receivewal
>
> *Recovery* Process:
>
>     * Untar basebackup to data directory
>     * Copy archived WAL files to a separate directory on the same drive
>     * settings: restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'
>                      recovery_target = immediate
>
>
> Once I start the DB, I immediately get
>
> cp: cannot stat '/var/lib/pgsql/wal_archive/0000001200003CF80000003F': No
> such file or directory
>
>
>
> *WAL FILES: *
>
> 0000001200003CF80000003F << First WAL file the DB looks for
> 0000001200003CF700000075 << The next WAL file
> ----------------------------------------  << Missing WAL files from
> pg_receivewal archive
> ...
> 0000001200003CF8000000C8 << First WAL file in pg_receivewal archive
>
> *backup_label *file:
>
> START WAL LOCATION: 3CF7/75974400 (file 0000001200003CF700000075)
> CHECKPOINT LOCATION: 3CF8/3F1139C0
> BACKUP METHOD: streamed
> BACKUP FROM: standby
> START TIME: 2022-06-08 12:51:52 EDT
> LABEL: pg_basebackup base backup
> START TIMELINE: 18
>
>
> I can see that START WAL LOCATION & CHECKPOINT LOCATION are where the
> basebackup starts to recover from, but how come these are not in my
> pg_receivewal archive, if I'm starting it before starting the pg_basebackup
> process? How can I ensure they are part of this archive?

Well, using -Xs should do that.

> Since this is a recent backup, I have a separate WAL archive from where I
> can just recover those using a different restore_command, and once that
> missing gap of WAL files is recovered from this archive, stopping the
> server
> and switching back to `restore_command = 'cp /var/lib/pgsql/wal_archive/%f
> %p'` and recovering the rest of the WAL files from the pg_receivewal
> archive works fine. The DB recovers and accepts connections shortly after.
>
> The whole point of this is to have a several months old, stand alone
> backup, I need to have all the WAL files available in that original WAL
> archive created with pg_receivewal.

The use-case generally makes sense, in pgbackrest we have --archive-copy
for more-or-less the same kind of thing, though you can now just use a
separate newly created repo that you back up the standalone backup to
(when multiple repos are configured, archive-push will archive to all of
them) and then archive that copy.  The latter is what we'd generally
recommend these days, but with archive-copy and a full backup, you can
just grab the full backup directory from inside the repo and it'll have
everything.

Thanks,

Stephen

В списке pgsql-admin по дате отправления:

Предыдущее
От: Pascal CROZET
Дата:
Сообщение: ERROR: type "my_user_type" does not exist on REFRESH MATERIALIZED VIEW
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: pg_basebackup and pg_receivewal timing, missing WAL files