I've posted the full command which replicates this issue on the GitHub issue:
c.run(f'rm -rf {p.remote_data}/pg12')
c.run(f'mkdir -p {p.remote_data}/pg12')

c.run(
    f'docker run --rm '
    f'--name {p.instance_name} '
    f'-v {p.remote_data}:/data '
    f'-v {p.remote_shared}:/shared '
    f'{p.repo_name}:{version} '
    'pgbackrest --stanza=app --log-level-console=info --log-level-file=off restore',
)

c.run(
    f'docker run --rm '
    f'--name {p.instance_name} '
    f'-v {p.remote_data}:/data '
    f'-v {p.remote_shared}:/shared '
    '--user=postgres '
    f'{p.repo_name}:{version} '
    '/usr/lib/postgresql/12/bin/postgres -D /data/pg12',
)
c.run() is just SSH automation (via Fabric), so these commands are the same as I'd run in the terminal.
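For anyone who wants to reproduce this without Fabric, here is a rough sketch that builds the same docker run string from plain values. The helper name docker_run_command and its parameters are made up for illustration; they stand in for the p.* and version values used above.

```python
import shlex


def docker_run_command(instance_name, remote_data, remote_shared, image, inner, user=None):
    """Build a `docker run` command line shaped like the ones passed to c.run().

    Hypothetical helper: the argument names are placeholders for
    p.instance_name, p.remote_data, p.remote_shared and
    f'{p.repo_name}:{version}' from the original snippet.
    """
    parts = [
        'docker', 'run', '--rm',
        '--name', instance_name,
        '-v', f'{remote_data}:/data',
        '-v', f'{remote_shared}:/shared',
    ]
    if user is not None:
        # Only the postgres start-up call passes --user in the original.
        parts.append(f'--user={user}')
    parts.append(image)
    # Split the in-container command into separate arguments.
    parts.extend(shlex.split(inner))
    # shlex.join quotes each argument where needed (Python 3.8+).
    return shlex.join(parts)
```

The resulting string can be pasted into a terminal on the server or handed to c.run() unchanged.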
> Zsolt, can you add --log-level-console=info to your archive_command so
> we can see what parameters are being passed to pgbackrest?
I changed it to

    archive_command = 'pgbackrest --stanza=app --log-level-console=info archive-push %p'

on the destination server (where the restore happens). Nothing changes in the log; only 3 lines appear once everything is done. I've attached the log. Do I need to change this command on the source server?
Zsolt
On 7/22/22 23:36, Michael Paquier wrote:
On Fri, Jul 22, 2022 at 04:24:54AM -0500, Zsolt Ero wrote:
> I've opened an issue on pgbackrest, where the developer confirmed that
> those files are not touched by pgbackrest, they are basically copied back
> exactly as they were. The backup/restore is on the same partition as the
> data folder, so no chance of file system changes, etc. Single master, no
> replication.
>
> I'm always doing clean data dir + init_db so there is nothing there before
> the restore. I can replicate this behaviour 100%.
>
> PostgreSQL version: psql (PostgreSQL) 12.11 (Ubuntu 12.11-1.pgdg18.04+1)
>
> OS is Ubuntu 18.04
>
> pgbackrest issue: https://github.com/pgbackrest/pgbackrest/issues/1815
>
> log details:
>
> could not link file "pg_wal/000000010000015600000098" to
> "pg_wal/00000001000001570000006E": File exists
FWIW, the backend code has protections to prevent *exactly* this kind
of problems when recycling WAL segment files at checkpoints with a set
of LWLocks taken on the control file, for one. Perhaps you have
messed up things and you have finished in such a state that backrest
writes to pg_wal/ concurrently with a cluster running and running a
checkpoint, which would explain those link() calls to be failing?
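As an illustration of the failure mode only (this is not PostgreSQL's code): a hard link call refuses to overwrite an existing destination and fails with EEXIST, which is the "File exists" text in the log line quoted above. In this sketch the two file names are copied from that log line, a temporary directory stands in for pg_wal, and the function name is made up.

```python
import errno
import os
import tempfile


def link_errno_when_target_exists():
    """Return the errno from os.link() when the destination already exists."""
    with tempfile.TemporaryDirectory() as wal_dir:
        src = os.path.join(wal_dir, '000000010000015600000098')
        dst = os.path.join(wal_dir, '00000001000001570000006E')
        # Create both files, so the link target is already present.
        for path in (src, dst):
            with open(path, 'wb') as f:
                f.write(b'\0')
        try:
            os.link(src, dst)
        except FileExistsError as exc:
            # link() never replaces an existing destination.
            return exc.errno
        return None


if __name__ == '__main__':
    print(link_errno_when_target_exists() == errno.EEXIST)
```

So the message by itself only says that the destination segment name was already present when the link was attempted; it does not say who created it.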
During recovery pgbackrest only writes into pg_wal when invoked by
restore_command. That generally means pgbackrest should only be writing
to RECOVERYXLOG.
Zsolt, can you add --log-level-console=info to your archive_command so
we can see what parameters are being passed to pgbackrest? This logging
will appear in the postgres log.
Regards,
-David