At Sat, 23 Jul 2022 12:36:47 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> FWIW, the backend code has protections to prevent *exactly* this kind
> of problems when recycling WAL segment files at checkpoints with a set
> of LWLocks taken on the control file, for one. Perhaps you have
> messed up things and you have finished in such a state that backrest
> writes to pg_wal/ concurrently with a cluster running and running a
> checkpoint, which would explain those link() calls to be failing?
Those locks don't seem to exclude recovery.
I can reproduce the failure with the script below, after adding a sleep
before (or after) the durable_link_or_rename call in
InstallXLogFileSegment (patch attached). Some adjustment might be
required to reproduce it in other environments.
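To illustrate the kind of interleaving I mean, here is a minimal stand-alone sketch. It is not PostgreSQL code; the file names are hypothetical stand-ins, and it assumes the recycling side checks that the target name is free before it links. The "wait" plays the role of the added sleep.

```shell
#!/bin/bash
# Stand-alone illustration of a check-then-link race; not PostgreSQL code.
# All file names below are hypothetical stand-ins.
dir=$(mktemp -d)
recycled="$dir/old_segment"     # stands in for the segment being recycled
target="$dir/future_segment"    # stands in for the name it is recycled to
touch "$recycled"
result=skipped

# "Recycler": sees that the target name is free, pauses, then links.
if [ ! -e "$target" ]; then
    # "Restorer": creates the same name while the recycler is paused.
    ( echo restored > "$target" ) &
    wait                        # stands in for the injected sleep
    if ln "$recycled" "$target" 2> "$dir/err"; then
        result=linked
    else
        result=failed           # ln refuses to overwrite an existing name
    fi
    cat "$dir/err"
fi
echo "result: $result"
rm -rf "$dir"
```

The link step fails with "File exists" because the name was taken between the check and the link, which is the same errno as in the log below.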
=====
2022-07-25 17:05:57.730 JST [151758] LOG: restored log file "000000010000000000000057" from archive
2022-07-25 17:05:57.760 JST [151758] LOG: restored log file "000000010000000000000058" from archive
2022-07-25 17:05:57.782 JST [151758] LOG: restored log file "000000010000000000000059" from archive
2022-07-25 17:05:57.790 JST [151762] LOG: could not link file "pg_wal/000000010000000000000002" to "pg_wal/000000010000000000000059": File exists
2022-07-25 17:05:57.802 JST [151758] LOG: restored log file "00000001000000000000005A" from archive
2022-07-25 17:05:58.294 JST [151762] LOG: could not link file "pg_wal/000000010000000000000003" to "pg_wal/00000001000000000000005A": File exists
========
#! /bin/bash
# create a backup-source
PGDATA=~/test/data
PGARC=~/test/arc
BKDIR=~/test/bk
CPDATA=~/test/dt
# stop any leftover server before removing its data directory
killall -9 postgres
rm -f /tmp/hoge
rm -rf $PGDATA $PGARC $BKDIR $CPDATA
mkdir -p $PGARC
initdb -D $PGDATA
echo "archive_mode=on" >> $PGDATA/postgresql.conf
echo "archive_command = 'cp %p $PGARC/%f'" >> $PGDATA/postgresql.conf
#start the source
pg_ctl -D $PGDATA start
# take a backup
pg_basebackup -D $BKDIR
echo "archive_mode=off" >> $BKDIR/postgresql.conf
echo "restore_command='cp $PGARC/%f %p'" >> $BKDIR/postgresql.conf
touch $BKDIR/recovery.signal
# create archived segments
psql -c 'create table t (a int)'
for i in $(seq 1 100); do psql -c 'insert into t values(0); select pg_switch_wal()'; done
#stop the source
pg_ctl -D $PGDATA stop
# start recovery
rm -rf $CPDATA
cp -r $BKDIR $CPDATA
touch /tmp/hoge
postgres -D $CPDATA 2>&1 | tee recovery.log
======
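After the run, the failures can be counted in recovery.log with something like the following. This is only a convenience sketch; count_link_failures is a hypothetical helper name, and it assumes each message sits on a single line in the log file.

```shell
#!/bin/bash
# Hypothetical helper: count "could not link file" messages in a server log.
count_link_failures() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'could not link file' "$1"
}

# Only look at recovery.log if the reproduction script has produced it.
if [ -f recovery.log ]; then
    count_link_failures recovery.log
fi
```

A non-zero count means the race was hit; zero probably means the sleep needs adjusting for the environment.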
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center