Discussion: could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory

could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory

From: Dmitry Litvintsev
Date: Thu, 2025-10-23 21:54 +0000

Hello,

I have been setting up PITR backup off of a base backup. WAL files are shipped to the host running pg_basebackup,
which pulls the backup from the master host.

Once the base backup finished, I wanted to try recovery.

I added

restore_command = 'gunzip /data/xlogs/%f.gz > %p'

(That command is wrapped in a script that checks whether the input file exists and exits with status 0 if it does not,
as per the instructions.)

I moved pg_wal out of the way and re-created it:

mv pg_wal pg_wal.bak
mkdir pg_wal

I created the recovery signal file:

touch recovery.signal

I see in the log:


< 2025-10-23 16:05:30.750 CDT  2876871 > LOG:  database system was interrupted; last known up at 2025-10-23 02:00:41 CDT
< 2025-10-23 16:05:31.582 CDT  2876871 > LOG:  could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory
< 2025-10-23 16:05:31.582 CDT  2876871 > DETAIL:  restore_command returned a zero exit status, but stat() failed.
< 2025-10-23 16:05:31.582 CDT  2876871 > LOG:  starting archive recovery
< 2025-10-23 16:05:31.583 CDT  2876871 > LOG:  starting backup recovery with redo LSN 20975/FE0AC690, checkpoint LSN 20975/FE21CA88, on timeline ID 4
< 2025-10-23 16:05:31.586 CDT  2876871 > LOG:  could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory
< 2025-10-23 16:05:31.586 CDT  2876871 > DETAIL:  restore_command returned a zero exit status, but stat() failed.
< 2025-10-23 16:05:31.798 CDT  2876871 > LOG:  restored log file "0000000400020975000000FE" from archive
< 2025-10-23 16:05:31.819 CDT  2876871 > LOG:  redo starts at 20975/FE0AC690
< 2025-10-23 16:05:33.927 CDT  2876871 > LOG:  restored log file "0000000400020975000000FF" from archive

It seems to be churning on, but I am a bit worried about:

could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory

I do not recall having to deal with this in the past. Should I have manually created

pg_wal/archive_status

beforehand? My old scripts seem to indicate the need for this. I am trying to resurrect
PITR after multiple upgrades and after not running it for some time (relying on a replica instead).

So, is it harmless? PostgreSQL does not fail; the recovery is still ongoing.

This is PostgreSQL 15.

An additional question: as the WAL files are continuously getting shipped to the backup host,
will this recovery ever end? (I neglected to set recovery_target_time.)

Thank you very much in advance





Re: could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory

From: Laurenz Albe

On Thu, 2025-10-23 at 21:54 +0000, Dmitry Litvintsev wrote:
> I have been setting PITR backup off of base backup. Wal files are shipped to
> the host running pg_basebackup pulling backup from master host.
>
> Once base backup finished I wanted to try recovery.
>
> I added
>
> restore_command = 'gunzip /data/xlogs/%f.gz > %p'
>
> (that command is wrapped in a script that checks existence of input and exits w/ 0 if it does not as per instructions)
>
> I see in the log:
>
>
> < 2025-10-23 16:05:30.750 CDT  2876871 > LOG:  database system was interrupted; last known up at 2025-10-23 02:00:41 CDT
> < 2025-10-23 16:05:31.582 CDT  2876871 > LOG:  could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory
> < 2025-10-23 16:05:31.582 CDT  2876871 > DETAIL:  restore_command returned a zero exit status, but stat() failed.
> < 2025-10-23 16:05:31.582 CDT  2876871 > LOG:  starting archive recovery
> < 2025-10-23 16:05:31.583 CDT  2876871 > LOG:  starting backup recovery with redo LSN 20975/FE0AC690, checkpoint LSN 20975/FE21CA88, on timeline ID 4
> < 2025-10-23 16:05:31.586 CDT  2876871 > LOG:  could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory
> < 2025-10-23 16:05:31.586 CDT  2876871 > DETAIL:  restore_command returned a zero exit status, but stat() failed.
> < 2025-10-23 16:05:31.798 CDT  2876871 > LOG:  restored log file "0000000400020975000000FE" from archive
> < 2025-10-23 16:05:31.819 CDT  2876871 > LOG:  redo starts at 20975/FE0AC690
> < 2025-10-23 16:05:33.927 CDT  2876871 > LOG:  restored log file "0000000400020975000000FF" from archive
>
> it seems to be churning on, but I am a bit worried about :
>
> could not stat file "pg_wal/RECOVERYHISTORY": No such file or directory

I read the code, and that means the following:

- PostgreSQL tries to restore a *.history file to get the timeline history
- it calls "restore_command" to get the next history file and names it RECOVERYHISTORY locally
- "restore_command" exits with a return code 0, but the file is not in "pg_wal"

So the archive restore does fail, just in an unusual way.
Normally, you would expect "restore_command" to have created the file if it returns 0.
So you should improve your "restore_command" if you don't want to see that message.

Yours,
Laurenz Albe
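
The advice in the reply above can be illustrated with a minimal sketch of a wrapper script. It is not the original poster's script, which was not shown in the thread. The function name restore_wal, the script path in the comment, and the ARCHIVE_DIR variable are hypothetical; only the /data/xlogs directory and the .gz suffix come from the original post. The sketch assumes the behavior described in the PostgreSQL documentation for restore_command: the command must return a nonzero exit status when asked for a file that is not in the archive, and this is not treated as an error condition.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/restore_wal.sh and used as:
#   restore_command = '/usr/local/bin/restore_wal.sh %f %p'
# ARCHIVE_DIR defaults to the archive path mentioned in the original post.
ARCHIVE_DIR="${ARCHIVE_DIR:-/data/xlogs}"

restore_wal() {
    wal_name="$1"      # %f: name of the file PostgreSQL asks for
    target_path="$2"   # %p: path where PostgreSQL expects the restored file
    source_file="$ARCHIVE_DIR/$wal_name.gz"

    # A missing file is reported as a failure (nonzero status), so PostgreSQL
    # never sees "exit status 0" without a restored file.
    if [ ! -f "$source_file" ]; then
        echo "restore_wal: $source_file not found in archive" >&2
        return 1
    fi

    # gzip -dc writes the decompressed data to stdout and leaves the archive
    # copy untouched; on failure, remove the partial output and report it.
    if ! gzip -dc "$source_file" > "$target_path"; then
        rm -f "$target_path"
        return 1
    fi
    return 0
}

# Act as a restore_command only when called with the two expected arguments.
if [ "$#" -eq 2 ]; then
    restore_wal "$1" "$2"
    exit $?
fi
```

With a wrapper along these lines, a request for a timeline history file that was never archived ends with a nonzero status, so the "restore_command returned a zero exit status, but stat() failed" detail should no longer appear for such requests.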