Discussion: BUG #19432: recovery fails at invalid checkpoint record
The following bug has been logged on the website:

Bug reference:      19432
Logged by:          Felix Hamme
Email address:      felix.hamme@ionos.com
PostgreSQL version: 17.9
Operating system:   Debian 13
Description:

Hi, I'm trying to restore from a pg_basebackup at timeline 1 to a
restore_target_time in timeline 2.
It fails at "invalid checkpoint record", "could not locate required
checkpoint record at 0/3000080".
All relevant wal files are in the archive, the restore_command works and
backup_label, pg_controldata, pg_waldump and 00000002.history look like
everything should work.
Recovering only timeline 1 works, but it fails as soon as it should proceed
in timeline 2.
A 6.7MB tar of the basebackup and the wal archive is available at
https://get.hidrive.com/i/PwMejRQG . This link expires on 2026-03-13, I can
provide a new link if needed.
Why does this recovery fail?
On Thu, 2026-03-12 at 16:20 +0000, PG Bug reporting form wrote:
> Hi, I'm trying to restore from a pg_basebackup at timeline 1 to a
> restore_target_time in timeline 2.
> It fails at "invalid checkpoint record", "could not locate required
> checkpoint record at 0/3000080".
> All relevant wal files are in the archive, the restore_command works and
> backup_label, pg_controldata, pg_waldump and 00000002.history look like
> everything should work.
> Recovering only timeline 1 works, but it fails as soon as it should proceed
> in timeline 2.
> A 6.7MB tar of the basebackup and the wal archive is available at
> https://get.hidrive.com/i/PwMejRQG . This link expires on 2026-03-13, I can
> provide a new link if needed.
> Why does this recovery fail?

Funny. I unpacked your data directory and reduced your postgresql.auto.conf
to something that fits my system:

log_min_messages = 'DEBUG5'
restore_command = 'cp /home/laurenz/hamme/fakearchive/%f %p'
recovery_target_time = '2026-03-11 14:51:28 UTC'
recovery_target_action = 'promote'
hot_standby_feedback = 'on'
log_destination = 'csvlog'
log_directory = '/home/laurenz/hamme/log'
logging_collector = 'on'
wal_level = 'logical'
port = 5433
unix_socket_directories = '/home/laurenz/hamme'
max_connections = 300

Recovery worked like a charm. pg_waldump shows the checkpoint record in
000000010000000000000003 at the correct position.

Not sure what you did wrong.

Yours,
Laurenz Albe
Thank you for checking, now I found what I did wrong: "mv" doesn't work as a
restore_command because the same .history file is restored multiple times
during recovery.

I successfully recovered using a restore_command which does a cp for history
files and mv for wal files. It logged this:

cp /DBDATA/test/fakearchive/00000002.history pg_wal/RECOVERYHISTORY success
cp /DBDATA/test/fakearchive/00000003.history pg_wal/RECOVERYHISTORY not found
cp /DBDATA/test/fakearchive/00000002.history pg_wal/RECOVERYHISTORY success
mv /DBDATA/test/fakearchive/000000010000000000000003 pg_wal/RECOVERYXLOG success
mv /DBDATA/test/fakearchive/000000010000000000000004 pg_wal/RECOVERYXLOG success
mv /DBDATA/test/fakearchive/000000020000000000000005 pg_wal/RECOVERYXLOG success
mv /DBDATA/test/fakearchive/000000020000000000000006 pg_wal/RECOVERYXLOG success
mv /DBDATA/test/fakearchive/000000020000000000000007 pg_wal/RECOVERYXLOG success
cp /DBDATA/test/fakearchive/00000003.history pg_wal/RECOVERYHISTORY not found
cp /DBDATA/test/fakearchive/00000002.history pg_wal/RECOVERYHISTORY success

Is it safe in general to use mv for wal files? In other words, do the
currently supported postgres versions run restore_command only once
per wal file?

Best regards
Felix Hamme

On Thu, Mar 12, 2026 at 8:29 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> Recovery worked like a charm. pg_waldump shows the checkpoint record in
> 000000010000000000000003 at the correct position.
>
> Not sure what you did wrong.
>
> Yours,
> Laurenz Albe
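
The failure mode described in this message can be reproduced outside PostgreSQL. The following is a minimal sketch that only simulates two consecutive requests for the same timeline history file with mv. The temporary directory layout, the placeholder file content and the variable names are assumptions made for the demonstration, not taken from the reporter's setup.

```shell
#!/bin/sh
# Simulation only: demo_dir and its layout are made up for this example.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/archive" "$demo_dir/pg_wal"
echo "placeholder history content" > "$demo_dir/archive/00000002.history"

# First request: mv succeeds, but the file is now gone from the archive.
first_status=0
mv "$demo_dir/archive/00000002.history" "$demo_dir/pg_wal/RECOVERYHISTORY" \
    || first_status=$?

# Second request for the same file: mv fails, the source no longer exists.
second_status=0
mv "$demo_dir/archive/00000002.history" "$demo_dir/pg_wal/RECOVERYHISTORY" \
    2>/dev/null || second_status=$?

echo "first request: $first_status, second request: $second_status"
```

The first status is zero and the second is nonzero, which is how a file that was present in the archive ends up looking "not found" to the server on a later request.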
On Fri, 2026-03-13 at 09:35 +0100, Felix Hamme wrote:
> Is it safe in general to use mv for wal files? In other words, do the
> currently supported postgres versions run restore_command only once
> per wal file?

As you found out, no...

Yours,
Laurenz Albe
Timeline history files can be needed multiple times, ok. My question was
about WAL files only.

I'm tempted to use a restore_command which does cp for history files
and mv for WAL files, to optimize performance and disk usage.
An AI told me that a second restore attempt for the same WAL file
could only happen if recovery is resumed after a crash.

Kind regards
Felix Hamme

On Fri, Mar 13, 2026 at 5:37 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Fri, 2026-03-13 at 09:35 +0100, Felix Hamme wrote:
> > Is it safe in general to use mv for wal files? In other words, do the
> > currently supported postgres versions run restore_command only once
> > per wal file?
>
> As you found out, no...
>
> Yours,
> Laurenz Albe
On Mon, 2026-03-16 at 14:56 +0100, Felix Hamme wrote:
> I'm tempted to use a restore_command which does cp for history files
> and mv for WAL files, to optimize performance and disk usage.
> An AI told me that a second restore attempt for the same WAL file
> could only happen if recovery is resumed after a crash.

Don't do that. Make the restore_command idempotent.
Trying to optimize for storage space often causes problems elsewhere.

Yours,
Laurenz Albe
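
One way to read the advice above is a restore helper that only ever copies from the archive, so that a repeated request for the same file behaves exactly like the first one. The following is a minimal sketch under that assumption, not a vetted production script. The function name restore_file and the ARCHIVE_DIR variable are hypothetical, and the default path is simply the archive directory from the log earlier in the thread. It returns a nonzero status for files that are missing from the archive, which restore_command is required to do when asked for such a file.

```shell
#!/bin/sh
# Sketch of an idempotent restore helper. ARCHIVE_DIR and restore_file are
# hypothetical names chosen for this example.
ARCHIVE_DIR="${ARCHIVE_DIR:-/DBDATA/test/fakearchive}"

# restore_file FILENAME DESTINATION
# Copies one archived file (WAL segment or .history file) to DESTINATION and
# leaves the archive untouched, so the same request can be repeated safely.
restore_file() {
    src="$ARCHIVE_DIR/$1"
    dest="$2"
    # A file missing from the archive is a normal case during recovery:
    # report it with a nonzero status instead of treating it as fatal.
    if [ ! -f "$src" ]; then
        return 1
    fi
    # Copy, never move: the archive stays complete for later requests.
    cp "$src" "$dest"
}

# When run as a script with two arguments (%f and %p), perform the restore.
if [ "$#" -eq 2 ]; then
    restore_file "$1" "$2" || exit 1
fi
```

Saved under a hypothetical path such as /usr/local/bin/restore_file.sh, it could be configured as restore_command = '/usr/local/bin/restore_file.sh %f %p'. The design choice is to spend disk space on keeping the archive complete rather than depend on each file being requested only once.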