The following bug has been logged on the website:
Bug reference: 17577
Logged by: Daniel Farina
Email address: daniel@fdr.io
PostgreSQL version: 14.4
Operating system: AlmaLinux 8.6
Description:
Reproduction:
1) Take a base backup of a new, empty, freshly initdb'd server.
2) Run pgbench in mixed mode to generate WAL, possibly for a long
time.
3) Set up a replica from that backup, restoring WAL with restore_command.
(Side note: it may or may not be important, but in my case I also
have primary_conninfo defined and a standby.signal file. The
primary_conninfo is never used, however, because the server never
catches up far enough to start streaming before I run pg_ctl promote.)
4) Wait for the replica to reach consistency. This should take only a
short while, given that the backup is of an empty database.
5) Run pg_ctl promote while the server is still restoring WAL from the
archive.
6) pg_ctl promote blocks until it times out, and the server does not
promote until restore_command exits abnormally.
Other notes:
Upon running pg_ctl promote, the "promote" file is written, but the
"server has received promote request" message is not written to the
logs.
A workaround:
Killing the restore_command, i.e. injecting a non-zero exit code, will
cause postgres to print the "has received promote request" message and
go through promotion.
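
One way to inject that non-zero exit code without killing a process by
hand would be to wrap restore_command in a small script that refuses to
restore once a flag file exists. The following is only a minimal sketch
under stated assumptions: the archive directory, the flag file path, and
the script itself are hypothetical and are not part of the setup
described in this report.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command wrapper that fails on demand via a flag file."""
import os
import shutil
import sys

# Assumed locations, for illustration only; adjust to the real setup.
ARCHIVE_DIR = "/mnt/wal_archive"
STOP_FLAG = "/tmp/stop_restore"


def restore(wal_name, dest_path, archive_dir=ARCHIVE_DIR, stop_flag=STOP_FLAG):
    """Return the exit code that restore_command should report to postgres."""
    if os.path.exists(stop_flag):
        # Flag present: report failure so postgres takes its
        # non-zero-exit path instead of asking for the next segment.
        return 1
    src = os.path.join(archive_dir, wal_name)
    if not os.path.isfile(src):
        # Segment not in the archive: restore_command must fail here too.
        return 1
    shutil.copyfile(src, dest_path)
    return 0


# Intended to be invoked as: restore_command = '/path/to/wrapper.py %f %p'
# (%f is the WAL file name, %p the destination path).
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(restore(sys.argv[1], sys.argv[2]))
```

With such a wrapper in place, creating the flag file after running
pg_ctl promote would stand in for killing restore_command. Going by the
behaviour described above, that should be enough for postgres to notice
the promote request.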
Probable cause: the promote request is not checked as often as it
should be while WAL is being sourced from restore_command. It does
get checked when postgres goes through its expected actions after
restore_command returns a non-zero exit code, e.g. checking whether
it should switch to streaming.
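
The suspected behaviour can be illustrated with a toy model. The sketch
below is not PostgreSQL's recovery code. It is a hypothetical Python
simulation, with invented names, of a loop that consults the promote
request only after restore_command fails, which is the pattern the
symptoms above seem to suggest.

```python
def replay_until_promoted(restore_results, promote_requested_at):
    """Toy model of the hypothesis, not PostgreSQL's actual recovery loop.

    restore_results: iterable of bools, True meaning restore_command exited 0.
    promote_requested_at: index of the iteration at which the promote
        request arrives (a stand-in for running pg_ctl promote).
    Returns the iteration at which the request is noticed, or None if never.
    """
    for i, restored_ok in enumerate(restore_results):
        if restored_ok:
            # Segment restored: replay it and ask for the next one,
            # without looking at the promote request.
            continue
        # restore_command failed: only on this path is the request checked.
        if i >= promote_requested_at:
            return i
    return None
```

In this model, replay_until_promoted([True, True, True, False],
promote_requested_at=1) returns 3: the request arrives at iteration 1
but is only noticed once a restore fails. If every restore succeeds, the
function returns None, which matches pg_ctl promote timing out.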