Re: Unexpected Standby Shutdown on sync_replication_slots change

From shveta malik
Subject Re: Unexpected Standby Shutdown on sync_replication_slots change
Date
Msg-id CAJpy0uBY7_V8fdA6X2Ajq3zaEgSp8wyUVqoVGM-bhoBUoDt5dw@mail.gmail.com
In response to Re: Unexpected Standby Shutdown on sync_replication_slots change  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Unexpected Standby Shutdown on sync_replication_slots change
List pgsql-bugs
On Fri, Jul 25, 2025 at 12:20 AM Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Fri, Jul 25, 2025 at 12:55 AM Fujii Masao <masao.fujii@gmail.com> wrote:
> >
> > On Thu, Jul 24, 2025 at 10:54 PM Hugo DUBOIS <hdubois@scaleway.com> wrote:
> > >
> > > Hello,
> > >
> > > I'm not sure if it's a bug but I've encountered an unexpected behavior when dynamically changing the
> > > sync_replication_slots parameter on a PostgreSQL 17 standby server. Instead of logging an error and continuing to run,
> > > the standby instance shuts down with a FATAL error, which is not the anticipated behavior for a dynamic parameter
> > > change, especially when the documentation doesn't indicate such an outcome.
> > >
> > > Steps to Reproduce
> > >
> > > Set up a physical replication between two PostgreSQL 17.5 instances.
> > >
> > > Ensure wal_level on the primary (and consequently on the standby) is set to replica.
> > >
> > > Start both the primary and standby instances, confirming replication is active.
> > >
> > > On the standby instance, dynamically change the sync_replication_slots parameter (I have run the following query:
> > > ALTER SYSTEM SET sync_replication_slots = 'on'; followed by SELECT pg_reload_conf();)
> > >
> > > Expected Behavior
> > >
> > > I expected the standby instance to continue running and log an error message (similar to how hot_standby_feedback
> > > behaves when not enabled, e.g., a loop of LOG: replication slot synchronization requires "hot_standby_feedback" to be
> > > enabled). A FATAL error leading to an unexpected shutdown for a dynamic parameter change on a running standby is not the
> > > anticipated behavior. The documentation for sync_replication_slots also doesn't indicate that a misconfiguration or
> > > incompatible wal_level would lead to a shutdown.
> > >
> > > Actual Behavior
> > >
> > > Upon attempting to set sync_replication_slots to on on the standby with wal_level set to replica, the standby
> > > instance immediately shuts down with the following log messages:
> > >
> > > LOG:  database system is ready to accept read-only connections
> > > LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
> > > LOG:  received SIGHUP, reloading configuration files
> > > LOG:  parameter "sync_replication_slots" changed to "on"
> > > FATAL:  replication slot synchronization requires "wal_level" >= "logical"
> > >
> > > Environment
> > >
> > > PostgreSQL Version: 17.5
> >
> > Thanks for the report!
> >
> > I was able to reproduce the issue even on the latest master (v19dev).
> > I agree that the current behavior—where changing a GUC parameter can
> > cause the server to shut down—is unexpected and should be avoided.
> >
> > From what I’ve seen in the code, the problem stems from postmaster
> > calling ValidateSlotSyncParams() before starting the slot sync worker.
> > That function raises an ERROR if wal_level is not logical while
> > sync_replication_slots is enabled. Since ERROR is treated as FATAL
> > in postmaster, it causes the server to exit.
> >
> > To fix this, we could modify ValidateSlotSyncParams() so it doesn’t
> > raise an ERROR in this case, as follows.
> >
> >  ValidateSlotSyncParams(int elevel)
> >  {
> >   /*
> >   * Logical slot sync/creation requires wal_level >= logical.
> > - *
> > - * Since altering the wal_level requires a server restart, so error out in
> > - * this case regardless of elevel provided by caller.
> >   */
> >   if (wal_level < WAL_LEVEL_LOGICAL)
> > - ereport(ERROR,
> > + {
> > + ereport(elevel,
> >   errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >   errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
> > + return false;
> > + }
>
> I've created a patch to implement the above—attached.

Thank you for the patch.

> Note that this patch does not change the existing behavior when
> the misconfiguration (sync_replication_slots enabled but wal_level not
> set to logical) is detected at server startup. In that case, the server
> still shuts down with a FATAL error, which is consistent with other
> settings like summarize_wal.
>

Validated the behaviour; the patch looks good to me.

thanks
Shveta
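
For readers following the thread, the shape of the proposed change can be sketched as a small standalone C fragment. This is not the attached patch and does not use PostgreSQL headers: every name prefixed with sketch_ is a hypothetical stand-in, sketch_report() only imitates ereport() by printing a line, and the counter merely records where a postmaster would have exited. The point it illustrates is the one discussed above: the check reports at the level the caller asks for and returns false, rather than always raising ERROR.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for PostgreSQL's severity and wal_level values. */
typedef enum { SKETCH_LOG, SKETCH_ERROR } SketchLevel;
typedef enum {
    SKETCH_WAL_MINIMAL,
    SKETCH_WAL_REPLICA,
    SKETCH_WAL_LOGICAL
} SketchWalLevel;

/* Counts reports that, in the postmaster, would have ended the process. */
int sketch_exit_count = 0;

/*
 * Stand-in for ereport(): prints the message with its severity. The real
 * ereport(ERROR) does not return to its caller; here we only count it.
 */
void sketch_report(SketchLevel level, const char *msg)
{
    fprintf(stderr, "%s:  %s\n", level == SKETCH_ERROR ? "ERROR" : "LOG", msg);
    if (level == SKETCH_ERROR)
        sketch_exit_count++;
}

/*
 * Patched shape of the check: honor the caller-supplied level and signal
 * failure through the return value instead of unconditionally erroring out.
 */
bool sketch_validate_slot_sync_params(SketchLevel elevel,
                                      SketchWalLevel wal_level)
{
    if (wal_level < SKETCH_WAL_LOGICAL)
    {
        sketch_report(elevel,
                      "replication slot synchronization requires "
                      "\"wal_level\" >= \"logical\"");
        return false;
    }
    return true;
}
```

With this shape, a caller that passes a non-fatal level (SKETCH_LOG here) gets a log line and a false result for wal_level = replica and can simply skip starting the worker, while a caller that still passes SKETCH_ERROR keeps the old stop-on-misconfiguration behaviour.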


