Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Дата
Msg-id CALj2ACWryGZ23dRZk3SMxMLgntOwM3fq+QD3Rbqr+-VOgfbTrw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Список pgsql-hackers
On Mon, Nov 13, 2023 at 5:43 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > But that's not quite compatible with what Alvaro [2] had written long
> > > > back ("... the only acquisitions that would log messages are those in
> > > > StartReplication and StartLogicalReplication.").
> > > >
> > > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > > called by more places than you care to log from.
> > >
> > > I refreshed my thoughts for this patch and I think it's enough if
> > > walsenders alone log messages when slots become active and inactive.
> > > How about something like the attached v11 patch?
> > >
> >
> > + * This function is currently used only in walsender.
> > + */
> > +void
> > +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
> >
> > BTW, is the reason for using it only in walsender is that it is a
> > background process and it is not very apparent whether the slot is
> > created, and for foreground processes, it is a bit clear when the
> > command is executed.
> >
> > Can you please tell me the use case of this additional message?
>
> Replication slot acquisitions and releases by backends say when
> running pg_replication_slot_advance or pg_logical_slot_get_changes or
> pg_drop_replication_slot or pg_create_{physical,
> logical}_replication_slot are transient unlike walsenders which
> comparatively hold slots for longer durations. Therefore, I've added
> them only for walsenders. These messages help to know the lifetime of
> a replication slot - one can know how long a streaming standby or
> logical subscriber is down, IOW, how long a replication slot is
> inactive in production. For instance, the time between released and
> acquired slots in the below messages is the inactive replication slot
> duration.
>
> 2023-11-13 11:06:34.338 UTC [470262] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 11:06:34.338 UTC [470262] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
> 2023-11-13 11:09:24.918 UTC [470262] LOG:  released physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
>
> > If so, the other alternative is to either use a
> > parameter to the existing function or directly use am_walsender flag
> > to distinguish when to print the message in acquire/release calls.
>
> Done that way. PSA v12.
>
> > A few other minor comments:
> > 1.
> > +        Causes each replication command and related activity to be logged in
> > +        the server log.
> >
> > Can we be bit more specific by changing to something like: "Causes
> > each replication command and slot acquisition/release to be logged in
> > the server log."
>
> Done.
>
> > 2.
> > + ereport(log_replication_commands ? LOG : DEBUG1,
> > + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
> >
> > It seems PID and process name is quite unlike what we print in other
> > similar messages. For example, see below messages when we start
> > replication via pub/sub :
> >
> > We can get the PID from the log line as for other logs and I don't see
> > the process name printed anywhere else.
>
> There was a comment upthread to have PID printed, but I agree to be
> consistent and changed the messages to be: acquired physical/logical
> replication slot "foo" and released physical/logical replication slot
> "foo".
>
> PSA v12 patch.

Compiler isn't happy with v12
https://cirrus-ci.com/task/5543061376204800?logs=gcc_warning#L405. PSA
v13 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Zhijie Hou (Fujitsu)"
Дата:
Сообщение: RE: Synchronizing slots from primary to standby
Следующее
От: "Kumar, Sachin"
Дата:
Сообщение: Re: pg_upgrade failing for 200+ million Large Objects