Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?
Дата
Msg-id 20220105.154341.410117452632618588.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
At Tue, 28 Dec 2021 12:28:07 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Wed, Dec 15, 2021 at 8:32 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > > Here's the patch that adds a LOG message whenever a replication slot
> > > becomes active and inactive. These logs will be extremely useful on
> > > production servers to debug and analyze inactive replication slot
> > > issues.
> > >
> > > Thoughts?
> >
> > If I create a replication slot, I saw the following lines in server log.
> >
> > [22054:client backend] LOG:  replication slot "s1" becomes active
> > [22054:client backend] DETAIL:  The process with PID 22054 acquired it.
> > [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> > [22054:client backend] LOG:  replication slot "s1" becomes inactive
> > [22054:client backend] DETAIL:  The process with PID 22054 released it.
> > [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> >
> > They are apparently too much as if they were DEBUG3 lines.  The
> > process PID shown is of the process the slot operations took place so
> > I think it conveys no information.  The STATEMENT lines are also noisy
> > for non-ERROR emssages.  Couldn't we hide that line?
> >
> > That is, how about making the log lines as simple as the follows?
> >
> > [17156:walsender]  LOG:  acquired replication slot "s1"
> > [17156:walsender]  LOG:  released replication slot "s1"
> 
> Thanks for taking a look at the patch. Here's what I've come up with:
> 
> for drops:
(two log lines per slot: acquire->drop)
> 
> for creates:
(two log lines per slot: create->release)
..

Theses are still needlessly verbose.  Even for those who want slot
activities to be logged are not interested in this detail.  This is
still debug logs in that sense.

> The amount of logs may seem noisy, but they do help a lot given the
> fact that the server generates much more noise, for instance, the
> server logs the syntax error statements. And, the replication slots
> don't get created/dropped every now and then (at max, the
> pg_basebackup if at all used and configured to take the backups, say,
> every 24hrs or so). With the above set of logs debugging for inactive

In a nearby thread, there was a discussion that checkpoint logs are
too noisy to defaultly turn on.  Finally it is committed but I don't
think this is committed as is as it is more verbose (IMV) than the
checkpoint logs. Thus this logs need to be muteable.  Meanwhile I
don't think we willingly add a new knob for this feature. I think we
can piggy-back on log_replication_commands for the purpose, changing
its meaning slightly to "log replication commands and related
activity".

> replication slots becomes easier. One can find the root cause and
> answer questions like "why there was a huge WAL file growth at some
> point or when did a replication slot become inactive or how much time
> a replication slot was inactive or when did a standby disconnected and
> connected again or when did a pg_receivewal or pg_recvlogical
> connected and disconnected so on.".

I don't deny it is useful in that cases.  If you are fine that the
logs are debug-only, making them DEBUG1 would work.  But I don't think
you are fine with that since I think you are going to turn on them on
production systems.

> Here's the v2 patch. Please provide your thoughts.

Thanks!  I have three comments on this version.

- I still think "acquire/release" logs on creation/dropping should be
  silenced.  Adding the third parameter to ReplicationSlotAcquire()
  that can mute the acquiring (and as well as the corresponding
  releasing) log will work.

- Need to mute the logs by log_replication_commands.  (We could add
  another choice for the variable for this purpose but I think we
  don't need it.)

- The messages are not translatable as the variable parts are
  adjectives. They need to consist of static sentences.  The
  combinations of the two properties are 6 (note that persistence is
  tristate) but I don't think we accept that complexity for the
  information.  So I recommend to just remove the variable parts from
  the messages.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Shinya Kato
Дата:
Сообщение: Re: [Proposal] Add foreign-server health checks infrastructure
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: row filtering for logical replication