Обсуждение: Logging replication state changes

Поиск
Список
Период
Сортировка

Logging replication state changes

От
SATYANARAYANA NARLAPURAM
Дата:
Hi hackers,

I noticed that below critical replication state changes are DEBUG1 level logged. Any concern about changing the below two messages log level to LOG? If this is too verbose, we can introduce a new GUC, log_replication_state_changes that logs the replication state changes when enabled irrespective of the log level. 

1/ 

/*
 * If we're in catchup state, move to streaming.  This is an
 * important state change for users to know about, since before
 * this point data loss might occur if the primary dies and we
 * need to failover to the standby. The state change is also
 * important for synchronous replication, since commits that
 * started to wait at that point might wait for some time.
 */
if (MyWalSnd->state == WALSNDSTATE_CATCHUP)
{
ereport(DEBUG1,
(errmsg_internal("\"%s\" has now caught up with upstream server",
application_name)));
WalSndSetState(WALSNDSTATE_STREAMING);
}

2/

ereport(DEBUG1,
(errmsg_internal("standby \"%s\" now has synchronous standby priority %u",
application_name, priority)));


Thanks,
Satya

Re: Logging replication state changes

От
Tom Lane
Дата:
SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
> I noticed that below critical replication state changes are DEBUG1 level
> logged. Any concern about changing the below two messages log level to LOG?

Why?  These seem like perfectly routine messages.

            regards, tom lane



Re: Logging replication state changes

От
SATYANARAYANA NARLAPURAM
Дата:


On Wed, Dec 29, 2021 at 2:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
> I noticed that below critical replication state changes are DEBUG1 level
> logged. Any concern about changing the below two messages log level to LOG?

Why?  These seem like perfectly routine messages.

Consider a scenario where we have a primary and two sync standby (s1 and s2) where s1 is a preferred failover target and s2 is next with synchronous_standby_names = 'First 1 ('s1','s2')'.  In an event, s1 streaming replication is broken and reestablished because of a planned or an unplanned event then s2 participates in the sync commits and makes sure the writes are not stalled on the primary. I would like to know the time window where s1 is not actively acknowledging the commits and the writes are dependent on s2. Also if the service layer decides to failover to s2 instead of s1 because s1 is lagging I need evidence in the log to explain the behavior. 



                        regards, tom lane

Re: Logging replication state changes

От
Amit Kapila
Дата:
On Thu, Dec 30, 2021 at 4:18 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:
>
> On Wed, Dec 29, 2021 at 2:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
>> > I noticed that below critical replication state changes are DEBUG1 level
>> > logged. Any concern about changing the below two messages log level to LOG?
>>
>> Why?  These seem like perfectly routine messages.
>
>
> Consider a scenario where we have a primary and two sync standby (s1 and s2) where s1 is a preferred failover target
ands2 is next with synchronous_standby_names = 'First 1 ('s1','s2')'.  In an event, s1 streaming replication is broken
andreestablished because of a planned or an unplanned event then s2 participates in the sync commits and makes sure the
writesare not stalled on the primary. I would like to know the time window where s1 is not actively acknowledging the
commitsand the writes are dependent on s2. Also if the service layer decides to failover to s2 instead of s1 because s1
islagging I need evidence in the log to explain the behavior. 
>

Isn't it better to get this information via pg_stat_replication view
(via state and sync_priority) columns?

--
With Regards,
Amit Kapila.



Re: Logging replication state changes

От
SATYANARAYANA NARLAPURAM
Дата:


On Sat, Jan 8, 2022 at 3:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Dec 30, 2021 at 4:18 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:
>
> On Wed, Dec 29, 2021 at 2:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> writes:
>> > I noticed that below critical replication state changes are DEBUG1 level
>> > logged. Any concern about changing the below two messages log level to LOG?
>>
>> Why?  These seem like perfectly routine messages.
>
>
> Consider a scenario where we have a primary and two sync standby (s1 and s2) where s1 is a preferred failover target and s2 is next with synchronous_standby_names = 'First 1 ('s1','s2')'.  In an event, s1 streaming replication is broken and reestablished because of a planned or an unplanned event then s2 participates in the sync commits and makes sure the writes are not stalled on the primary. I would like to know the time window where s1 is not actively acknowledging the commits and the writes are dependent on s2. Also if the service layer decides to failover to s2 instead of s1 because s1 is lagging I need evidence in the log to explain the behavior.
>

Isn't it better to get this information via pg_stat_replication view
(via state and sync_priority) columns?

We need the historical information to analyze and root cause in addition to the live debugging. It would be good to have better control over replication messages.
 

--
With Regards,
Amit Kapila.

Re: Logging replication state changes

От
"Euler Taveira"
Дата:
On Sat, Jan 8, 2022, at 6:29 PM, SATYANARAYANA NARLAPURAM wrote:
We need the historical information to analyze and root cause in addition to the live debugging. It would be good to have better control over replication messages. 
I think the answer to this demand is not to change the message level but to
implement a per-module log_min_messages. This idea is in the TODO [1] for more
than a decade. Check the archives.

I agree with Tom that the referred messages are noisy, hence, DEBUG1 is fine
for it.



--
Euler Taveira