Re: standby apply lag on inactive servers

From: Alvaro Herrera
Subject: Re: standby apply lag on inactive servers
Msg-id: 20200131144757.GA3354@alvherre.pgsql
In response to: Re: standby apply lag on inactive servers  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses: Re: standby apply lag on inactive servers
List: pgsql-hackers
On 2020-Jan-31, Fujii Masao wrote:
> On 2020/01/31 22:40, Alvaro Herrera wrote:
> > On 2020-Jan-31, Fujii Masao wrote:
> > 
> > > You're thinking to apply this change to the back branches? Sorry
> > > if my understanding is not right. But I don't think that back-patch
> > > is ok because it changes the documented existing behavior
> > > of pg_last_xact_replay_timestamp(). So it looks like the behavior
> > > change not a bug fix.
> > 
> > Yeah, I am thinking in backpatching it.  The documented behavior is
> > already not what the code does.
> 
> Maybe you thought this because getRecordTimestamp() extracts the
> timestamp from even WAL record of a restore point? That is, you're
> concerned about that pg_last_xact_replay_timestamp() returns the
> timestamp of not only commit/abort record but also restore point one.
> Right?

right.

> As far as I read the code, this problem doesn't occur because
> SetLatestXTime() is called only for commit/abort records, in
> recoveryStopsAfter(). No?

... uh, wow, you're right about that too.  IMO this is extremely
fragile, easy to break, and under-documented.  But you're right, there's
no bug there at present.
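
To make the point above concrete, here is a toy Python model (not
PostgreSQL source) of the behavior described: timestamp extraction
accepts restore-point records, but the "last xtime" is only updated on
the commit/abort path.  The record kinds and helper names are
illustrative assumptions; only the gating logic mirrors what the thread
says about getRecordTimestamp() and SetLatestXTime().

```python
from datetime import datetime, timedelta

# Toy model only: record kinds from which a timestamp can be extracted
# (stands in for what the thread says getRecordTimestamp() accepts).
TIMESTAMPED = {"commit", "abort", "restore_point"}
# Record kinds that are allowed to advance the "last xtime"
# (stands in for the commit/abort-only call to SetLatestXTime()).
XACT_END = {"commit", "abort"}


def get_record_timestamp(kind, ts):
    """Return the record's timestamp, or None if the kind carries none."""
    return ts if kind in TIMESTAMPED else None


def replay(records):
    """Replay (kind, timestamp) records and return the final last xtime."""
    last_xtime = None
    for kind, ts in records:
        # The guard on the caller's side is what keeps restore points out.
        if kind in XACT_END:
            rec_ts = get_record_timestamp(kind, ts)
            if rec_ts is not None:
                last_xtime = rec_ts
    return last_xtime


t0 = datetime(2020, 1, 31, 12, 0, 0)
records = [
    ("commit", t0),
    ("restore_point", t0 + timedelta(minutes=5)),
    ("checkpoint", t0 + timedelta(minutes=10)),
]
print(replay(records))  # 2020-01-31 12:00:00
```

The restore point does not leak into the result, but only because of
the guard in the caller, which is why the arrangement is easy to break.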

> >  Do you have a situation where this
> > change would break something?  If so, can you please explain what it is?
> 
> For example, use the return value of pg_last_xact_replay_timestamp()
> (and also the timestamp in the log message output at the end of
> recovery) as a HINT when setting recovery_target_time later.

Hmm.

I'm not sure how you would use it in that way.  I mean, I understand how
it *can* be used that way, but it seems too fragile to be done in
practice, in a scenario that's not just laboratory games.

> Use it to compare with the timestamp retrieved from the master server,
> in order to monitor the replication delay.

That's precisely the use case that I'm aiming at.  The timestamp is
currently not useful for this, because it stops advancing when the
primary is inactive (no COMMIT records occur), so the computed delay
keeps growing even though the standby has nothing left to apply.  With
the change, CHECKPOINT records would keep the "last xtime" current
during such periods of inactivity.  This has actually happened in a
production setting; it's not a thought experiment.
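
A small Python sketch of the monitoring problem, under stated
assumptions: the record stream, the five-minute checkpoint interval and
the function names are made up for illustration, and the "proposed"
variant only encodes the idea that checkpoint records also advance the
last xtime, not the details of any actual patch.

```python
from datetime import datetime, timedelta


def last_xtime(records, advance_on):
    """Return the timestamp of the last record whose kind may advance xtime."""
    last = None
    for kind, ts in records:
        if kind in advance_on:
            last = ts
    return last


def apparent_lag(primary_now, last_replay_ts):
    """Replication delay as a monitor would compute it from two timestamps."""
    return primary_now - last_replay_ts


t0 = datetime(2020, 1, 31, 12, 0, 0)
# One commit, then an idle hour with a checkpoint every 5 minutes (assumed).
records = [("commit", t0)]
records += [("checkpoint", t0 + timedelta(minutes=5 * i)) for i in range(1, 13)]
primary_now = t0 + timedelta(minutes=62)

current = last_xtime(records, {"commit", "abort"})
proposed = last_xtime(records, {"commit", "abort", "checkpoint"})

print(apparent_lag(primary_now, current))   # 1:02:00
print(apparent_lag(primary_now, proposed))  # 0:02:00
```

In both cases the standby has replayed everything it received; only the
second number reflects that.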

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


