Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.
Дата
Msg-id 40f5df69-99a2-5f7c-5432-6f85fa15e9c5@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы RE: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.  ("higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com>)
Список pgsql-hackers

On 2020/06/30 9:14, Kyotaro Horiguchi wrote:
> Opps! I misunderstood that.
> 
> At Mon, 29 Jun 2020 13:00:25 +0000, "higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com> wrote in
>> Fujii-san, thank you for comments.
>>
>>> The cause of this problem is that the checkpointer's sleep time is calculated
>> >from both checkpoint_timeout and archive_timeout during normal running,
>>> but calculated only from checkpoint_timeout during recovery. So Daisuke-san's
>>> patch tries to change that so that it's calculated from both of them even
>>> during recovery. No?
>>
>> Yes, it's exactly so.
>>
>>> last_xlog_switch_time is not updated during recovery. So "elapsed_secs" can be
>>> large and cur_timeout can be negative. Isn't this problematic?
>>
>> Yes... My patch was missing this.
> 
> The patch also makes WaitLatch called with zero timeout, which causes
> assertion failure.
> 
>> How about using the original archive_timeout value for calculating cur_timeout during recovery?
>>
>>                  if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
>>                  {
>>                          elapsed_secs = now - last_xlog_switch_time;
>>                          if (elapsed_secs >= XLogArchiveTimeout)
>>                                  continue;               /* no sleep for us ... */
>>                          cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
>>                  }
>> +               else if (XLogArchiveTimeout > 0)
>> +                       cur_timeout = Min(cur_timeout, XLogArchiveTimeout);
>>
>> During recovery, accurate cur_timeout is not calculated because elapsed_secs is not used.

Yes, that's an idea. But I'm a bit concerned about that this change makes
checkpointer wake up more frequently than necessary during recovery.
Which may increase the idle power consumption of checkpointer during
recovery. Of course, this would not be so problematic in practice because
we can expect that archive_timeout is not so small. But it seems better to
avoid unncessary wake-ups if we can easily do that.

>> However, after recovery is complete, WAL archiving will start by the next archive_timeout is reached.
>> I felt it is enough to solve this problem.
> 
> That causes unwanted change of cur_timeout during recovery.
> 
>>> As another approach, what about waking the checkpointer up at the end of
>>> recovery like we already do for walsenders?
> 
> We don't want change checkpoint interval during recovery, that means
> we cannot cnosider archive_timeout at the fist checkpointer after
> recovery ends. So I think that the suggestion from Fujii-san is the
> direction.

+1
If this idea has some problems, we can revisit Daisuke-san's idea.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works.
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Resetting spilled txn statistics in pg_stat_replication