Re: Crash by targetted recovery

From Kyotaro Horiguchi
Subject Re: Crash by targetted recovery
Date
Msg-id 20200228.121318.27427858650486112.horikyota.ntt@gmail.com
In response to Re: Crash by targetted recovery  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Crash by targetted recovery  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
At Thu, 27 Feb 2020 20:04:41 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2020/02/27 17:05, Kyotaro Horiguchi wrote:
> > Thank you for the comment.
> > At Thu, 27 Feb 2020 16:23:44 +0900, Fujii Masao
> > <masao.fujii@oss.nttdata.com> wrote in
> >> On 2020/02/27 15:23, Kyotaro Horiguchi wrote:
> >>>> I failed to understand why random access while reading from
> >>>> stream is a bad idea. Could you elaborate why?
> >>> It seems to me the word "streaming" suggests that WAL record should be
> >>> read sequentially. Random access, which means reading from arbitrary
> >>> location, breaks a stream.  (But the patch doesn't try to stop wal
> >>> sender if randAccess.)
> >>>
> >>>> Isn't it sufficient to set currentSource to 0 when disabling
> >>>> StandbyMode?
> >>> I thought of that and it should work, but I hesitated to manipulate
> >>> currentSource in StartupXLOG. currentSource is basically a private
> >>> state of WaitForWALToBecomeAvailable. ReadRecord modifies it but I
> >>> think it's not good to modify it outside the logic in
> >>> WaitForWALToBecomeAvailable.
> >>
> >> If so, what about adding the following at the top of
> >> WaitForWALToBecomeAvailable()?
> >>
> >>      if (!StandbyMode && currentSource == XLOG_FROM_STREAM)
> >>           currentSource = 0;
> > It works virtually the same way. I'm happy to do that if you don't
> > agree to using randAccess. But I'd rather do that in the 'if
> > (!InArchiveRecovery)' section.
> 
> The approach using randAccess seems unsafe. Please imagine
> the case where currentSource is changed to XLOG_FROM_ARCHIVE
> because randAccess is true, while walreceiver is still running.
> For example, this case can occur when the record at REDO
> starting point is fetched with randAccess = true after walreceiver
> is invoked to fetch the last checkpoint record. The situation
> "currentSource != XLOG_FROM_STREAM while walreceiver is
>  running" seems invalid. No?

When I mentioned the possibility of changing ReadRecord so that it
modifies randAccess instead of currentSource, I thought that
WaitForWALToBecomeAvailable should shut down the WAL receiver as
needed.

At Thu, 27 Feb 2020 15:23:07 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
me> location, breaks a stream.  (But the patch doesn't try to stop wal
me> sender if randAccess.)

And random access during StandbyMode usually (always?) lets RecPtr go
back.  I'm not sure WaitForWALToBecomeAvailable works correctly if we
don't have a file in pg_wal and the REDO point is more than a segment
behind the initial checkpoint record.  (It seems to cause an
assertion failure, but I haven't checked that.)

If we go back to XLOG_FROM_ARCHIVE by random access, it correctly
re-connects to the primary for the past segment.
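
For reference, the guard proposed upthread can be modelled in
isolation.  This is only a minimal sketch, not the actual xlog.c code:
the enum values, the SOURCE_UNSET name (standing in for the literal 0),
the RecoveryState struct and the function name are all assumptions made
for illustration.  Only StandbyMode, currentSource, XLOG_FROM_ARCHIVE
and XLOG_FROM_STREAM come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the WAL source constants; values are assumed. */
typedef enum
{
    SOURCE_UNSET = 0,           /* hypothetical name for the "0" upthread */
    XLOG_FROM_ARCHIVE,
    XLOG_FROM_STREAM
} WalSource;

/* Hypothetical container for the two variables the guard looks at. */
typedef struct
{
    bool        StandbyMode;
    WalSource   currentSource;
} RecoveryState;

/*
 * Model of the proposed guard: once standby mode has been turned off,
 * a leftover XLOG_FROM_STREAM source is reset so that the next read
 * chooses its source from scratch.  Any other combination is untouched.
 */
void
reset_stale_stream_source(RecoveryState *state)
{
    assert(state != NULL);

    if (!state->StandbyMode && state->currentSource == XLOG_FROM_STREAM)
        state->currentSource = SOURCE_UNSET;
}
```

In this model the reset only fires for the combination "standby mode
off, source still streaming".  It says nothing about whether the WAL
receiver is still running, which is the separate concern raised above.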

> So I think that the approach that I proposed is better.

It depends on how far we assume RecPtr goes back.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


