Re: corruption of WAL page header is never reported

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: corruption of WAL page header is never reported
Дата
Msg-id 20210902.131716.1067877126401208860.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: corruption of WAL page header is never reported  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Ответы Re: corruption of WAL page header is never reported  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: corruption of WAL page header is never reported  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/07/19 18:52, Yugo NAGATA wrote:
> > Well, I think my first patch was not wrong. The difference with the
> > latest
> > patch is just whether we perform the additional check when we are not
> > in
> > standby mode or not. The behavior is basically the same although which
> > function
> > detects and prints the page-header error in cases of crash recovery is
> > different.
> 
> Yes, so which patch do you think is better? I like your version
> because there seems no reason why XLogPageRead() should skip
> XLogReaderValidatePageHeader() when not in standby mode.

Did you read the comment just above?

xlog.c:12523
>     * Check the page header immediately, so that we can retry immediately if
>     * it's not valid. This may seem unnecessary, because XLogReadRecord()
>     * validates the page header anyway, and would propagate the failure up to
>     * ReadRecord(), which would retry. However, there's a corner case with
>     * continuation records, if a record is split across two pages such that

So when not in standby mode, the same check is performed by xlogreader
which has the responsibility to validate the binary data read by
XLogPageRead. The page-header validation is a compromise to save a
specific case.

> Also I'm tempted to move ereport() and reset of errmsg_buf to
> under next_record_is_invalid as follows. That is, in standby mode
> whenever we find an invalid record and retry reading WAL page
> in XLogPageRead(), we report the error message and reset it.
> For now in XLogPageRead(), only XLogReaderValidatePageHeader()
> sets errmsg_buf, but in the future other code or function doing that
> may be added. For that case, the following change seems more elegant.
> Thought?

I don't think it is good choice to conflate read-failure and header
validation failure from the view of modularity.

>      * shouldn't be a big deal from a performance point of view.
>       */
>      if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
> -    {
> -        /* reset any error XLogReaderValidatePageHeader() might have set */
> -        xlogreader->errormsg_buf[0] = '\0';
>          goto next_record_is_invalid;
> -    }
>        return readLen;
>  @@ -12517,7 +12513,17 @@ next_record_is_invalid:
>        /* In standby-mode, keep trying */
>      if (StandbyMode)
> +    {
> +        if (xlogreader->errormsg_buf[0])
> +        {
> +            ereport(emode_for_corrupt_record(emode, EndRecPtr),
> + (errmsg_internal("%s", xlogreader->errormsg_buf)));
> +
> + /* reset any error XLogReaderValidatePageHeader() might have set */
> +            xlogreader->errormsg_buf[0] = '\0';
> +        }
>          goto retry;
> +    }
>      else
>          return -1;
>  }

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: row filtering for logical replication
Следующее
От: Peter Smith
Дата:
Сообщение: Re: row filtering for logical replication