Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

From Kyotaro Horiguchi
Subject Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures
Date
Msg-id 20220927.120125.579639936942345624.horikyota.ntt@gmail.com
In reply to Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
At Tue, 20 Sep 2022 17:40:36 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> On Tue, Sep 20, 2022 at 12:57 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2022-Sep-19, Bharath Rupireddy wrote:
> >
> > > We have a bunch of messages [1] that have an offset, but not LSN in
> > > the error message. Firstly, is there an easiest way to figure out LSN
> > > from offset reported in the error messages? If not, is adding LSN to
> > > these messages along with offset a good idea? Of course, we can't just
> > > convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> > > something meaningful like reporting the LSN of the page that we are
> > > reading-in or writing-out etc.
> >
> > Maybe add errcontext() somewhere that reports the LSN would be
> > appropriate.  For example, the page_read() callbacks have the LSN
> > readily available, so the ones in backend could install the errcontext
> > callback; or perhaps ReadPageInternal can do it #ifndef FRONTEND.  Not
> > sure what is best of those options, but either of those sounds better
> > than sticking the LSN in a lower-level routine that doesn't necessarily
> > have the info already.
> 
> All of the error messages [1] have the LSN from which offset was
> calculated, I think we can just append that to the error messages
> (something like ".... offset %u, LSN %X/%X: %m") and not complicate
> it. Thoughts?

If every error-emitting site knows the LSN, we don't need the context
message. But *I* would prefer that the additional text read something
like "while reading record at LSN %X/%X", or a slightly shorter version
of it, because targetRecPtr is the start of the record currently being
read, not the LSN that corresponds to the segment and offset. It may
point into an earlier segment.
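
To illustrate the distinction, here is a minimal standalone sketch. It
is not PostgreSQL code: WalPtr, seg_offset_to_lsn(), lsn_to_segno() and
format_positions() are hypothetical names, the 16MB segment size used in
the note below is an assumption, and the message wording is only
illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a 64-bit WAL byte position (hypothetical type name). */
typedef uint64_t WalPtr;

/* Hypothetical helper: LSN of the byte at `offset` within segment `segno`. */
WalPtr
seg_offset_to_lsn(uint64_t segno, uint32_t offset, uint32_t seg_size)
{
    assert(offset < seg_size);
    return (WalPtr) segno * seg_size + offset;
}

/* Hypothetical helper: number of the segment that contains `lsn`. */
uint64_t
lsn_to_segno(WalPtr lsn, uint32_t seg_size)
{
    return lsn / seg_size;
}

/*
 * Hypothetical helper: print both positions in the usual %X/%X style.
 * The first is where the failing read happened; the second is the
 * start of the record being read, which may be in an earlier segment.
 */
int
format_positions(char *buf, size_t buflen, uint64_t segno, uint32_t offset,
                 uint32_t seg_size, WalPtr record_start)
{
    WalPtr      read_lsn = seg_offset_to_lsn(segno, offset, seg_size);

    return snprintf(buf, buflen,
                    "segment %llu, offset %u is LSN %X/%X, "
                    "while reading record at LSN %X/%X",
                    (unsigned long long) segno, (unsigned int) offset,
                    (unsigned int) (read_lsn >> 32),
                    (unsigned int) read_lsn,
                    (unsigned int) (record_start >> 32),
                    (unsigned int) record_start);
}
```

With an assumed 16MB segment size, segment 2 at offset 0x2000 maps to
LSN 0/2002000, while a record that started at 0/1FFFFE0 belongs to
segment 1. That is why the record's LSN should be worded as "while
reading record at", not presented as the LSN of the segment and offset.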


> [1]
> errmsg("could not read from WAL segment %s, offset %u: %m",
> errmsg("could not read from WAL segment %s, offset %u: %m",
> errmsg("could not write to log file %s "
>        "at offset %u, length %zu: %m",
> errmsg("unexpected timeline ID %u in WAL segment %s, offset %u",
> errmsg("could not read from WAL segment %s, offset %u: read %d of %zu",
> pg_log_error("received write-ahead log record for offset %u with no file open",
> "invalid magic number %04X in WAL segment %s, offset %u",
> "invalid info bits %04X in WAL segment %s, offset %u",
> "invalid info bits %04X in WAL segment %s, offset %u",
> "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
> "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


