Re: detailed error message of pg_waldump

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: detailed error message of pg_waldump
Дата
Msg-id 20210616.173558.1189768841998532858.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: detailed error message of pg_waldump  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: detailed error message of pg_waldump  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
Thanks!

At Wed, 16 Jun 2021 16:52:11 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in 
> On Fri, Jun 4, 2021 at 5:35 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> >
> > In a very common operation of accidentally specifying a recycled
> > segment, pg_waldump often returns the following obscure message.
> >
> > $ pg_waldump 00000001000000000000002D
> > pg_waldump: fatal: could not find a valid record after 0/2D000000
> >
> > The more detailed message is generated internally and we can use it.
> > That looks like the following.
> >
> > $ pg_waldump 00000001000000000000002D
> > pg_waldump: fatal: unexpected pageaddr 0/24000000 in log segment 00000001000000000000002D, offset 0
> >
> > Is it work doing?
> 
> Perhaps we need both? The current message describes where the error
> happened and the message internally generated describes the details.
> It seems to me that both are useful. For example, if we find an error
> during XLogReadRecord(), we show both as follows:
> 
>    if (errormsg)
>        fatal_error("error in WAL record at %X/%X: %s",
>                    LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
>                    errormsg);

Yeah, I thought that it might be a bit vervous and lengty but actually
we have another place where doing that. One more point is whether we
have a case where first_record is invalid but errormsg is NULL
there. WALDumpReadPage immediately exits so we should always have a
message in that case according to the comment in ReadRecord.

> * We only end up here without a message when XLogPageRead()
> * failed - in that case we already logged something. In
> * StandbyMode that only happens if we have been triggered, so we
> * shouldn't loop anymore in that case.

So that can be an assertion.

Now the messages looks like this.

$ pg_waldump /home/horiguti/data/data_work/pg_wal/000000020000000000000010 
pg_waldump: fatal: could not find a valid record after 0/0: unexpected pageaddr 0/9000000 in log segment
000000020000000000000010,offset 0
 

reagards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 42738eb940..5a98c523c6 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -932,17 +932,28 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
  *
  * This positions the reader, like XLogBeginRead(), so that the next call to
  * XLogReadRecord() will read the next valid record.
+ *
+ * If the page_read callback fails to read the requested data,
+ * InvalidXLogRecPtr is returned.  The callback is expected to have reported
+ * the error; errormsg is set to NULL.
+ *
+ * If the reading fails for some other reason, InvalidXLogRecPtr is also
+ * returned, and *errormsg is set to a string with details of the failure.
+ *
+ * The returned pointer (or *errormsg) points to an internal buffer that's
+ * valid until the next call to XLogFindNextRecord or XLogReadRecord.
  */
 XLogRecPtr
-XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr)
+XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 {
     XLogRecPtr    tmpRecPtr;
     XLogRecPtr    found = InvalidXLogRecPtr;
     XLogPageHeader header;
-    char       *errormsg;
 
     Assert(!XLogRecPtrIsInvalid(RecPtr));
 
+    *errormsg = NULL;
+
     /*
      * skip over potential continuation data, keeping in mind that it may span
      * multiple pages
@@ -1021,7 +1032,7 @@ XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr)
      * or we just jumped over the remaining data of a continuation.
      */
     XLogBeginRead(state, tmpRecPtr);
-    while (XLogReadRecord(state, &errormsg) != NULL)
+    while (XLogReadRecord(state, errormsg) != NULL)
     {
         /* past the record we've found, break out */
         if (RecPtr <= state->ReadRecPtr)
@@ -1036,6 +1047,9 @@ XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr)
 err:
     XLogReaderInvalReadState(state);
 
+    if (state->errormsg_buf[0] != '\0')
+        *errormsg = state->errormsg_buf;
+
     return InvalidXLogRecPtr;
 }
 
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f8b8afe4a7..0dae1ebea7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1051,11 +1051,17 @@ main(int argc, char **argv)
         fatal_error("out of memory");
 
     /* first find a valid recptr to start from */
-    first_record = XLogFindNextRecord(xlogreader_state, private.startptr);
+    first_record = XLogFindNextRecord(xlogreader_state, private.startptr,
+                                      &errormsg);
 
     if (first_record == InvalidXLogRecPtr)
-        fatal_error("could not find a valid record after %X/%X",
-                    LSN_FORMAT_ARGS(private.startptr));
+    {
+        /* XLogFindNextRecord must return errormsg on error */
+        Assert (errormsg);
+        fatal_error("could not find a valid record after %X/%X: %s",
+                    LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
+                    errormsg);
+    }
 
     /*
      * Display a message that we're skipping data if `from` wasn't a pointer
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 21d200d3df..0f6d920fed 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -267,7 +267,8 @@ extern void XLogReaderFree(XLogReaderState *state);
 /* Position the XLogReader to given record */
 extern void XLogBeginRead(XLogReaderState *state, XLogRecPtr RecPtr);
 #ifdef FRONTEND
-extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr);
+extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr,
+                                     char **errormsg);
 #endif                            /* FRONTEND */
 
 /* Read the next XLog record. Returns NULL on end-of-WAL or failure */

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Borodin
Дата:
Сообщение: Re: Different compression methods for FPI
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Signed vs. Unsigned (some)