Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.
Дата
Msg-id 20201015.173210.1595235318844221199.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Wrong statistics for size of XLOG_SWITCH during pg_waldump.  ("movead.li@highgo.ca" <movead.li@highgo.ca>)
Ответы Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-hackers
At Thu, 15 Oct 2020 12:56:02 +0800, "movead.li@highgo.ca" <movead.li@highgo.ca> wrote in 
> Thanks for all the suggestions.
> 
> >Yeah.  In its current shape, it means that only pg_waldump would be
> >able to know this information.  If you make this information part of
> >xlogdesc.c, any consumer of the WAL record descriptions would be able
> >to show this information, so it would provide a consistent output for
> >any kind of tools.
> I have change the implement, move some code into xlog_desc().

Andres suggested that we don't need that description with per-record
basis. Do you have a reason to do that?  (For clarity, I'm not
suggesting that you should reving it.)

> >On top of that, it seems to me that the calculation used in the patch
> >is wrong in two aspects at quick glance:
> >1) startSegNo and endSegNo point always to two different segments with
> >a XLOG_SWITCH record, so you should check that ReadRecPtr is not at a
> >segment border instead before extracting SizeOfXLogLongPHD, no?
> Yes you are right, and I use 'record->EndRecPtr - 1' instead.

+    XLByteToSeg(record->EndRecPtr - 1, endSegNo, record->segcxt.ws_segsize);

We use XLByteToPrevSeg instead for this purpose.

> >2) This stuff should also check after the case of a WAL *page* border
> >where you'd need to adjust based on SizeOfXLogShortPHD instead.
> The 'junk_len -= SizeOfXLogLongPHD' code is considered for when the
> remain size of a wal segment can not afford a XLogRecord struct for
> XLOG_SWITCH, it needn't care *page* border.
> 
> >I'm not sure the exact motive of this proposal, but if we show the
> >wasted length in the stats result, I think it should be other than
> >existing record types.
> Yes agree, and now it looks like below as new patch:

You forgot to add a correction for short headers.

> movead@movead-PC:/h2/pg/bin$ ./pg_waldump -p ../walbk/ -s 0/3000000 -e 0/6000000 -z
> Type                                           N      (%)          Record size      (%)             FPI size      (%)
      Combined size      (%)
 
> ----                                           -      ---          -----------      ---             --------      ---
      -------------      ---
 
> XLOG                                           5 ( 31.25)                  300 (  0.00)                    0 (  0.00)
                300 (  0.00)
 
> XLOGSwitchJunk                                 3 ( 18.75)             50330512 (100.00)                    0 (  0.00)
           50330512 (100.00)
 
> 
> 
> movead@movead-PC:/h2/pg/bin$ ./pg_waldump -p ../walbk/ -s 0/3000000 -e 0/6000000 --stat=record
> XLOG/SWITCH                                    3 ( 18.75)                   72 (  0.00)                    0 (  0.00)
                 72 (  0.00)
 
> XLOG/SWITCH_JUNK                               3 ( 18.75)             50330512 (100.00)                    0 (  0.00)
           50330512 (100.00)
 
> 
> The shortcoming now is I do not know how to handle the 'count' of SWITCH_JUNK
> in pg_waldump results. Currently I regard SWITCH_JUNK as one count.


+    if(RM_XLOG_ID == rmid && XLOG_SWITCH == info)

We need a comment for the special code path.
We don't follow this kind of convension. Rather use "variable =
constant".

+    {
+        junk_len = xlog_switch_junk_len(record);
+        stats->count_xlog_switch++;
+        stats->junk_size += junk_len;

junk_len is used only the last line above.  We don't need that
temporary variable.

+     * If the wal switch record spread on two segments, we should extra minus the

This comment is sticking out of 80-column border.  However, I'm not
sure if we have reached a conclustion about the target console-width.

+                info = (rj << 4) & ~XLR_INFO_MASK;
+                switch_junk_id = "XLOG/SWITCH_JUNK";
+                if( XLOG_SWITCH == info && stats->count_xlog_switch > 0)

This line order is strange. At least switch_junk_id is used only in
the if-then block.

I'm not confindent on which is better, but I feel that this is not a
work for display side, but for the recorder side like attached.

> >By the way, I noticed that --stats=record shows two lines for
> >Transaction/COMMIT. The cause is that XLogDumpCountRecord assumes the
> >all four bits of xl_info is used to identify record id.
> Thanks,I didn't notice it before, and your patch added into v3 patch attached.

Sorry for the confusion, but it would be a separate topic even if we
are going to fix that..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 3200f777f5..04042a50a4 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -142,6 +142,31 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
     }
 }
 
+uint32
+xlog_switch_junk_len(XLogReaderState *record)
+{
+    uint32         junk_len;
+    XLogSegNo    startSegNo;
+    XLogSegNo    endSegNo;
+
+    XLByteToSeg(record->ReadRecPtr, startSegNo, record->segcxt.ws_segsize);
+    XLByteToSeg(record->EndRecPtr - 1, endSegNo, record->segcxt.ws_segsize);
+
+    junk_len = record->EndRecPtr - record->ReadRecPtr - XLogRecGetTotalLen(record);
+    /*
+     * If the wal switch record spread on two segments, we should extra minus the
+     * long page head. I mean the case when the remain size of a wal segment can not
+     * afford a XLogRecord struct for XLOG_SWITCH.
+     */
+    if(startSegNo != endSegNo)
+        junk_len -= SizeOfXLogLongPHD;
+
+
+    Assert(junk_len >= 0);
+
+    return junk_len;
+}
+
 const char *
 xlog_identify(uint8 info)
 {
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 31e99c2a6d..ab4e7c830f 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -24,6 +24,7 @@
 #include "common/logging.h"
 #include "getopt_long.h"
 #include "rmgrdesc.h"
+#include "catalog/pg_control.h"
 
 static const char *progname;
 
@@ -66,8 +67,8 @@ typedef struct Stats
 typedef struct XLogDumpStats
 {
     uint64        count;
-    Stats        rmgr_stats[RM_NEXT_ID];
-    Stats        record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
+    Stats        rmgr_stats[RM_NEXT_ID + 1];
+    Stats        record_stats[RM_NEXT_ID + 1][MAX_XLINFO_TYPES];
 } XLogDumpStats;
 
 #define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
@@ -441,6 +442,22 @@ XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats,
     stats->record_stats[rmid][recid].count++;
     stats->record_stats[rmid][recid].rec_len += rec_len;
     stats->record_stats[rmid][recid].fpi_len += fpi_len;
+
+
+    /* Add junk-space stats for XLOG_SWITCH  */
+    if(rmid == RM_XLOG_ID && XLogRecGetInfo(record) == XLOG_SWITCH)
+    {
+        uint64 junk_len = xlog_switch_junk_len(record);
+
+        rmid = WALDUMP_RM_ID;
+        recid = 0;
+
+        stats->rmgr_stats[rmid].count++;
+        stats->rmgr_stats[rmid].rec_len += junk_len;
+
+        stats->record_stats[rmid][recid].count++;
+        stats->record_stats[rmid][recid].rec_len += junk_len;
+    }
 }
 
 /*
@@ -616,7 +633,7 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
      * calculate column totals.
      */
 
-    for (ri = 0; ri < RM_NEXT_ID; ri++)
+    for (ri = 0; ri <= RM_NEXT_ID; ri++)
     {
         total_count += stats->rmgr_stats[ri].count;
         total_rec_len += stats->rmgr_stats[ri].rec_len;
@@ -634,7 +651,7 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
            "Type", "N", "(%)", "Record size", "(%)", "FPI size", "(%)", "Combined size", "(%)",
            "----", "-", "---", "-----------", "---", "--------", "---", "-------------", "---");
 
-    for (ri = 0; ri < RM_NEXT_ID; ri++)
+    for (ri = 0; ri <= RM_NEXT_ID; ri++)
     {
         uint64        count,
                     rec_len,
diff --git a/src/bin/pg_waldump/rmgrdesc.c b/src/bin/pg_waldump/rmgrdesc.c
index 852d8ca4b1..efddc70ac1 100644
--- a/src/bin/pg_waldump/rmgrdesc.c
+++ b/src/bin/pg_waldump/rmgrdesc.c
@@ -35,6 +35,18 @@
 #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask) \
     { name, desc, identify},
 
-const RmgrDescData RmgrDescTable[RM_MAX_ID + 1] = {
+static const char *waldump_xlogswitch_identify(uint8 info);
+
+const RmgrDescData RmgrDescTable[RM_MAX_ID + 2] = {
 #include "access/rmgrlist.h"
+    PG_RMGR(WALDUMP_RM_ID, "OTHERS", NULL, NULL, waldump_xlogswitch_identify, NULL, NULL, NULL)
 };
+
+static const char *
+waldump_xlogswitch_identify(uint8 info)
+{
+    if (info == 0)
+        return "XLOG_SWITCH_JUNK";
+    else
+        return NULL;
+}
diff --git a/src/bin/pg_waldump/rmgrdesc.h b/src/bin/pg_waldump/rmgrdesc.h
index 42f8483b48..2a67d1050e 100644
--- a/src/bin/pg_waldump/rmgrdesc.h
+++ b/src/bin/pg_waldump/rmgrdesc.h
@@ -20,4 +20,6 @@ typedef struct RmgrDescData
 
 extern const RmgrDescData RmgrDescTable[];
 
+#define WALDUMP_RM_ID RM_NEXT_ID
+
 #endif                            /* RMGRDESC_H */
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 4146753d47..8cbc309108 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -315,6 +315,7 @@ extern XLogRecPtr RequestXLogSwitch(bool mark_unimportant);
 
 extern void GetOldestRestartPoint(XLogRecPtr *oldrecptr, TimeLineID *oldtli);
 
+extern uint32 xlog_switch_junk_len(XLogReaderState *record);
 /*
  * Exported for the functions in timeline.c and xlogarchive.c.  Only valid
  * in the startup process.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: setLastTid() and currtid()
Следующее
От: Andres Freund
Дата:
Сообщение: Re: gs_group_1 crashing on 13beta2/s390x