Обсуждение: Miscalculation in IsCheckpointOnSchedule()
I run long DBT-2 with 8.3beta2 and saw checkpoint spikes periodically.
The progress against WAL segments consumption *jumped up* in such cases.
It seems to be a miscalculation in IsCheckpointOnSchedule().
xrecoff is uint32, therefore the difference of two xrecoffs could
be between -4G and +4G. We should not cast it to int32, that domain
is [-2G, +2G).
Here is a patch to fix it, casting xrecoff to double directly.
Index: src/backend/postmaster/bgwriter.c
===================================================================
--- src/backend/postmaster/bgwriter.c (HEAD)
+++ src/backend/postmaster/bgwriter.c (working copy)
@@ -718,7 +718,7 @@
recptr = GetInsertRecPtr();
elapsed_xlogs =
(((double) (int32) (recptr.xlogid - ckpt_start_recptr.xlogid)) * XLogSegsPerFile +
- ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
+ ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
CheckPointSegments;
if (progress < elapsed_xlogs)
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
Surely this makes matters worse, not better. What happens near a segment
boundary crossing?
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> > - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> > + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better. What happens near a segment
> boundary crossing?
Here is the dumped progres information by the attached patch
(only for debug purpose).
cur prog. xlog prog. time prog.
[-400ms] 0.030503 0.019247 0.031158 (diff xlogid=0, xrecoff=82665472)
[-200ms] 0.031176 0.019957 0.031839 (diff xlogid=0, xrecoff=85712896)
[*] 0.031860 1.020706 0.032521 (diff xlogid=1, xrecoff=105709568)
> recptr.xrecoff - ckpt_start_recptr.xrecoff
recptr.xrecoff is reset to 0 or so when xlogid is bumped up. At that time,
if ckpt_start_recptr.xrecoff is greater than 2G, we cannot represent
the difference with int32 because the value is less than -2G.
Casting double after int32 does not help us.
We use ( xlogid * 255 * 16MB + xrecoff ) as a base value for the calculation.
If we interprets xrecoff=105709568 at [*] as "minus uint32", we can calcurate
it correctly as below:
without fix with fix
[-400ms] 82665472 82665472 (same)
[-200ms] 85712896 85712896 (same)
[*] 4383899648 88932353 (= 1*255*16MB - (0xFFFFFFFF - 105709568) )
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Вложения
Tom Lane wrote: > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) / >> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) / > > Surely this makes matters worse, not better. What happens near a segment > boundary crossing? Hmm. There seems to be another little bug in there. XLogSegsPerFile is defined as 0xffffffff/XLogSegSize, which is 255 with default settings. It should be 256. That leads to negative elapsed_xlogs estimates at xlog file boundaries. XLogCheckpointNeeded suffers from it too; the number of segments consumed since last checkpoint is off by one per each xlogid, so we trigger the checkpoint a little bit too late. Otherwise, the patch looks good to me. I also tested the calculation with a little C program (attached), and it seems to work on segment and xlog file boundaries just fine. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com #include <stdio.h> #include <stdlib.h> typedef unsigned char uint8; /* == 8 bits */ typedef unsigned short uint16; /* == 16 bits */ typedef unsigned int uint32; /* == 32 bits */ typedef signed char int8; /* == 8 bits */ typedef signed short int16; /* == 16 bits */ typedef signed int int32; /* == 32 bits */ #define XLOG_SEG_SIZE (16*1024*1024) #define XLogSegSize ((uint32) XLOG_SEG_SIZE) #define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize) int main(int argc, char **argv) { double elapsed_xlogs; uint32 cur_xrecoff, cur_xlogid; uint32 ckpt_xrecoff, ckpt_xlogid; double a, b; cur_xlogid = strtoul(argv[1], NULL, 10); cur_xrecoff = strtoul(argv[2], NULL, 10); ckpt_xlogid = strtoul(argv[3], NULL, 10); ckpt_xrecoff = strtoul(argv[4], NULL, 10); //a = (cur_xlogid - ckpt_xlogid) * XLogSegsPerFile; //b = (((double) cur_xrecoff) - ((double) ckpt_xrecoff)) / ((double )XLogSegSize); //a = ((double) (int32) (cur_xlogid - ckpt_xlogid)) * XLogSegsPerFile; //b = ((double) (int32) (cur_xrecoff - ckpt_xrecoff)) / XLogSegSize; a = (cur_xlogid - ckpt_xlogid) * XLogSegsPerFile; b = ((double) cur_xrecoff - (double) ckpt_xrecoff) / XLogSegSize; elapsed_xlogs = a + b; printf("XLogSegsPerFile: %d\n", XLogSegsPerFile); printf("xrecoff: %u %f\n", ckpt_xrecoff, (double) ckpt_xrecoff); printf("a: %f\n", a); printf("b: %f\n", b); printf("elapsed_xlogs = %f\n", elapsed_xlogs); }
Heikki Linnakangas wrote: > Tom Lane wrote: >> ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: >>> - ((double) (int32) (recptr.xrecoff - >>> ckpt_start_recptr.xrecoff)) / XLogSegSize) / >>> + ((double) recptr.xrecoff - (double) >>> ckpt_start_recptr.xrecoff) / XLogSegSize) / >> >> Surely this makes matters worse, not better. What happens near a segment >> boundary crossing? > > Hmm. There seems to be another little bug in there. XLogSegsPerFile is > defined as 0xffffffff/XLogSegSize, which is 255 with default settings. > It should be 256. That leads to negative elapsed_xlogs estimates at xlog > file boundaries. XLogCheckpointNeeded suffers from it too; the number of > segments consumed since last checkpoint is off by one per each xlogid, > so we trigger the checkpoint a little bit too late. I'll take that back. We intentionally don't use the last possible segment of each xlog file, to avoid overflows. Sorry for the noise. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Surely this makes matters worse, not better. What happens near a segment
>> boundary crossing?
> Here is the dumped progres information by the attached patch
> (only for debug purpose).
Oh, I take that back. I was thinking that conversion of unsigned to
double usually does the Wrong Thing, but in this context it seems to be
the right thing.
regards, tom lane