Thread: WAL & RC1 status

WAL & RC1 status

From
Tom Lane
Date:
I am *not* feeling good about pushing out an RC1 release candidate
today.

I've been going through the WAL code, trying to understand it and
document it.  I've found a number of minor problems and several major
ones ("major" meaning "can't really fix without an incompatible file
format change, hence initdb").  I've reported the major problems to
the mailing lists but gotten almost no feedback about what to do.

In addition, I'm still looking for the bug that I originally went in to
find: Scott Parish's report of being unable to restart after a normal
shutdown of beta4.  Examination of his WAL log shows some pretty serious
lossage (see attached dump).  My current theory is that the
buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
of log records, but I haven't figured out exactly how.

I want to veto putting out an RC1 until these issues are resolved...
comments?
        regards, tom lane


...
0/00599890: prv 0/00599854; xprv 0/00599854; xid 18871; RM 10 info 00 len 65
0/005998F4: prv 0/00599890; xprv 0/00599890; xid 18871; RM 11 info 90 len 50
0/00599948: prv 0/005998F4; xprv 0/005998F4; xid 18871; RM  1 info 00 len 4
commit: 2001-02-26 17:19:57
0/0059996C: prv 0/00599948; xprv 0/00000000; xid 0; RM  0 info 00 len 32
checkpoint: redo 0/0059996C; undo 0/00000000; sui 29; nextxid 18903; nextoid 35195; online
-- this is the last normal-looking checkpoint record.  Judging from the
-- commit timestamps surrounding prior checkpoints, checkpoints were
-- happening every five minutes approximately on the 5-minute mark, so
-- this one happened about 17:20.  (There really should be a timestamp
-- in the checkpoint records...)
0/005999AC: prv 0/0059996C; xprv 0/00000000; xid 18923; RM 10 info 08 len 8226; bkpb 1
0/0059B9FC: prv 0/005999AC; xprv 0/005999AC; xid 18923; RM 11 info 98 len 8226; bkpb 1
0/0059DA4C: prv 0/0059B9FC; xprv 0/0059B9FC; xid 18923; RM 10 info 00 len 72
0/0059DAB4: prv 0/0059DA4C; xprv 0/0059DA4C; xid 18923; RM 11 info 90 len 26
0/0059DAF0: prv 0/0059DAB4; xprv 0/0059DAB4; xid 18923; RM 10 info 00 len 72
0/0059DB58: prv 0/0059DAF0; xprv 0/0059DAF0; xid 18923; RM 11 info 90 len 26
0/0059DB94: prv 0/0059DB58; xprv 0/0059DB58; xid 18923; RM 10 info 00 len 72
0/0059DBFC: prv 0/0059DB94; xprv 0/0059DB94; xid 18923; RM 11 info 90 len 26
0/0059DC38: prv 0/0059DBFC; xprv 0/0059DBFC; xid 18923; RM 10 info 08 len 8226; bkpb 1
0/0059FC88: prv 0/0059DC38; xprv 0/0059DC38; xid 18923; RM 11 info 98 len 8226; bkpb 1
0/005A1CD8: prv 0/0059FC88; xprv 0/0059FC88; xid 18923; RM  1 info 00 len 4
commit: 2001-02-26 17:21:10
0/005A1CFC: prv 0/005A1CD8; xprv 0/00000000; xid 18951; RM 15 info 00 len 100
0/005A1D80: prv 0/005A1CFC; xprv 0/005A1CFC; xid 18951; RM 10 info 00 len 72
0/005A1DE8: prv 0/005A1D80; xprv 0/005A1D80; xid 18951; RM 11 info 90 len 26
0/005A1E24: prv 0/005A1DE8; xprv 0/005A1DE8; xid 18951; RM 10 info 00 len 72
0/005A1E8C: prv 0/005A1E24; xprv 0/005A1E24; xid 18951; RM 11 info 90 len 26
0/005A1EC8: prv 0/005A1E8C; xprv 0/005A1E8C; xid 18951; RM 10 info 00 len 72
0/005A1F30: prv 0/005A1EC8; xprv 0/005A1EC8; xid 18951; RM 11 info 90 len 26
0/005A1F6C: prv 0/005A1F30; xprv 0/005A1F30; xid 18951; RM 10 info 00 len 72
0/005A1FD4: prv 0/005A1F6C; xprv 0/005A1F6C; xid 18951; RM 11 info 90 len 26
0/005A201C: prv 0/005A1FD4; xprv 0/005A1FD4; xid 18951; RM 10 info 00 len 65
0/005A2080: prv 0/005A201C; xprv 0/005A201C; xid 18951; RM 11 info 98 len 8226; bkpb 1
0/005A40D0: prv 0/005A2080; xprv 0/005A2080; xid 18951; RM  1 info 00 len 4
commit: 2001-02-26 17:21:33
0/005A40F4: prv 0/005A40D0; xprv 0/00000000; xid 18986; RM 10 info 00 len 72
0/005A415C: prv 0/005A40F4; xprv 0/005A40F4; xid 18986; RM 11 info 90 len 26
0/005A4198: prv 0/005A415C; xprv 0/005A415C; xid 18986; RM 10 info 00 len 72
0/005A4200: prv 0/005A4198; xprv 0/005A4198; xid 18986; RM 11 info 90 len 26
0/005A423C: prv 0/005A4200; xprv 0/005A4200; xid 18986; RM 10 info 00 len 72
0/005A42A4: prv 0/005A423C; xprv 0/005A423C; xid 18986; RM 11 info 90 len 26
0/005A42E0: prv 0/005A42A4; xprv 0/005A42A4; xid 18986; RM 10 info 00 len 72
0/005A4348: prv 0/005A42E0; xprv 0/005A42E0; xid 18986; RM 11 info 90 len 26
0/005A4384: prv 0/005A4348; xprv 0/005A4348; xid 18986; RM 10 info 00 len 65
0/005A43E8: prv 0/005A4384; xprv 0/005A4384; xid 18986; RM 11 info 90 len 50
0/005A443C: prv 0/005A43E8; xprv 0/005A43E8; xid 18986; RM  1 info 00 len 4
commit: 2001-02-26 17:22:20
0/005A4460: prv 0/005A443C; xprv 0/00000000; xid 19020; RM 10 info 00 len 72
0/005A44C8: prv 0/005A4460; xprv 0/005A4460; xid 19020; RM 11 info 90 len 26
0/005A4504: prv 0/005A44C8; xprv 0/005A44C8; xid 19020; RM 10 info 00 len 72
0/005A456C: prv 0/005A4504; xprv 0/005A4504; xid 19020; RM 11 info 90 len 26
0/005A45A8: prv 0/005A456C; xprv 0/005A456C; xid 19020; RM 10 info 00 len 72
0/005A4610: prv 0/005A45A8; xprv 0/005A45A8; xid 19020; RM 11 info 90 len 26
0/005A464C: prv 0/005A4610; xprv 0/005A4610; xid 19020; RM 10 info 00 len 72
0/005A46B4: prv 0/005A464C; xprv 0/005A464C; xid 19020; RM 11 info 90 len 26
0/005A46F0: prv 0/005A46B4; xprv 0/005A46B4; xid 19020; RM 10 info 00 len 65
0/005A4754: prv 0/005A46F0; xprv 0/005A46F0; xid 19020; RM 11 info 90 len 50
0/005A47A8: prv 0/005A4754; xprv 0/005A4754; xid 19020; RM  1 info 00 len 4
commit: 2001-02-26 17:24:34
0/005A47CC: prv 0/005A47A8; xprv 0/00000000; xid 19115; RM 10 info 00 len 76
0/005A4838: prv 0/005A47CC; xprv 0/005A47CC; xid 19115; RM 11 info 90 len 26
0/005A4874: prv 0/005A4838; xprv 0/005A4838; xid 19115; RM 10 info 00 len 80
0/005A48E4: prv 0/005A4874; xprv 0/005A4874; xid 19115; RM 11 info 90 len 26
0/005A4920: prv 0/005A48E4; xprv 0/005A48E4; xid 19115; RM 10 info 00 len 76
0/005A498C: prv 0/005A4920; xprv 0/005A4920; xid 19115; RM 11 info 90 len 26
0/005A49C8: prv 0/005A498C; xprv 0/005A498C; xid 19115; RM 10 info 00 len 76
0/005A4A34: prv 0/005A49C8; xprv 0/005A49C8; xid 19115; RM 11 info 90 len 26
0/005A4A70: prv 0/005A4A34; xprv 0/005A4A34; xid 19115; RM 10 info 00 len 65
0/005A4AD4: prv 0/005A4A70; xprv 0/005A4A70; xid 19115; RM 11 info 90 len 50
0/005A4B28: prv 0/005A4AD4; xprv 0/005A4AD4; xid 19115; RM  1 info 00 len 4
commit: 2001-02-26 17:26:02
ReadRecord: record with zero len at 0/005A4B4C
-- My dump program is unhappy here because the rest of the page is zero.
-- Given that there is a continuation record at the start of the next
-- page, there certainly should have been record(s) here.  But it's
-- worse than that: check the commit timestamps and the xid numbers
-- before and after the discontinuity.  Did time go backwards here?
-- Also notice the back-pointers in the first valid record on the next
-- page; they point not into the zeroed space, which would suggest a
-- mere failure to write a buffer after filling it, but into the middle
-- of one of the valid records on the prior page.  It almost looks like
-- page 5A6000 came from a completely different run than page 5A4000.
Unexpected page info flags 0001 at offset 5A6000
Skipping unexpected continuation record at offset 5A6000
0/005A6904: prv 0/005A48B4(?); xprv 0/005A48B4; xid 19047; RM 11 info 98 len 8226; bkpb 1
0/005A8954: prv 0/005A6904; xprv 0/005A6904; xid 19047; RM 10 info 00 len 72
0/005A89BC: prv 0/005A8954; xprv 0/005A8954; xid 19047; RM 11 info 90 len 26
0/005A89F8: prv 0/005A89BC; xprv 0/005A89BC; xid 19047; RM 10 info 00 len 72
0/005A8A60: prv 0/005A89F8; xprv 0/005A89F8; xid 19047; RM 11 info 90 len 26
0/005A8A9C: prv 0/005A8A60; xprv 0/005A8A60; xid 19047; RM 10 info 00 len 72
0/005A8B04: prv 0/005A8A9C; xprv 0/005A8A9C; xid 19047; RM 11 info 90 len 26
0/005A8B40: prv 0/005A8B04; xprv 0/005A8B04; xid 19047; RM 10 info 08 len 8226; bkpb 1
0/005AAB90: prv 0/005A8B40; xprv 0/005A8B40; xid 19047; RM 11 info 98 len 8226; bkpb 1
0/005ACBE0: prv 0/005AAB90; xprv 0/005AAB90; xid 19047; RM  1 info 00 len 4
commit: 2001-02-26 17:25:38
0/005ACC04: prv 0/005ACBE0; xprv 0/00000000; xid 19088; RM 10 info 00 len 72
0/005ACC6C: prv 0/005ACC04; xprv 0/005ACC04; xid 19088; RM 11 info 90 len 26
0/005ACCA8: prv 0/005ACC6C; xprv 0/005ACC6C; xid 19088; RM 10 info 00 len 72
0/005ACD10: prv 0/005ACCA8; xprv 0/005ACCA8; xid 19088; RM 11 info 90 len 26
0/005ACD4C: prv 0/005ACD10; xprv 0/005ACD10; xid 19088; RM 10 info 00 len 72
0/005ACDB4: prv 0/005ACD4C; xprv 0/005ACD4C; xid 19088; RM 11 info 90 len 26
0/005ACDF0: prv 0/005ACDB4; xprv 0/005ACDB4; xid 19088; RM 10 info 00 len 72
0/005ACE58: prv 0/005ACDF0; xprv 0/005ACDF0; xid 19088; RM 11 info 90 len 26
0/005ACE94: prv 0/005ACE58; xprv 0/005ACE58; xid 19088; RM 10 info 00 len 65
0/005ACEF8: prv 0/005ACE94; xprv 0/005ACE94; xid 19088; RM 11 info 90 len 50
0/005ACF4C: prv 0/005ACEF8; xprv 0/005ACEF8; xid 19088; RM  1 info 00 len 4
commit: 2001-02-26 17:26:43
0/005ACF70: prv 0/005ACF4C; xprv 0/00000000; xid 19109; RM 10 info 00 len 72
0/005ACFD8: prv 0/005ACF70; xprv 0/005ACF70; xid 19109; RM 11 info 90 len 26
0/005AD014: prv 0/005ACFD8; xprv 0/005ACFD8; xid 19109; RM 10 info 00 len 72
0/005AD07C: prv 0/005AD014; xprv 0/005AD014; xid 19109; RM 11 info 90 len 26
0/005AD0B8: prv 0/005AD07C; xprv 0/005AD07C; xid 19109; RM 10 info 00 len 72
0/005AD120: prv 0/005AD0B8; xprv 0/005AD0B8; xid 19109; RM 11 info 90 len 26
0/005AD15C: prv 0/005AD120; xprv 0/005AD120; xid 19109; RM 10 info 00 len 72
0/005AD1C4: prv 0/005AD15C; xprv 0/005AD15C; xid 19109; RM 11 info 90 len 26
0/005AD200: prv 0/005AD1C4; xprv 0/005AD1C4; xid 19109; RM 10 info 00 len 65
0/005AD264: prv 0/005AD200; xprv 0/005AD200; xid 19109; RM 11 info 98 len 8226; bkpb 1
0/005AF2B4: prv 0/005AD264; xprv 0/005AD264; xid 19109; RM  1 info 00 len 4
commit: 2001-02-26 17:26:59
0/005AF2D8: prv 0/005AF2B4; xprv 0/00000000; xid 19224; RM 10 info 00 len 72
0/005AF340: prv 0/005AF2D8; xprv 0/005AF2D8; xid 19224; RM 11 info 90 len 26
0/005AF37C: prv 0/005AF340; xprv 0/005AF340; xid 19224; RM 10 info 00 len 72
0/005AF3E4: prv 0/005AF37C; xprv 0/005AF37C; xid 19224; RM 11 info 90 len 26
0/005AF420: prv 0/005AF3E4; xprv 0/005AF3E4; xid 19224; RM 10 info 00 len 72
0/005AF488: prv 0/005AF420; xprv 0/005AF420; xid 19224; RM 11 info 90 len 26
0/005AF4C4: prv 0/005AF488; xprv 0/005AF488; xid 19224; RM 10 info 00 len 72
0/005AF52C: prv 0/005AF4C4; xprv 0/005AF4C4; xid 19224; RM 11 info 90 len 26
0/005AF568: prv 0/005AF52C; xprv 0/005AF52C; xid 19224; RM 10 info 00 len 65
0/005AF5CC: prv 0/005AF568; xprv 0/005AF568; xid 19224; RM 11 info 90 len 50
0/005AF620: prv 0/005AF5CC; xprv 0/005AF5CC; xid 19224; RM  1 info 00 len 4
commit: 2001-02-26 17:28:39
0/005AF644: prv 0/005AF620; xprv 0/00000000; xid 19229; RM 10 info 00 len 72
0/005AF6AC: prv 0/005AF644; xprv 0/005AF644; xid 19229; RM 11 info 90 len 26
0/005AF6E8: prv 0/005AF6AC; xprv 0/005AF6AC; xid 19229; RM 10 info 00 len 72
0/005AF750: prv 0/005AF6E8; xprv 0/005AF6E8; xid 19229; RM 11 info 90 len 26
0/005AF78C: prv 0/005AF750; xprv 0/005AF750; xid 19229; RM 10 info 00 len 72
0/005AF7F4: prv 0/005AF78C; xprv 0/005AF78C; xid 19229; RM 11 info 90 len 26
0/005AF830: prv 0/005AF7F4; xprv 0/005AF7F4; xid 19229; RM 10 info 00 len 72
0/005AF898: prv 0/005AF830; xprv 0/005AF830; xid 19229; RM 11 info 90 len 26
0/005AF8D4: prv 0/005AF898; xprv 0/005AF898; xid 19229; RM 10 info 00 len 65
0/005AF938: prv 0/005AF8D4; xprv 0/005AF8D4; xid 19229; RM 11 info 90 len 50
0/005AF98C: prv 0/005AF938; xprv 0/005AF938; xid 19229; RM  1 info 00 len 4
commit: 2001-02-26 17:28:50
0/005AF9B0: prv 0/005AF98C; xprv 0/00000000; xid 0; RM  0 info 00 len 32
checkpoint: redo 0/005AF9B0; undo 0/00000000; sui 30; nextxid 19243; nextoid 43387; online
-- This is the only checkpoint record present in the log after the
-- normal-looking one at 17:20.  There should have been checkpoints
-- at 17:25, 17:30, 17:35, 17:40, 17:45, not to mention one from the
-- eventual shutdown which seems to have been done around 17:49.
-- From the surrounding timestamps this one must be either 17:30 or 17:35.
-- What's even nastier (and the immediate cause of Scott's inability to
-- restart) is that the pg_control file's checkPoint pointer points to
-- 0/005AF9F0, which is *not* the location of this checkpoint, but of
-- the record after it.
-- Is that meaningful, or just random coincidence?  Can't tell yet.
-- Oh BTW, the timestamp in the pg_control file is 2001-02-26 17:34:09,
-- which does not correspond to any scheduled checkpoint.
0/005AF9F0: prv 0/005AF9B0; xprv 0/00000000; xid 19444; RM 10 info 08 len 8226; bkpb 1
0/005B1A40: prv 0/005AF9F0; xprv 0/005AF9F0; xid 19444; RM 11 info 98 len 8226; bkpb 1
0/005B3A90: prv 0/005B1A40; xprv 0/005B1A40; xid 19444; RM 10 info 00 len 80
0/005B3B00: prv 0/005B3A90; xprv 0/005B3A90; xid 19444; RM 11 info 90 len 26
0/005B3B3C: prv 0/005B3B00; xprv 0/005B3B00; xid 19444; RM 10 info 00 len 72
0/005B3BA4: prv 0/005B3B3C; xprv 0/005B3B3C; xid 19444; RM 11 info 90 len 26
0/005B3BE0: prv 0/005B3BA4; xprv 0/005B3BA4; xid 19444; RM 10 info 00 len 72
0/005B3C48: prv 0/005B3BE0; xprv 0/005B3BE0; xid 19444; RM 11 info 90 len 26
0/005B3C84: prv 0/005B3C48; xprv 0/005B3C48; xid 19444; RM 10 info 08 len 8226; bkpb 1
0/005B5CD4: prv 0/005B3C84; xprv 0/005B3C84; xid 19444; RM 11 info 98 len 8226; bkpb 1
0/005B7D24: prv 0/005B5CD4; xprv 0/005B5CD4; xid 19444; RM  1 info 00 len 4
commit: 2001-02-26 17:35:13
0/005B7D48: prv 0/005B7D24; xprv 0/00000000; xid 19495; RM 10 info 00 len 72
0/005B7DB0: prv 0/005B7D48; xprv 0/005B7D48; xid 19495; RM 11 info 90 len 26
0/005B7DEC: prv 0/005B7DB0; xprv 0/005B7DB0; xid 19495; RM 10 info 00 len 72
0/005B7E54: prv 0/005B7DEC; xprv 0/005B7DEC; xid 19495; RM 11 info 90 len 26
0/005B7E90: prv 0/005B7E54; xprv 0/005B7E54; xid 19495; RM 10 info 00 len 72
0/005B7EF8: prv 0/005B7E90; xprv 0/005B7E90; xid 19495; RM 11 info 90 len 26
0/005B7F34: prv 0/005B7EF8; xprv 0/005B7EF8; xid 19495; RM 10 info 00 len 72
0/005B7F9C: prv 0/005B7F34; xprv 0/005B7F34; xid 19495; RM 11 info 90 len 26
0/005B7FD8: prv 0/005B7F9C; xprv 0/005B7F9C; xid 19495; RM 10 info 00 len 69
0/005B804C: prv 0/005B7FD8; xprv 0/005B7FD8; xid 19495; RM 11 info 98 len 8226; bkpb 1
0/005BA09C: prv 0/005B804C; xprv 0/005B804C; xid 19495; RM  1 info 00 len 4
commit: 2001-02-26 17:36:32
0/005BA0C0: prv 0/005BA09C; xprv 0/00000000; xid 19527; RM 10 info 00 len 72
0/005BA128: prv 0/005BA0C0; xprv 0/005BA0C0; xid 19527; RM 11 info 90 len 26
0/005BA164: prv 0/005BA128; xprv 0/005BA128; xid 19527; RM 10 info 00 len 76
0/005BA1D0: prv 0/005BA164; xprv 0/005BA164; xid 19527; RM 11 info 90 len 26
0/005BA20C: prv 0/005BA1D0; xprv 0/005BA1D0; xid 19527; RM 10 info 00 len 72
0/005BA274: prv 0/005BA20C; xprv 0/005BA20C; xid 19527; RM 11 info 90 len 26
0/005BA2B0: prv 0/005BA274; xprv 0/005BA274; xid 19527; RM 10 info 00 len 72
0/005BA318: prv 0/005BA2B0; xprv 0/005BA2B0; xid 19527; RM 11 info 90 len 26
0/005BA354: prv 0/005BA318; xprv 0/005BA318; xid 19527; RM 10 info 00 len 65
0/005BA3B8: prv 0/005BA354; xprv 0/005BA354; xid 19527; RM 11 info 90 len 50
0/005BA40C: prv 0/005BA3B8; xprv 0/005BA3B8; xid 19527; RM  1 info 00 len 4
commit: 2001-02-26 17:37:59
0/005BA430: prv 0/005BA40C; xprv 0/00000000; xid 19540; RM 10 info 00 len 72
0/005BA498: prv 0/005BA430; xprv 0/005BA430; xid 19540; RM 11 info 90 len 26
0/005BA4D4: prv 0/005BA498; xprv 0/00000000; xid 19540; RM 15 info 00 len 100
0/005BA558: prv 0/005BA4D4; xprv 0/005BA4D4; xid 19540; RM 10 info 00 len 76
0/005BA5C4: prv 0/005BA558; xprv 0/005BA558; xid 19540; RM 11 info 90 len 26
0/005BA600: prv 0/005BA5C4; xprv 0/005BA5C4; xid 19540; RM 10 info 00 len 72
0/005BA668: prv 0/005BA600; xprv 0/005BA600; xid 19540; RM 11 info 90 len 26
0/005BA6A4: prv 0/005BA668; xprv 0/005BA668; xid 19540; RM 10 info 00 len 72
0/005BA70C: prv 0/005BA6A4; xprv 0/005BA6A4; xid 19540; RM 11 info 90 len 26
0/005BA748: prv 0/005BA70C; xprv 0/005BA70C; xid 19540; RM 10 info 00 len 65
0/005BA7AC: prv 0/005BA748; xprv 0/005BA748; xid 19540; RM 11 info 90 len 50
0/005BA800: prv 0/005BA7AC; xprv 0/005BA7AC; xid 19540; RM  1 info 00 len 4
commit: 2001-02-26 17:39:03
0/005BA824: prv 0/005BA800; xprv 0/00000000; xid 19605; RM 10 info 00 len 72
0/005BA88C: prv 0/005BA824; xprv 0/005BA824; xid 19605; RM 11 info 90 len 26
0/005BA8C8: prv 0/005BA88C; xprv 0/005BA88C; xid 19605; RM 10 info 00 len 72
0/005BA930: prv 0/005BA8C8; xprv 0/005BA8C8; xid 19605; RM 11 info 90 len 26
0/005BA96C: prv 0/005BA930; xprv 0/005BA930; xid 19605; RM 10 info 00 len 72
0/005BA9D4: prv 0/005BA96C; xprv 0/005BA96C; xid 19605; RM 11 info 90 len 26
0/005BAA10: prv 0/005BA9D4; xprv 0/005BA9D4; xid 19605; RM 10 info 00 len 72
0/005BAA78: prv 0/005BAA10; xprv 0/005BAA10; xid 19605; RM 11 info 90 len 26
0/005BAAB4: prv 0/005BAA78; xprv 0/005BAA78; xid 19605; RM 10 info 00 len 65
0/005BAB18: prv 0/005BAAB4; xprv 0/005BAAB4; xid 19605; RM 11 info 90 len 50
0/005BAB6C: prv 0/005BAB18; xprv 0/005BAB18; xid 19605; RM  1 info 00 len 4
commit: 2001-02-26 17:41:09
0/005BAB90: prv 0/005BAB6C; xprv 0/00000000; xid 19610; RM 10 info 00 len 72
0/005BABF8: prv 0/005BAB90; xprv 0/005BAB90; xid 19610; RM 11 info 90 len 26
0/005BAC34: prv 0/005BABF8; xprv 0/005BABF8; xid 19610; RM 10 info 00 len 72
0/005BAC9C: prv 0/005BAC34; xprv 0/005BAC34; xid 19610; RM 11 info 90 len 26
0/005BACD8: prv 0/005BAC9C; xprv 0/005BAC9C; xid 19610; RM 10 info 00 len 72
0/005BAD40: prv 0/005BACD8; xprv 0/005BACD8; xid 19610; RM 11 info 90 len 26
0/005BAD7C: prv 0/005BAD40; xprv 0/005BAD40; xid 19610; RM 10 info 00 len 72
0/005BADE4: prv 0/005BAD7C; xprv 0/005BAD7C; xid 19610; RM 11 info 90 len 26
0/005BAE20: prv 0/005BADE4; xprv 0/005BADE4; xid 19610; RM 10 info 00 len 65
0/005BAE84: prv 0/005BAE20; xprv 0/005BAE20; xid 19610; RM 11 info 90 len 50
0/005BAED8: prv 0/005BAE84; xprv 0/005BAE84; xid 19610; RM  1 info 00 len 4
commit: 2001-02-26 17:41:11
0/005BAEFC: prv 0/005BAED8; xprv 0/00000000; xid 19718; RM 10 info 00 len 72
0/005BAF64: prv 0/005BAEFC; xprv 0/005BAEFC; xid 19718; RM 11 info 90 len 26
0/005BAFA0: prv 0/005BAF64; xprv 0/005BAF64; xid 19718; RM 10 info 00 len 72
0/005BB008: prv 0/005BAFA0; xprv 0/005BAFA0; xid 19718; RM 11 info 90 len 26
0/005BB044: prv 0/005BB008; xprv 0/005BB008; xid 19718; RM 10 info 00 len 72
0/005BB0AC: prv 0/005BB044; xprv 0/005BB044; xid 19718; RM 11 info 90 len 26
0/005BB0E8: prv 0/005BB0AC; xprv 0/005BB0AC; xid 19718; RM 10 info 00 len 72
0/005BB150: prv 0/005BB0E8; xprv 0/005BB0E8; xid 19718; RM 11 info 90 len 26
0/005BB18C: prv 0/005BB150; xprv 0/005BB150; xid 19718; RM 10 info 00 len 65
0/005BB1F0: prv 0/005BB18C; xprv 0/005BB18C; xid 19718; RM 11 info 90 len 50
0/005BB244: prv 0/005BB1F0; xprv 0/005BB1F0; xid 19718; RM  1 info 00 len 4
commit: 2001-02-26 17:44:57
0/005BB268: prv 0/005BB244; xprv 0/00000000; xid 19775; RM 10 info 00 len 72
0/005BB2D0: prv 0/005BB268; xprv 0/005BB268; xid 19775; RM 11 info 90 len 26
0/005BB30C: prv 0/005BB2D0; xprv 0/005BB2D0; xid 19775; RM 10 info 00 len 72
0/005BB374: prv 0/005BB30C; xprv 0/005BB30C; xid 19775; RM 11 info 90 len 26
0/005BB3B0: prv 0/005BB374; xprv 0/005BB374; xid 19775; RM 10 info 00 len 72
0/005BB418: prv 0/005BB3B0; xprv 0/005BB3B0; xid 19775; RM 11 info 90 len 26
0/005BB454: prv 0/005BB418; xprv 0/005BB418; xid 19775; RM 10 info 00 len 72
0/005BB4BC: prv 0/005BB454; xprv 0/005BB454; xid 19775; RM 11 info 90 len 26
0/005BB4F8: prv 0/005BB4BC; xprv 0/005BB4BC; xid 19775; RM 10 info 00 len 65
0/005BB55C: prv 0/005BB4F8; xprv 0/005BB4F8; xid 19775; RM 11 info 90 len 50
0/005BB5B0: prv 0/005BB55C; xprv 0/005BB55C; xid 19775; RM  1 info 00 len 4
commit: 2001-02-26 17:47:38
0/005BB5D4: prv 0/005BB5B0; xprv 0/00000000; xid 19827; RM 10 info 00 len 72
0/005BB63C: prv 0/005BB5D4; xprv 0/005BB5D4; xid 19827; RM 11 info 90 len 26
0/005BB678: prv 0/005BB63C; xprv 0/005BB63C; xid 19827; RM 10 info 00 len 72
0/005BB6E0: prv 0/005BB678; xprv 0/005BB678; xid 19827; RM 11 info 90 len 26
0/005BB71C: prv 0/005BB6E0; xprv 0/005BB6E0; xid 19827; RM 10 info 00 len 72
0/005BB784: prv 0/005BB71C; xprv 0/005BB71C; xid 19827; RM 11 info 90 len 26
0/005BB7C0: prv 0/005BB784; xprv 0/005BB784; xid 19827; RM 10 info 00 len 72
0/005BB828: prv 0/005BB7C0; xprv 0/005BB7C0; xid 19827; RM 11 info 90 len 26
0/005BB864: prv 0/005BB828; xprv 0/005BB828; xid 19827; RM 10 info 00 len 65
0/005BB8C8: prv 0/005BB864; xprv 0/005BB864; xid 19827; RM 11 info 90 len 50
0/005BB91C: prv 0/005BB8C8; xprv 0/005BB8C8; xid 19827; RM  1 info 00 len 4
commit: 2001-02-26 17:49:00
0/005BB940: prv 0/005BB91C; xprv 0/00000000; xid 19832; RM 10 info 00 len 72
0/005BB9A8: prv 0/005BB940; xprv 0/005BB940; xid 19832; RM 11 info 90 len 26
0/005BB9E4: prv 0/005BB9A8; xprv 0/005BB9A8; xid 19832; RM 10 info 00 len 72
0/005BBA4C: prv 0/005BB9E4; xprv 0/005BB9E4; xid 19832; RM 11 info 90 len 26
0/005BBA88: prv 0/005BBA4C; xprv 0/005BBA4C; xid 19832; RM 10 info 00 len 72
0/005BBAF0: prv 0/005BBA88; xprv 0/005BBA88; xid 19832; RM 11 info 90 len 26
0/005BBB2C: prv 0/005BBAF0; xprv 0/005BBAF0; xid 19832; RM 10 info 00 len 72
0/005BBB94: prv 0/005BBB2C; xprv 0/005BBB2C; xid 19832; RM 11 info 90 len 26
0/005BBBD0: prv 0/005BBB94; xprv 0/005BBB94; xid 19832; RM 10 info 00 len 65
0/005BBC34: prv 0/005BBBD0; xprv 0/005BBBD0; xid 19832; RM 11 info 90 len 50
0/005BBC88: prv 0/005BBC34; xprv 0/005BBC34; xid 19832; RM  1 info 00 len 4
commit: 2001-02-26 17:49:06
ReadRecord: record with zero len at 0/005BBCAC
-- this is where the log actually ends --- zeroes from here out.
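
The two anomalies called out in the annotations above (a back-pointer that does not name the preceding record, and a commit timestamp that goes backwards) can be checked mechanically. What follows is a minimal Python sketch, not the dump program that produced this listing. The name check_dump and both regular expressions are assumptions derived only from the line format visible in the dump, and the sample lines are copied from it.

```python
import re
from datetime import datetime

# Record lines look like:
# 0/005A4B28: prv 0/005A4AD4; xprv 0/005A4AD4; xid 19115; RM  1 info 00 len 4
# The dump marks a suspicious back-pointer with a trailing "(?)".
RECORD_RE = re.compile(
    r"^(?P<addr>[0-9A-F]+/[0-9A-F]+): prv (?P<prv>[0-9A-F]+/[0-9A-F]+)(?:\(\?\))?; "
    r"xprv [0-9A-F]+/[0-9A-F]+; xid (?P<xid>\d+);"
)
COMMIT_RE = re.compile(r"^commit: (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$")


def check_dump(lines):
    """Return (line_number, message) pairs for anomalies in a dump listing."""
    anomalies = []
    last_addr = None    # address of the previous record line
    last_commit = None  # timestamp of the previous commit line
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        rec = RECORD_RE.match(line)
        if rec:
            # Each record's prv should name the record immediately before it.
            if last_addr is not None and rec["prv"] != last_addr:
                anomalies.append(
                    (lineno, f"prv {rec['prv']} does not match previous record {last_addr}")
                )
            last_addr = rec["addr"]
            continue
        com = COMMIT_RE.match(line)
        if com:
            ts = datetime.strptime(com["ts"], "%Y-%m-%d %H:%M:%S")
            # Commit timestamps should not decrease along the log.
            if last_commit is not None and ts < last_commit:
                anomalies.append(
                    (lineno, f"commit time {ts} is earlier than previous {last_commit}")
                )
            last_commit = ts
    return anomalies


# Lines taken from the dump above, on both sides of the discontinuity.
SAMPLE = """\
0/005A4B28: prv 0/005A4AD4; xprv 0/005A4AD4; xid 19115; RM  1 info 00 len 4
commit: 2001-02-26 17:26:02
0/005A6904: prv 0/005A48B4(?); xprv 0/005A48B4; xid 19047; RM 11 info 98 len 8226; bkpb 1
0/005A8954: prv 0/005A6904; xprv 0/005A6904; xid 19047; RM 10 info 00 len 72
0/005A89BC: prv 0/005A8954; xprv 0/005A8954; xid 19047; RM 11 info 90 len 26
0/005A89F8: prv 0/005A89BC; xprv 0/005A89BC; xid 19047; RM 10 info 00 len 72
0/005A8A60: prv 0/005A89F8; xprv 0/005A89F8; xid 19047; RM 11 info 90 len 26
0/005A8A9C: prv 0/005A8A60; xprv 0/005A8A60; xid 19047; RM 10 info 00 len 72
0/005A8B04: prv 0/005A8A9C; xprv 0/005A8A9C; xid 19047; RM 11 info 90 len 26
0/005A8B40: prv 0/005A8B04; xprv 0/005A8B04; xid 19047; RM 10 info 08 len 8226; bkpb 1
0/005AAB90: prv 0/005A8B40; xprv 0/005A8B40; xid 19047; RM 11 info 98 len 8226; bkpb 1
0/005ACBE0: prv 0/005AAB90; xprv 0/005AAB90; xid 19047; RM  1 info 00 len 4
commit: 2001-02-26 17:25:38
""".splitlines()

for lineno, message in check_dump(SAMPLE):
    print(f"line {lineno}: {message}")
```

The sketch only compares adjacent lines, so it reports the symptoms and says nothing about which buffer was lost or why.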


Re: WAL & RC1 status

From
The Hermit Hacker
Date:
On Fri, 2 Mar 2001, Tom Lane wrote:

> I am *not* feeling good about pushing out an RC1 release candidate
> today.
>
> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
>
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4.  Examination of his WAL log shows some pretty serious
> lossage (see attached dump).  My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
>
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

Will second it ... Vadim is supposed to be back on the 6th, and Peter has
a couple of changes to configure he wants to do this weekend for the JDBC
stuff ... Thomas and I are in SF the end of next week for some meetings,
so if you can pop off a summary of what you've found to either of us, and
assuming that Vadim doesn't get caught up by then, we can bring them up
"in person" at that time ... ?




Re: WAL & RC1 status

From
Bruce Momjian
Date:
> I am *not* feeling good about pushing out an RC1 release candidate
> today.
> 
> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.
> 
> In addition, I'm still looking for the bug that I originally went in to
> find: Scott Parish's report of being unable to restart after a normal
> shutdown of beta4.  Examination of his WAL log shows some pretty serious
> lossage (see attached dump).  My current theory is that the
> buffer-slinging logic in xlog.c dropped one or more whole buffers' worth
> of log records, but I haven't figured out exactly how.
> 
> I want to veto putting out an RC1 until these issues are resolved...
> comments?

I was not sure how to respond.  Requiring an initdb at this stage seems
like it could be a pretty major blow to beta testers.  However, if 7.1
ships with WAL problems that cannot be fixed without a file format
change, we will have problems down the road.  Is there a version
number in the WAL file?  Can we put conditional code in there to create
new log file records with an updated format?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: WAL & RC1 status

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Is there a version number in the WAL file?
> 
> catversion.h will do fine, no?
> 
> > Can we put conditional code in there to create
> > new log file records with an updated format?
> 
> The WAL stuff is *far* too complex already.  I've spent a week studying
> it and I only partially understand it.  I will not consent to trying to
> support multiple log file formats concurrently.

Well, I was thinking a few things.  Right now, if we update the
catversion.h, we will require a dump/reload.  If we can update just the
WAL version stamp, that will allow us to fix WAL format problems without
requiring people to dump/reload.  I can imagine this would be valuable
if we find we need to make changes in 7.1.1, where we can not require
dump/reload.



Re: WAL & RC1 status

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Is there a version number in the WAL file?

catversion.h will do fine, no?

> Can we put conditional code in there to create
> new log file records with an updated format?

The WAL stuff is *far* too complex already.  I've spent a week studying
it and I only partially understand it.  I will not consent to trying to
support multiple log file formats concurrently.
        regards, tom lane


Re: WAL & RC1 status

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, I was thinking a few things.  Right now, if we update the
> catversion.h, we will require a dump/reload.  If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload.

Since there is not a separate WAL version stamp, introducing one now
would certainly force an initdb.  I don't mind adding one if you think
it's useful; another 4 bytes in pg_control won't hurt anything.  But
it's not going to save anyone's bacon on this cycle.

At least one of my concerns (single point of failure) would require a
change to the layout of pg_control, which would force initdb anyway.
Anyone want to propose a third version# for pg_control?
        regards, tom lane


Re: WAL & RC1 status

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Well, I was thinking a few things.  Right now, if we update the
> > catversion.h, we will require a dump/reload.  If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
> 
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb.  I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything.  But
> it's not going to save anyone's bacon on this cycle.

Having a version number in binary files has saved me many times because
I can add a little 'if' to allow upward binary compatibility without
breaking old binary files.  I think we should have one.

I see one in our btree files, but I don't see one in heap.  I am going to
recommend that for 7.2.  All our files should have versions just in case
we ever need it.  Some day, we may be able to skip dump/reload for major
versions.
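
The "little 'if'" approach can be sketched as follows. This is a minimal illustration and not PostgreSQL's file layout. The header structure, the MAGIC value, and the two payload formats are all invented for the example.

```python
import struct

# Hypothetical header: a 4-byte magic number followed by a 4-byte format version.
HEADER = struct.Struct("<II")
MAGIC = 0x50474631  # made-up value for this sketch


def write_file(version, payload):
    """Build a file image with the versioned header in front of the payload."""
    return HEADER.pack(MAGIC, version) + payload


def read_payload(blob):
    """Read the payload, branching on the version stamp in the header."""
    magic, version = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a recognized file")
    body = blob[HEADER.size:]
    if version == 1:
        # Old format (assumed): the payload is stored as-is.
        return body
    if version == 2:
        # New format (assumed): the payload is preceded by a 4-byte length.
        (length,) = struct.unpack_from("<I", body, 0)
        return body[4:4 + length]
    raise ValueError(f"unsupported format version {version}")


old_image = write_file(1, b"abc")
new_image = write_file(2, struct.pack("<I", 3) + b"abc" + b"padding")
print(read_payload(old_image), read_payload(new_image))
```

A reader built this way keeps accepting old files after the format changes, at the cost of carrying one branch per supported version.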



Re: WAL & RC1 status

From
Thomas Lockhart
Date:
> I've been going through the WAL code, trying to understand it and
> document it.  I've found a number of minor problems and several major
> ones ("major" meaning "can't really fix without an incompatible file
> format change, hence initdb").  I've reported the major problems to
> the mailing lists but gotten almost no feedback about what to do.

Sorry for the "no feedback", but I've assumed that this will be more
productively discussed with Vadim in the loop. I don't disagree with
your observations, but of course that is from a position of happy
ignorance :)

> ... I want to veto putting out an RC1 until these issues are resolved...
> comments?

OK with me.
                   - Thomas


RE: WAL & RC1 status

From
Matthew
Date:

> From:    Bruce Momjian [SMTP:pgman@candle.pha.pa.us]
> Sent:    Friday, March 02, 2001 9:54 AM
> To:    Tom Lane
> Cc:    pgsql-core@postgresql.org; pgsql-hackers@postgresql.org
> Subject:    Re: [HACKERS] WAL & RC1 status
> 
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Is there a version number in the WAL file?
> > 
> > catversion.h will do fine, no?
> > 
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> > 

While it may be unfortunate to have to do an initdb at this point in
the beta cycle, it is a beta and that is part of the deal.  PostgreSQL
has the reputation of being the highest-quality open-source database and
we should do nothing to tarnish that.  Release it when it's ready and
not before.


Re: WAL & RC1 status

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Well, I was thinking a few things.  Right now, if we update the
> > catversion.h, we will require a dump/reload.  If we can update just the
> > WAL version stamp, that will allow us to fix WAL format problems without
> > requiring people to dump/reload.
> 
> Since there is not a separate WAL version stamp, introducing one now
> would certainly force an initdb.  I don't mind adding one if you think
> it's useful; another 4 bytes in pg_control won't hurt anything.  But
> it's not going to save anyone's bacon on this cycle.
> 
> At least one of my concerns (single point of failure) would require a
> change to the layout of pg_control, which would force initdb anyway.
> Anyone want to propose a third version# for pg_control?

I now remember Hiroshi complaining about major WAL problems also,
particularly corrupt WAL files preventing the database from starting.



Re: WAL & RC1 status

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Mar 02, 2001 at 10:54:04AM -0500, Bruce Momjian wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Is there a version number in the WAL file?
> > 
> > catversion.h will do fine, no?
> > 
> > > Can we put conditional code in there to create
> > > new log file records with an updated format?
> > 
> > The WAL stuff is *far* too complex already.  I've spent a week studying
> > it and I only partially understand it.  I will not consent to trying to
> > support multiple log file formats concurrently.
> 
> Well, I was thinking a few things.  Right now, if we update the
> catversion.h, we will require a dump/reload.  If we can update just the
> WAL version stamp, that will allow us to fix WAL format problems without
> requiring people to dump/reload.  I can imagine this would be valuable
> if we find we need to make changes in 7.1.1, where we can not require
> dump/reload.

It Seems to Me that after an orderly shutdown, the WAL files should be, 
effectively, slag -- they should contain no deltas from the current 
table contents.  In practice that means the only part of the format that 
*should* matter is whatever it takes to discover that they really are 
slag.

That *should* mean that, at worst, a change to the WAL file format should 
only require doing an orderly shutdown, and then (perhaps) running a simple
program to generate a new-format empty WAL.  It ought not to require an 
initdb.  

Of course the details of the current implementation may interfere with
that ideal, but it seems a worthy goal for the next beta, if it's not
possible already.  Given the opportunity to change the current WAL format, 
it ought to be possible to avoid even needing to run a program to generate 
an empty WAL.
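
A minimal sketch of the "simple program" idea, under stated assumptions: the caller already knows whether the last shutdown was orderly, the log lives in a flat directory of segment files, and a new-format segment is just a magic number followed by zeroes. The function reset_wal, the segment name, the segment size, and the magic value are all hypothetical.

```python
import tempfile
from pathlib import Path


def reset_wal(wal_dir, shut_down_cleanly, new_magic, segment_size=8192):
    """Replace all old log segments with one empty new-format segment."""
    if not shut_down_cleanly:
        # Old records may still be needed for recovery, so leave them alone.
        raise RuntimeError("refusing to discard WAL: last shutdown was not clean")
    wal_dir = Path(wal_dir)
    for old in list(wal_dir.iterdir()):
        if old.is_file():
            old.unlink()
    first = wal_dir / "0000000000000000"  # hypothetical segment name
    # Hypothetical new format: 4-byte magic, then zero fill.
    first.write_bytes(new_magic.to_bytes(4, "little") + bytes(segment_size - 4))
    return first


# Usage on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "stale_segment").write_bytes(b"old-format records")
    new_seg = reset_wal(d, shut_down_cleanly=True, new_magic=0x17345169)
    print(new_seg.name, new_seg.stat().st_size)
```

The refusal branch is the important part: discarding the log is only safe if it really contains no deltas from the current table contents.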

Nathan Myers
ncm@zembu.com


Re: WAL & RC1 status

From
Bruce Momjian
Date:
> It Seems to Me that after an orderly shutdown, the WAL files should be, 
> effectively, slag -- they should contain no deltas from the current 
> table contents.  In practice that means the only part of the format that 
> *should* matter is whatever it takes to discover that they really are 
> slag.

> 
> That *should* mean that, at worst, a change to the WAL file format should 
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL.  It ought not to require an 
> initdb.  
> 
> Of course the details of the current implementation may interfere with
> that ideal, but it seems a worthy goal for the next beta, if it's not
> possible already.  Given the opportunity to change the current WAL format, 
> it ought to be possible to avoid even needing to run a program to generate 
> an empty WAL.

This was my question too.  If we are just changing WAL, why can't we
just have them stop the postmaster, install the new binaries, and
restart?

Tom told me on the phone that there was a magic number in the WAL log
file, and I see it now:
#define XLOG_PAGE_MAGIC 0x17345168

Couldn't we just have our new beta ignore WAL pages with this entry,
knowing that startup/shutdown creates new WAL files anyway?

Aside from inconveniencing the beta users, people can do testing easier
if we don't require a dump/reload for every WAL format change.
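
A minimal sketch of that check, assuming the magic number is stored as the first four bytes of each log page in little-endian order. Both the position and the byte order are assumptions. The old value is the one quoted above, while NEW_XLOG_PAGE_MAGIC and classify_page are invented for illustration.

```python
import struct

OLD_XLOG_PAGE_MAGIC = 0x17345168  # value quoted in the message above
NEW_XLOG_PAGE_MAGIC = 0x17345169  # hypothetical value for a changed format


def classify_page(page):
    """Classify a log page by the magic number assumed to start it."""
    if len(page) < 4:
        return "unknown"
    (magic,) = struct.unpack_from("<I", page, 0)
    if magic == NEW_XLOG_PAGE_MAGIC:
        return "current"
    if magic == OLD_XLOG_PAGE_MAGIC:
        # Written by an older beta; a new binary could skip it.
        return "old-format"
    return "unknown"


pages = [
    struct.pack("<I", OLD_XLOG_PAGE_MAGIC) + bytes(12),
    struct.pack("<I", NEW_XLOG_PAGE_MAGIC) + bytes(12),
    bytes(16),
]
print([classify_page(p) for p in pages])
```

Skipping old-format pages is only safe if nothing in them is still needed, which is the orderly-shutdown condition discussed earlier in the thread.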



Re: WAL & RC1 status

From
Tom Lane
Date:
ncm@zembu.com (Nathan Myers) writes:
> It Seems to Me that after an orderly shutdown, the WAL files should be, 
> effectively, slag -- they should contain no deltas from the current 
> table contents.  In practice that means the only part of the format that 
> *should* matter is whatever it takes to discover that they really are 
> slag.

> That *should* mean that, at worst, a change to the WAL file format should 
> only require doing an orderly shutdown, and then (perhaps) running a simple
> program to generate a new-format empty WAL.  It ought not to require an 
> initdb.  

Excellent point, considering that we were already thinking of making a
handy-dandy little utility to remove broken WAL files...  Shouldn't take
much more than that to build something that also reformats pg_control.
Thanks for the suggestion!
        regards, tom lane