Обсуждение: BUG #14195: "MultiXactId XXXXXX has not been created yet -- apparent wraparound" after upgrade from 9.2
BUG #14195: "MultiXactId XXXXXX has not been created yet -- apparent wraparound" after upgrade from 9.2
От
sean.hope@eroad.com
Дата:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDE5NQpMb2dnZWQgYnk6ICAg ICAgICAgIFNlYW4gSG9wZQpFbWFpbCBhZGRyZXNzOiAgICAgIHNlYW4uaG9w ZUBlcm9hZC5jb20KUG9zdGdyZVNRTCB2ZXJzaW9uOiA5LjQuNwpPcGVyYXRp bmcgc3lzdGVtOiAgIFVidW50dSAxMi4wNCBMVFMKRGVzY3JpcHRpb246ICAg ICAgICAKCkhpLA0KDQpXZSdyZSBoYXZpbmcgaXNzdWVzIHdpdGggJ011bHRp WGFjdElkIHh4eHh4eHh4eCBoYXMgbm90IGJlZW4gY3JlYXRlZCB5ZXQgLS0K YXBwYXJlbnQgd3JhcGFyb3VuZCcgbG9nIG1lc3NhZ2VzIHVwb24gdGFibGUg dmFjdXVtcy4gVGhlIHZhY3V1bXMgYXJlCmNhbmNlbGxlZCBieSB0aGUgc3lz dGVtIGFmdGVyIHRoaXMgbWVzc2FnZSBpcyBpc3N1ZWQuDQoNClRoZXNlIG1l c3NhZ2VzIGhhdmUgb25seSBiZWVuIHNlZW4gc2luY2UgdXBncmFkaW5nIGZy b20gOS4yLnggdG8gOS40LjcgdmlhCnBnX3VwZ3JhZGUgYWJvdXQgYSBtb250 aCBhZ28uIFRoaXMgaXMgYSBsYXJnZSAvIGhpZ2ggdHJhbnNhY3Rpb24gcmF0 ZSAoMTEyMAp0cHMpIGRhdGFiYXNlLiBXZSBhcmUgdW5hYmxlIHRvIHZhY3V1 bSB0aGUgYWZmZWN0ZWQgdGFibGVzICg1IG9mIHRoZW0pCmN1cnJlbnRseSBk dWUgdG8gdGhpcyBpc3N1ZS4NCg0KQW55IGhlbHAgbW9zdCB3ZWxjb21lLg0K DQpSZWdhcmRzLA0KU2VhbgoK
>>>>> "sean" == sean hope <sean.hope@eroad.com> writes: sean> The following bug has been logged on the website: sean> Bug reference: 14195 sean> Logged by: Sean Hope sean> Email address: sean.hope@eroad.com sean> PostgreSQL version: 9.4.7 sean> Operating system: Ubuntu 12.04 LTS sean> Description: sean> Hi, sean> We're having issues with 'MultiXactId xxxxxxxxx has not been sean> created yet -- apparent wraparound' log messages upon table sean> vacuums. The vacuums are cancelled by the system after this sean> message is issued. sean> These messages have only been seen since upgrading from 9.2.x to sean> 9.4.7 via pg_upgrade about a month ago. This is a large / high sean> transaction rate (1120 tps) database. We are unable to vacuum the sean> affected tables (5 of them) currently due to this issue. This one came up on the IRC channel and I did some analysis there. What it seems to boil down to is that, faced with a tuple with a pre-upgrade mxid, we get this call path: vacuum calls heap_prepare_freeze_tuple heap_prepare_freeze_tuple calls FreezeMultiXactId (because HEAP_XMAX_IS_MULTI) FreezeMultiXactId checks if the mxid is older than the cutoff, but because this is a pre-upgrade mxid it can be literally anything, and in this case it's in the future not the past, so that check is skipped FreezeMultiXactId calls GetMultiXactIdMembers with allow_old=true GetMultiXactIdMembers checks for wraparound, but while it downgrades the error to DEBUG1 if allow_old is true and the mxid is in the past, it unconditionally errors if the mxid is in the future. From pageinspect output this is one of the failing tuples: blk: 291; lp: 35; lp_off: 7992; lp_flags: 1; lp_len: 196; t_xmin: 2; t_xmax: 197529992; t_field3: 21; t_ctid: (291,35); t_infomask2: 32791; t_infomask: 14723; (0x3983) t_hoff: 32; bits: "111111111111111111111100000000000000000000000000000000000000000000000000"; (note: the pageinspect output was all absolutely consistent, there were no signs of any data corruption of any kind) from pg_controldata: Latest checkpoint's NextMultiXactId: 3125330305 Latest checkpoint's NextMultiOffset: 2217410182 Latest checkpoint's oldestMultiXid: 2741572175 -- Andrew (irc:RhodiumToad)
Andrew Gierth wrote: > This one came up on the IRC channel and I did some analysis there. > > What it seems to boil down to is that, faced with a tuple with a > pre-upgrade mxid, we get this call path: > > vacuum calls heap_prepare_freeze_tuple > > heap_prepare_freeze_tuple calls FreezeMultiXactId > (because HEAP_XMAX_IS_MULTI) > > FreezeMultiXactId checks if the mxid is older than the cutoff, but > because this is a pre-upgrade mxid it can be literally anything, and in > this case it's in the future not the past, so that check is skipped > > FreezeMultiXactId calls GetMultiXactIdMembers with allow_old=true > > GetMultiXactIdMembers checks for wraparound, but while it downgrades the > error to DEBUG1 if allow_old is true and the mxid is in the past, it > unconditionally errors if the mxid is in the future. I was in the middle of writing up this in another thread, and your analysis in spot on. https://www.postgresql.org/message-id/20160615233427.GA18976@alvherre.pgsql -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
sean.hope@eroad.com wrote: > Hi, > > We're having issues with 'MultiXactId xxxxxxxxx has not been created yet -- > apparent wraparound' log messages upon table vacuums. The vacuums are > cancelled by the system after this message is issued. > > These messages have only been seen since upgrading from 9.2.x to 9.4.7 via > pg_upgrade about a month ago. This is a large / high transaction rate (1120 > tps) database. We are unable to vacuum the affected tables (5 of them) > currently due to this issue. This is fixed now, finally. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=61b24fef89a9f7e381e6d83029b63e062d32812b The other thread I linked to earlier has more discussion. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services