Обсуждение: Xmax precedes relation freeze threshold errors
Hi,
We're on Postgresql version 14.1. After running pg_amcheck on one of
our databases, we see lots of the same errors for a lot of tables.
Here's an example run for a single table:
/usr/lib/postgresql/14/bin/pg_amcheck -h /var/run/pgsql15/ -d poker -t
public.user_tournament --heapallindexed
heap table “poker.public.user_tournament”, block 30356, offset 125:
xmax 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 30356, offset 145:
xmin 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 30601, offset 106:
xmax 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 30601, offset 111:
xmin 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 30645, offset 2:
xmax 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 30645, offset 3:
xmin 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 31416, offset 8:
xmax 634989520 precedes relation freeze threshold 12:634871433
heap table “poker.public.user_tournament”, block 31416, offset 9:
xmin 634989520 precedes relation freeze threshold 12:634871433
VACUUM FREEZE on this table completes successfully, but pg_amcheck
errors still remain. What are the implications behind this error? Does
this mean some kind of a silent data corruption? Or are they harmless?
Are there any ways to fix them/prevent them appearing in the future?
Best regards,
Sergey Aleynikov
On Tue, Jun 14, 2022, at 2:27 PM, Sergey Aleynikov wrote: > /usr/lib/postgresql/14/bin/pg_amcheck -h /var/run/pgsql15/ -d poker -t > public.user_tournament --heapallindexed > heap table “poker.public.user_tournament”, block 30356, offset 125: > xmax 634989520 precedes relation freeze threshold 12:634871433 Hmm, some wild guesses: - There used to be a bug whereby VACUUM failed to truncate pages at the end of a relation when all their tuples were removed;in order for this to happen, you need some disaster to occur at the end of vacuum (such as a system crash at justthat time, or a NFS failure). This would cause these tuples, which should have disappeared, to remain. This is hardto solve: you have to figure out which tuples were those that should have been removed, and delete them, while at thesame time retaining any tuples that were added to those pages afterwards. This is not super common, but it's definitelya possibility. - Maybe there's some bug in amcheck that causes it to report tuple with an old xmax but which in reality are frozen? I don'tthink this is very likely, but In order to discard this hypothesis, you'd have to show the output of `heap_page_items`from the pages in question, or at least give some thought to the bits in `t_infomask`. - Maybe you promoted a standby in some wrong way. I don't know what this entails, but I've seen it claimed that failing tofollow the documented procedures exactly, you might end up with broken data pages. If these block numbers are, or were at some time in the past, near the end of the table, then the first possibility soundsthe most likely of these three. However, if you have dozens of tables with the same problem, there might be somethingelse going on.
On Tue, Jun 14, 2022 at 7:12 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote: > - Maybe you promoted a standby in some wrong way. I don't know what this entails, but I've seen it claimed that failingto follow the documented procedures exactly, you might end up with broken data pages. This instance is very unlikely to ever be promoted to master, at least not in the recent 5 years while it resides at it's current machine). The instance has been upgraded a number of times though with pg_upgrade, recently from 11->14. > - Maybe there's some bug in amcheck that causes it to report tuple with an old xmax but which in reality are frozen? Idon't think this is very likely, but In order to discard this hypothesis, you'd have to show the output of `heap_page_items`from the pages in question, or at least give some thought to the bits in `t_infomask`. I've run verify_heapam() in a loop for a single table. Some runs it shows nothing (i've seen three consequent empty runs), and on each run it shows something - it reports different pages. Running verify_heapam() in a transaction still results in the same kind of different output. Running it under a LOCK TABLES always (11 runs) results in no output. So I can't get a page to dump it's content - it's never the same. Does this means it's unlikely to be a VACUUM-related issue? Best regards, Sergey Aleynikov
On 2022-Jun-14, Sergey Aleynikov wrote:
> On Tue, Jun 14, 2022 at 7:12 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > - Maybe there's some bug in amcheck that causes it to report tuple
> > with an old xmax but which in reality are frozen? I don't think
> > this is very likely, but In order to discard this hypothesis, you'd
> > have to show the output of `heap_page_items` from the pages in
> > question, or at least give some thought to the bits in `t_infomask`.
>
> I've run verify_heapam() in a loop for a single table. Some runs it
> shows nothing (i've seen three consequent empty runs), and on each run
> it shows something - it reports different pages.
Oh, that definitely sounds like a bug in amcheck then. Please post the
output of
select * from heap_page_items(get_raw_page('the table', the_page));
for some of these pages, preferrably just around the time the errors for
these pages are shown; maybe someone can figure out which tuples cause
these errors and how to fix amcheck. It might be difficult to catch
tuples 'in the act' though.
Maybe you can script things so that amcheck is run and the output feeds
a script that runs heap_page_items.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can." (Ken Rockwell)
On Tue, Jun 14, 2022 at 10:43 AM Sergey Aleynikov <sergey.aleynikov@gmail.com> wrote: > This instance is very unlikely to ever be promoted to master, at least > not in the recent 5 years while it resides at it's current machine). > The instance has been upgraded a number of times though with > pg_upgrade, recently from 11->14. If I had to guess, I'd guess that it is a pg_upgrade issue. Commit 74cf7d46 from 2021-07-26 fixed an issue in pg_upgrade on all stable branches. Prior to that commit, pg_upgrade failed to account for the fact that pg_resetxlog/pg_resetwal carried forward a relation's pg_class.relfrozenxid in a way that wasn't appropriate for pg_upgrade (we should just carry forward the relfrozenxid value without changing anything in pg_upgrade). I wouldn't be surprised to learn that this only visibly led to problems with one particular heap relation due to "phase of the moon" type details tied to the specifics at the time of the upgrade. On the other hand, I don't know what to make of the fact that it seems to come and go. Maybe it is just an amcheck bug. -- Peter Geoghegan
On Tue, Jun 14, 2022 at 8:48 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Oh, that definitely sounds like a bug in amcheck then. Please post the
> output of
> select * from heap_page_items(get_raw_page('the table', the_page));
> for some of these pages, preferrably just around the time the errors for
> these pages are shown; maybe someone can figure out which tuples cause
> these errors and how to fix amcheck. It might be difficult to catch
> tuples 'in the act' though.
I've switched to another table with less update traffic and was able
to catch the following:
(verify_heapam before)
blkno | offnum | attnum | msg
-------+--------+--------+-----------------------------------------------------------------
20365 | 8 | | xmax 1965224948 precedes relation freeze
threshold 1:1914858032
20365 | 12 | | xmin 1965224948 precedes relation freeze
threshold 1:1914858032
(2 rows)
(heap_page_items dump) - https://dpaste.org/fWzEj/raw
(verify_heapam after)
blkno | offnum | attnum | msg
-------+--------+--------+-----------------------------------------------------------------
20365 | 12 | | xmax 1965224955 precedes relation freeze
threshold 1:1914858032
20365 | 15 | | xmin 1965224955 precedes relation freeze
threshold 1:1914858032
(2 rows)
If this is not enough, I can try to catch even more matching before/after pairs.
Best regards,
Sergey Aleynikov
On 2022-Jun-16, Sergey Aleynikov wrote:
> (verify_heapam before)
> blkno | offnum | attnum | msg
> -------+--------+--------+-----------------------------------------------------------------
> 20365 | 8 | | xmax 1965224948 precedes relation freeze threshold 1:1914858032
> 20365 | 12 | | xmin 1965224948 precedes relation freeze threshold 1:1914858032
> (2 rows)
Hmm, well, these errors make no sense, because 1965224948 is definitely
not earlier than the stated threshold of 1914858032. There might be a
bug in the amcheck conversion of the epochless XID value to FullXid, or
in the comparison functions.
For the record, the values for this page are below. For (20365,8) the
infomask is 0x2102 which is HEAP_UPDATED | HEAP_XMIN_COMMITTED; there's
no bit for Xmax. The next one in the chain is the other
complained-about tuple, (20365,12) which has a matching Xmin (no
surprise there) and the infomask is 0x2092 which is HEAP_UPDATED |
HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_KEYSHR_LOCK. So this is a tuple that
was updated and then the new version was used by SELECT FOR KEY SHARE.
Nothing strange here.
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff
|t_bits | t_oid |
t_data
----+--------+----------+--------+------------+------------+----------+------------+-------------+------------+--------+--------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | 25 | 2 | 0 | | | | | | |
| | |
2 | 4504 | 1 | 204 | 1965224930 | 1965224938 | 6 | (20365,8) | 57347 | 9474 | 24
| | |
\016\301\177\005\000\000\000\000\300\271\261\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,20,1655368127]]
3 | 8040 | 1 | 147 | 1852898199 | 0 | 38 | (20365,3) | 3 | 11010 | 24
| | |
\205<\207\005\000\000\000\000\200\317f\023/\202\002\000\327[[1,0,1649925421],[2,3,1652762654],[4,1,1336885001],[7,0,1652762654],[10,49,1649925346],[16,0,1649925383]]
4 | 27 | 2 | 0 | | | | | | |
| | |
5 | 7768 | 1 | 272 | 1964963817 | 1964963897 | 7 | (20365,5) | 40963 | 10690 | 24
| | |
\035(\345\005\000\000\000\000\300\301\017\300\213\204\002\000\240\003\000\000[[1,0,1655356016],[2,0,1655355988],[3,0,1655359711],[4,1,1318680670],[6,1,1580191566],[7,0,1655355987],[8,4,1580191566],[9,3,1580191566],[10,39,1655358356],[12,1,1580280953],[13,3,1580191566],[14,0,1580191566],[16,0,1594455141]]
6 | 7488 | 1 | 273 | 1904288297 | 0 | 7 | (20365,6) | 32771 | 11010 | 24
| | |
H\032\311\005\000\000\000\000\3007\252\326`\203\002\000\244\003\000\000[[1,1,1638359432],[2,8,1654075863],[3,0,1654075895],[4,1,1411368668],[6,0,1643527342],[7,0,1654075863],[8,4,1586241825],[9,3,1586241825],[10,48,1653909745],[12,1,1586241825],[13,3,1586241825],[14,0,1586241825],[16,30,1586241825]]
7 | 32 | 2 | 0 | | | | | | |
| | |
8 | 4296 | 1 | 204 | 1965224938 | 1965224948 | 6 | (20365,12) | 57347 | 8450 | 24
| | |
\016\301\177\005\000\000\000\000\300\271\261\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,19,1655368127]]
9 | 6 | 2 | 0 | | | | | | |
| | |
10 | 29 | 2 | 0 | | | | | | |
| | |
11 | 7208 | 1 | 273 | 1965056934 | 1965056934 | 7 | (20365,11) | 32771 | 10642 | 24
| | |
\377\232\310\005\000\000\000\000\200\271\334q\214\204\002\000\244\003\000\000[[1,0,1655361162],[2,5,1655361996],[3,0,1655362694],[4,1,1358874131],[6,0,1655055632],[7,0,1655361158],[8,4,1580366031],[9,3,1639915217],[10,46,1655361947],[12,1,1653134366],[13,3,1580366031],[14,0,1580366031],[16,30,1580366031]]
12 | 4088 | 1 | 204 | 1965224948 | 1965224948 | 6 | (20365,12) | 32771 | 8338 | 24
| | |
\016\301\177\005\000\000\000\000\000\374\300\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,18,1655368128]]
13 | 7160 | 1 | 43 | 1957335917 | 1964638027 | 1 | (20365,13) | 8195 | 2498 | 24
| | | \310jo\007\000\000\000\000\200\027k\264b\204\002\000\007[]
14 | 16 | 2 | 0 | | | | | | |
| | |
15 | 0 | 0 | 0 | | | | | | |
| | |
16 | 6880 | 1 | 273 | 1965223751 | 1965223751 | 8 | (20365,16) | 32771 | 10642 | 24
| | |
\347\243\236\005\000\000\000\000\300h\214\263\215\204\002\000\244\003\000\000[[1,0,1655240440],[2,9,1655368091],[3,0,1654972875],[4,1,1397413995],[6,0,1654175940],[7,0,1655368091],[8,4,1582557307],[9,3,1582557307],[10,47,1654197675],[12,1,1629279335],[13,3,1587923475],[14,0,1582557307],[16,30,1648131697]]
17 | 0 | 0 | 0 | | | | | | |
| | |
18 | 0 | 0 | 0 | | | | | | |
| | |
19 | 0 | 0 | 0 | | | | | | |
| | |
20 | 0 | 0 | 0 | | | | | | |
| | |
21 | 0 | 0 | 0 | | | | | | |
| | |
22 | 0 | 0 | 0 | | | | | | |
| | |
23 | 0 | 0 | 0 | | | | | | |
| | |
24 | 0 | 0 | 0 | | | | | | |
| | |
25 | 6640 | 1 | 238 | 1490075747 | 0 | 5 | (20365,25) | 32771 | 11010 | 24
| | |
\373\021J\007\000\000\000\000\300\207\236\331|z\002\000\030\003\000\000[[1,0,1644300599],[2,10,1643357339],[3,4,1643109769],[6,0,1643357339],[7,0,1643365735],[8,4,1643357339],[9,3,1643357339],[12,0,1643360113],[13,3,1643108769],[14,0,1643357339],[16,30,1643357339]]
26 | 0 | 0 | 0 | | | | | | |
| | |
27 | 6360 | 1 | 273 | 1962697004 | 1962697004 | 7 | (20365,27) | 32771 | 10642 | 24
| | |
\3342\251\005\000\000\000\000\200\020\230\276x\204\002\000\244\003\000\000[[1,1,1655183659],[2,9,1655278082],[3,4,1655217231],[4,1,1646965839],[6,0,1655275445],[7,0,1655278082],[8,4,1636371136],[9,3,1636371136],[10,46,1655182361],[12,1,1648303165],[13,3,1636371136],[14,0,1636371136],[16,30,1636371136]]
28 | 0 | 0 | 0 | | | | | | |
| | |
29 | 6080 | 1 | 273 | 1964885708 | 1964885708 | 9 | (20365,29) | 32771 | 10642 | 24
| | |
y\031\300\006\000\000\000\000\300=&.\213\204\002\000\244\003\000\000[[1,0,1655356874],[2,0,1655356785],[3,3,1655356828],[4,0,1607932548],[6,0,1593701110],[7,0,1655356785],[8,4,1589636764],[9,3,1589636764],[10,45,1655357263],[12,0,1655223006],[13,2,1633191479],[14,0,1589636764],[16,22,1655285002]]
30 | 5 | 2 | 0 | | | | | | |
| | |
31 | 6032 | 1 | 43 | 1529256935 | 0 | 1 | (20365,31) | 3 | 2818 | 24
| | | z\315N\007\000\000\000\000\000\207\037)0{\002\000\007[]
32 | 5752 | 1 | 273 | 1964761973 | 1964761973 | 7 | (20365,32) | 32771 | 10642 | 24
| | |
\377\020\322\005\000\000\000\000\000d\303+\212\204\002\000\244\003\000\000[[1,1,1635340514],[2,8,1655352336],[3,0,1655352928],[4,1,1324578627],[6,0,1652362638],[7,0,1655352336],[8,4,1580283680],[9,3,1580283680],[10,49,1655321899],[12,1,1623273443],[13,3,1580283680],[14,0,1580283680],[16,30,1580283680]]
33 | 0 | 0 | 0 | | | | | | |
| | |
34 | 0 | 0 | 0 | | | | | | |
| | |
35 | 5472 | 1 | 273 | 1962530750 | 1962530750 | 9 | (20365,35) | 32771 | 10642 | 24
| | |
"\273o\005\000\000\000\000\200\024\276\206w\204\002\000\244\003\000\000[[1,0,1655272251],[2,6,1655272847],[3,0,1655110969],[4,1,1364884674],[6,0,1649045620],[7,0,1655272847],[8,4,1621844466],[9,3,1621844466],[10,49,1655272850],[12,1,1635832362],[13,3,1621821962],[14,0,1621844466],[16,30,1621844466]]
36 | 5264 | 1 | 204 | 1965224703 | 1965224917 | 6 | (20365,38) | 57347 | 9474 | 24
| | |
\016\301\177\005\000\000\000\000\300\2477\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,23,1655368119]]
37 | 5128 | 1 | 129 | 1964847563 | 1964847563 | 7 | (20365,37) | 32771 | 10642 | 24
| | |
\201A\250\005\000\000\000\000\300\215\253\345\212\204\002\000\263[[1,0,1655355920],[2,6,1655355999],[3,0,1655356047],[7,0,1655355997],[10,46,1655356007]]
38 | 4920 | 1 | 204 | 1965224917 | 1965224925 | 6 | (20365,39) | 57347 | 9474 | 24
| | |
\016\301\177\005\000\000\000\000\200w\242\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,22,1655368126]]
39 | 4712 | 1 | 204 | 1965224925 | 1965224930 | 6 | (20365,2) | 57347 | 9474 | 24
| | |
\016\301\177\005\000\000\000\000\200w\242\265\215\204\002\000\220\002\000\000[[1,0,1655367415],[2,0,1655368111],[3,0,1655316514],[4,1,1345141307],[7,0,1655368111],[10,46,1655368114],[12,0,1655316561],[13,2,1602489365],[16,21,1655368126]]
40 | 0 | 0 | 0 | | | | | | |
| | |
41 | 0 | 0 | 0 | | | | | | |
| | |
42 | 0 | 0 | 0 | | | | | | |
| | |
43 | 0 | 0 | 0 | | | | | | |
| | |
44 | 0 | 0 | 0 | | | | | | |
| | |
45 | 35 | 2 | 0 | | | | | | |
| | |
46 | 0 | 0 | 0 | | | | | | |
| | |
47 | 37 | 2 | 0 | | | | | | |
| | |
48 | 11 | 2 | 0 | | | | | | |
| | |
49 | 0 | 0 | 0 | | | | | | |
| | |
50 | 0 | 0 | 0 | | | | | | |
| | |
51 | 36 | 2 | 0 | | | | | | |
| | |
(51 rows)
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Always assume the user will do much worse than the stupidest thing
you can imagine." (Julien PUYDT)
A bit more info - after upgrading to 14.4 + REINDEX, those errors still persist. So, even more likely to be something with amcheck. Best regards, Sergey Aleynikov