visibility map corruption
От | Floris Van Nee |
---|---|
Тема | visibility map corruption |
Дата | |
Msg-id | 87da83168c644fd9aae38f546cc70295@opammb0562.comp.optiver.com обсуждение исходный текст |
Ответы |
Re: visibility map corruption
(Peter Geoghegan <pg@bowt.ie>)
|
Список | pgsql-hackers |
Hi hackers, We recently ran into an issue where the visibility map of a relation was corrupt, running Postgres 12.4. The error we'd getwhen running a SELECT * from this table is: could not access status of transaction 3704450152 DETAIL: Could not open file "pg_xact/0DCC": No such file or directory. On the lists I could find several similar reports, but corruption like this could obviously have a very wide range of rootcauses.. see [1] [2] [3] for example - not all of them have their root cause known. This particular case was similar to reported cases above, but also has some differences. The following query returns ~21.000 rows, which indicates something inconsistent between the visibility map and the pd_all_visibleflag on the page: select * from pg_check_frozen('tbl'); Looking at one of the affected pages with pageinspect: =# SELECT lp,lp_off,lp_flags,lp_len,t_xmin,t_xmax,t_field3,t_ctid,t_infomask2,t_infomask,t_hoff,t_oid FROM heap_page_items(get_raw_page('tbl',726127)); ┌────┬────────┬──────────┬────────┬────────────┬────────────┬──────────┬────────────┬─────────────┬────────────┬────────┬───────┐ │ lp │ lp_off │ lp_flags │ lp_len │ t_xmin │ t_xmax │ t_field3 │ t_ctid │ t_infomask2 │ t_infomask │ t_hoff │t_oid │ ├────┼────────┼──────────┼────────┼────────────┼────────────┼──────────┼────────────┼─────────────┼────────────┼────────┼───────┤ │ 1 │ 6328 │ 1 │ 1864 │ 3704450155 │ 3704450155 │ 1 │ (726127,1) │ 249 │ 8339 │ 56 │ ∅ │ │ 2 │ 4464 │ 1 │ 1864 │ 3704450155 │ 3704450155 │ 1 │ (726127,2) │ 249 │ 8339 │ 56 │ ∅ │ │ 3 │ 2600 │ 1 │ 1864 │ 3704450155 │ 3704450155 │ 1 │ (726127,3) │ 249 │ 8339 │ 56 │ ∅ │ │ 4 │ 680 │ 1 │ 1920 │ 3704450155 │ 3704450155 │ 1 │ (726127,4) │ 249 │ 8339 │ 56 │ ∅ │ └────┴────────┴──────────┴────────┴────────────┴────────────┴──────────┴────────────┴─────────────┴────────────┴────────┴───────┘ t_infomask shows that HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID bits are both unset. This pg_visibility() call shows the inconsistency between VM and page, with PD_ALL_VISIBLE=false =# select * from pg_visibility('tbl', 726127); ┌─────────────┬────────────┬────────────────┐ │ all_visible │ all_frozen │ pd_all_visible │ ├─────────────┼────────────┼────────────────┤ │ t │ t │ f │ └─────────────┴────────────┴────────────────┘ Looking at other pages show the same information. What's interesting is that out of the affected tuples returned by pg_check_frozen, over 99% belong to 1 transaction (3704450155as seen above) and the remaining few are from one other transaction that occurred at roughly the same time. I find it hard to believe that this is due to some random bit flipping, because many pages are affected, but the "incorrect"ones are in total only from two specific transactions which occurred at roughly the same time. There were alsono server crashes or other known failures around the time of this transaction. I'm not ruling out any other kind of failurestill, but I also cannot really explain how this could have happened. The server has PG12.4 with full_page_writes=on, data_checksums=off. It's a large analytics database. The VM inconsistenciesalso occur on the streaming replicas. I realize these cases are pretty rare and hard to debug, but I wanted to share the information I found so far here for reference.Maybe someone has an idea what occurred, or maybe someone in the future finds it useful when he finds somethingsimilar. I have no idea how the inconsistency between VM and PD_ALL_VISIBLE started - from looking through the code I can't reallyfind any way how this could occur. However, for it to lead to the problem described here, I believe there should be*no* SELECT that touches that particular page after the insert/update transaction and before the transaction log gets truncated.If the page is read before the transaction log gets truncated, then the hint bit HEAP_XMIN_COMMITTED will get setand future reads will succeed regardless of tx log truncation. One of the replica's had this happen to it: the affectedpages are identical to the primary except that the HEAP_XMIN_COMMITTED flag is set (note that the VM inconsistencyis still there on the replica though: PD_ALL_VISIBLE=false even though VM shows that all_frozen=all_visible=true).But I can query these rows on the replica without issues, because it doesn't check the tx logwhen it sees that HEAP_XMIN_COMMITTED is set. -Floris [1] https://postgrespro.com/list/thread-id/2422376 [2] https://postgrespro.com/list/thread-id/2501800 [3] https://postgrespro.com/list/thread-id/2321949
В списке pgsql-hackers по дате отправления: