Discussion: Bug in amcheck?
Hi hackers.
We see the following error reported by amcheck (I have added a dump of
opaque) when it runs concurrently with autovacuum and the autovacuum is canceled:
ERROR: mismatch between parent key and child high key in index "pg_attribute_relid_attnam_index"
DETAIL: Target block=274, target opaque->flags=0, child block=427, child opaque=11, target page lsn=1/484A8FC8.
CONTEXT: SQL statement "SELECT bt_index_parent_check(indexrelid, true, true) from pg_index"
So the child page has the BTP_HALF_DEAD bit set.
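For reference, the reported value can be decoded with a small standalone program. The BTP_* bit values below are the ones defined in src/include/access/nbtree.h. Two things are assumptions: that the added dump prints btpo_flags in hexadecimal (the message does not say so), and the helper name describe_btp_flags, which is hypothetical and not part of amcheck. Under the hexadecimal assumption, 0x11 decodes to BTP_LEAF plus BTP_HALF_DEAD, which matches the statement above. Read as decimal, 11 would be BTP_LEAF|BTP_ROOT|BTP_META, which is not a plausible combination for a child page.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Flag bits as defined in src/include/access/nbtree.h */
#define BTP_LEAF             (1 << 0)   /* leaf page, i.e. not internal page */
#define BTP_ROOT             (1 << 1)   /* root page (has no parent) */
#define BTP_DELETED          (1 << 2)   /* page has been deleted from tree */
#define BTP_META             (1 << 3)   /* meta-page */
#define BTP_HALF_DEAD        (1 << 4)   /* empty, but still in tree */
#define BTP_SPLIT_END        (1 << 5)   /* rightmost page of split group */
#define BTP_HAS_GARBAGE      (1 << 6)   /* page has LP_DEAD tuples (deprecated) */
#define BTP_INCOMPLETE_SPLIT (1 << 7)   /* right sibling's downlink is missing */
#define BTP_HAS_FULLXID      (1 << 8)   /* contains BTDeletedPageData */

/*
 * Hypothetical helper (not part of amcheck): write the names of all bits
 * set in "flags" into buf, separated by '|', and return buf.  The output
 * is silently truncated if buf is too small.
 */
static const char *
describe_btp_flags(uint16_t flags, char *buf, size_t buflen)
{
    static const struct
    {
        uint16_t    bit;
        const char *name;
    } names[] = {
        {BTP_LEAF, "BTP_LEAF"},
        {BTP_ROOT, "BTP_ROOT"},
        {BTP_DELETED, "BTP_DELETED"},
        {BTP_META, "BTP_META"},
        {BTP_HALF_DEAD, "BTP_HALF_DEAD"},
        {BTP_SPLIT_END, "BTP_SPLIT_END"},
        {BTP_HAS_GARBAGE, "BTP_HAS_GARBAGE"},
        {BTP_INCOMPLETE_SPLIT, "BTP_INCOMPLETE_SPLIT"},
        {BTP_HAS_FULLXID, "BTP_HAS_FULLXID"},
    };
    size_t      used = 0;
    size_t      i;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        int         n;

        if ((flags & names[i].bit) == 0)
            continue;

        n = snprintf(buf + used, buflen - used, "%s%s",
                     used > 0 ? "|" : "", names[i].name);
        if (n < 0 || (size_t) n >= buflen - used)
            break;              /* out of space: stop with truncated output */
        used += (size_t) n;
    }
    return buf;
}
```

Calling describe_btp_flags(0x11, buf, sizeof(buf)) with a 128-byte buffer fills buf with BTP_LEAF|BTP_HALF_DEAD.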
Autovacuum is interrupted in this place in _bt_pagedel:
    /*
     * Check here, as calling loops will have locks held, preventing
     * interrupts from being processed.
     */
    CHECK_FOR_INTERRUPTS();
Reproducing it is not so easy.
First of all, I added a sleep here:
    /*
     * Check here, as calling loops will have locks held, preventing
     * interrupts from being processed.
     */
    pg_usleep(10000);
    CHECK_FOR_INTERRUPTS();
Then I created two procedures:
create or replace procedure create_tables(tables integer, partitions integer) as $$
declare
    i integer;
    j integer;
begin
    for i in 1..tables
    loop
        execute 'DROP TABLE IF EXISTS t_' || i;
        execute 'CREATE TABLE t_' || i || '(pk integer) partition by range (pk)';
        for j in 1..partitions
        loop
            execute 'create table p_'||i||'_'||j||' partition of t_'||i||' for values from ('||j||') to ('||(j + 1)||')';
        end loop;
        execute 'insert into t_'||i||' values (generate_series(1,'||partitions||'))';
    end loop;
end;
$$ language plpgsql;
and
create or replace procedure run_amcheck() as $$
begin
    loop
        if (select count(*) from pg_stat_activity where backend_type='autovacuum worker') > 0
        then
            raise notice 'Run amcheck!';
            perform bt_index_parent_check(indexrelid, true, true) from pg_index;
        end if;
        perform pg_sleep(1);
    end loop;
end;
$$ language plpgsql;
Then I ran run_amcheck() concurrently with the following pgbench script:
call create_tables(2,1000);
select pg_sleep(2);
If the problem is not reproduced, then cancel run_amcheck() and restart
it once again.
Backtrace (pg16) is the following:
  * frame #0: 0x00000001017b6aac amcheck.dylib`bt_child_highkey_check(state=0x000000010c846318, target_downlinkoffnum=37, loaded_child="\U00000001", target_level=1) at verify_nbtree.c:2146:23
    frame #1: 0x00000001017b7fd8 amcheck.dylib`bt_child_check(state=0x000000010c846318, targetkey=0x000000013c01c448, downlinkoffnum=37) at verify_nbtree.c:2262:2
    frame #2: 0x00000001017b5f4c amcheck.dylib`bt_target_page_check(state=0x000000010c846318) at verify_nbtree.c:1623:4
    frame #3: 0x00000001017b3908 amcheck.dylib`bt_check_level_from_leftmost(state=0x000000010c846318, level=(level = 1, leftmost = 3, istruerootlevel = false)) at verify_nbtree.c:859:3
    frame #4: 0x00000001017b24e8 amcheck.dylib`bt_check_every_level(rel=0x0000000140074f18, heaprel=0x0000000130070148, heapkeyspace=true, readonly=true, heapallindexed=true, rootdescend=true) at verify_nbtree.c:603:13
    frame #5: 0x00000001017b198c amcheck.dylib`bt_index_check_internal(indrelid=2674, parentcheck=true, heapallindexed=true, rootdescend=true) at verify_nbtree.c:362:3
    frame #6: 0x00000001017b1a78 amcheck.dylib`bt_index_parent_check(fcinfo=0x000000010c83b040) at verify_nbtree.c:242:2
I wonder if we should add P_ISHALFDEAD(opaque) for child page?
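The idea behind that question can be illustrated with a standalone model. This is only a sketch and not a patch against verify_nbtree.c. The real P_ISHALFDEAD() macro takes a BTPageOpaque, while the stand-in below works on a bare flags word, and child_highkey_is_comparable is a hypothetical name. The rationale, taken from the amcheck comment quoted later in this thread, is that _bt_mark_page_halfdead() replaces the original high key with a dummy truncated one, so comparing it with the parent key is not meaningful.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as defined in src/include/access/nbtree.h */
#define BTP_LEAF        (1 << 0)    /* leaf page, i.e. not internal page */
#define BTP_HALF_DEAD   (1 << 4)    /* empty, but still in tree */

/*
 * Stand-in for nbtree.h's P_ISHALFDEAD(), which takes a BTPageOpaque.
 * Assumption for this sketch: we only have the raw btpo_flags value.
 */
#define FLAGS_ISHALFDEAD(flags) (((flags) & BTP_HALF_DEAD) != 0)

/*
 * Hypothetical predicate: should the child's high key be compared with
 * the parent key?  A half-dead child carries a dummy truncated high key,
 * so the comparison is skipped for it.
 */
static bool
child_highkey_is_comparable(uint16_t child_flags)
{
    return !FLAGS_ISHALFDEAD(child_flags);
}
```

With the flags from the report read as hexadecimal (0x11), the predicate returns false, so the high key comparison that raised the error would be skipped for that child.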
Hello!

> I wonder if we should add P_ISHALFDEAD(opaque) for child page?

I am not a btree expert, but here is what I was able to find so far:

In commit d114cc538715e14d29d6de8b6ea1a1d5d3e0edb4, the following check was added:

> bt_child_highkey_check(state, downlinkoffnum,
>                        child, topaque->btpo_level);

At the same time, there is a comment below it:

> * We go ahead with our checks if the child page is half-dead. It's safe
> * to do so because we do not test the child's high key, so it does not
> * matter that the original high key will have been replaced by a dummy
> * truncated high key within _bt_mark_page_halfdead(). All other page
> * items are left intact on a half-dead page, so there is still something
> * to test.

So, yes, it looks like we need to skip the child's high key test for
half-dead pages.

BTW, have you tried to create an injection_point-based reproducer?

Best regards,
Mikhail.
On 02/11/2025 2:27 PM, Mihail Nikalayeu wrote:
> Hello!
>
>> I wonder if we should add P_ISHALFDEAD(opaque) for child page?
> I am not a btree expert, but here is what I was able to find so far:
>
> In commit d114cc538715e14d29d6de8b6ea1a1d5d3e0edb4, the following check was added:
>
>> bt_child_highkey_check(state, downlinkoffnum,
>>                        child, topaque->btpo_level);
> At the same time, there is a comment below it:
>
>> * We go ahead with our checks if the child page is half-dead. It's safe
>> * to do so because we do not test the child's high key, so it does not
>> * matter that the original high key will have been replaced by a dummy
>> * truncated high key within _bt_mark_page_halfdead(). All other page
>> * items are left intact on a half-dead page, so there is still something
>> * to test.
> So, yes, it looks like we need to skip the child's high key test for
> half-dead pages.
>
> BTW, have you tried to create an injection_point-based reproducer?
>
> Best regards,
> Mikhail.

Hello Mikhail,

Thank you very much for looking at this issue, and I am very sorry for
the delay in answering.

Unfortunately, I was not able to reproduce the problem on the latest
Postgres, neither with injection points nor with my original approach
with sleeps.

Originally I investigated the customer's problem with PG16, and I
reproduced it on PG16. I checked that the relevant amcheck code has not
changed since PG16, so I thought that the problem affects all
Postgres versions. But it looks like that is not true.
Hello!

> Originally I investigated the customer's problem with PG16, and I
> reproduced it on PG16. I checked that the relevant amcheck code has not
> changed since PG16, so I thought that the problem affects all
> Postgres versions. But it looks like that is not true.

I think it is still broken, but the failure is less likely to occur.
Have you tried injection points on v16? Such a test case would make things
much clearer.

Also, I added Alexander to CC (he is the author of bt_child_highkey_check);
maybe the issue will be easy for him to understand.

Best regards,
Mikhail.