Hi,
On 2022-08-03 14:23:34 +0000, PG Bug reporting form wrote:
> I think some process may have loaded btree metapage (page 0) into shared
> buffers prior the end of _bt_load. In this case, the error is reproduced
> (14.4, 14 STABLE, HEAD):
>
> create extension pageinspect;
> create table test as select generate_series(1,1e4) as id;
> create index test_id_idx on test(id);
> # prepare gdb for this backend with breakpoint on _bt_uppershutdown
> reindex index concurrently test_id_idx ;
Worth noting that this doesn't even require reindex concurrently, it's also an
issue for CIC.
The problem basically is that once the first non-meta page from the btree is
written (e.g. _bt_blwritepage() calling smgrextend()), concurrent sessions can
read in the metapage (and potentially other pages that are also zero filled)
into shared_buffers. at the end of _bt_uppershutdown() we'll write the
metapage to disk, bypassing shared buffers. And boom, the all-zeroes version
read into memory earlier is suddenly out of date.
The easiest fix is likely to force all buffers to be forgotten at the end of
index_concurrently_build() or such. I don't immediately see a nicer way to fix
this, we can't just lock the new index relation exclusively.
We could of course also stop bypassing s_b for CIC/RIC, but that seems mighty
invasive for a bugfix.
Greetings,
Andres Freund