BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica

From: PG Bug reporting form
Subject: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
Date:
Msg-id 17401-9df851bb16dde397@postgresql.org
Responses: Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica  (Andrey Borodin <x4mmm@yandex-team.ru>)
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17401
Logged by:          Ben Chobot
Email address:      bench@silentmedia.com
PostgreSQL version: 12.9
Operating system:   Linux (Ubuntu)
Description:

This bug is almost identical to BUG #17389, which I filed blaming
pg_repack; however, further testing shows the same symptoms using vanilla
REINDEX TABLE CONCURRENTLY.

1. Put some data in a table with a single btree index:
create table public.simple_test (id int primary key);
insert into public.simple_test(id) (select generate_series(1,1000));

2. Set up streaming replication to a secondary db.

3. In a loop on the primary, concurrently REINDEX that table:
while true; do psql -tAc "select now(),relfilenode from pg_class where
relname='simple_test_pkey'" >> log; psql -tAc "reindex table concurrently
public.simple_test"; done

4. In a loop on the secondary, have psql query the secondary db for an
indexed value of that table:
while true; do psql -tAc "select count(*) from simple_test where id='3';
select relfilenode from pg_class where relname='simple_test_pkey'" || break;
done; date
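
The primary-side loop in step 3 logs only relfilenode, whereas the error seen on the replica reports a relation OID; these are separate columns in pg_class. A minimal sketch of one loop iteration that logs both, so the two sides can be compared directly. The function name reindex_once and the LOG variable are hypothetical and not part of the original report:

```shell
#!/bin/sh
# Hypothetical variant of one iteration of the step 3 loop.
# LOG is an assumed variable name; it defaults to the file "log" used above.
LOG="${LOG:-log}"

reindex_once() {
  # Record the timestamp, OID and relfilenode of the index before rebuilding it.
  psql -tAc "select now(), oid, relfilenode from pg_class where relname='simple_test_pkey'" >> "$LOG" || return 1
  # Rebuild the table's indexes without blocking writers.
  psql -tAc "reindex table concurrently public.simple_test"
}
```

It could be driven with `while reindex_once; do :; done`, which stops as soon as either psql call fails.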

With those 4 steps, the client on the replica will reliably fail to open the
index by its OID within 30 minutes of looping ("ERROR:  could not open
relation with OID 6715827"). When we run the same client loop on the primary
instead of the replica, or if we reindex without the CONCURRENTLY clause,
then the client loop will run for hours without failing, but neither of those
workarounds is an option for us in production.
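
To match a failure against what was logged on the primary, the OID can be pulled out of the error text. A minimal sketch, assuming the message format quoted above; extract_failed_oid is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: extract the OID from psql error output such as
#   ERROR:  could not open relation with OID 6715827
# Prints the OID for each matching line, or nothing if no line matches.
extract_failed_oid() {
  printf '%s\n' "$1" |
    sed -n 's/.*could not open relation with OID \([0-9][0-9]*\).*/\1/p'
}
```

For example, `extract_failed_oid "$(psql -tAc "select count(*) from simple_test where id='3'" 2>&1)"` would print the OID only when the query fails with that error. Note that the step 3 log records relfilenode, which is a different pg_class column from the OID, so the extracted value may not appear in that log verbatim.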

As I said before, this isn't a new problem - we've seen it since at least
9.5 - but pre-12 we saw it using pg_repack, which is an easy (and
reasonable) scapegoat. But now that we've upgraded to 12 and are still
seeing it using vanilla concurrent reindexing, it seems clearer that this is
an actual Postgres bug?

