Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY

From Andres Freund
Subject Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
Msg-id 20220524190133.j6ee7zh4f5edt5je@alap3.anarazel.de
In reply to Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY  (Greg Stark <stark@mit.edu>)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY  (Andrey Borodin <x4mmm@yandex-team.ru>)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-bugs
Hi,

On 2022-05-24 23:38:07 +0500, Andrey Borodin wrote:
>
>
> > On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote:
> >
> > With fsync=on, it's much harder to reproduce.
> That explains why it's easier to reproduce on macOS: it seems to ignore fsync.

Yea, one needs wal_sync_method=fsync_writethrough or such :/


> > On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote:
> >
> > I suspect the problem might be related to pruning done during the validation
> > scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids
> > its own snapshot might need. Which will wreak havoc during the validation
> > scan.
>
> I observe that removing PROC_IN_SAFE_IC for index_validate() fixes the tests.
> But why is it not a problem for the index_build() scan?

I now suspect it's a problem for both, just more visible for index_validate().


> And I do not understand why it's a problem that a tuple is pruned during the
> scan... How does this "wreak havoc" happen?

Basically snapshots don't work anymore. If PROC_IN_SAFE_IC is set, that
backend is ignored for the horizon computation for snapshots / on-access HOT
pruning. Which means that rows that are visible to the snapshot can be pruned
away.
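
To make the horizon point concrete, here is a toy sketch in Python. It is not PostgreSQL code: the Backend class, the in_safe_ic field (standing in for PROC_IN_SAFE_IC), prune_horizon() and all xid values are made up for illustration, and the real horizon computation is considerably more involved. It only shows that skipping a flagged backend moves the pruning horizon past that backend's own xmin.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    name: str
    xmin: Optional[int]        # oldest xid the backend's snapshot still needs
    in_safe_ic: bool = False   # toy stand-in for the PROC_IN_SAFE_IC flag


def prune_horizon(backends: List[Backend], next_xid: int) -> int:
    """Return the xid below which dead row versions may be pruned (toy model)."""
    # Backends flagged in_safe_ic are skipped, so their xmin protects nothing.
    xmins = [b.xmin for b in backends
             if b.xmin is not None and not b.in_safe_ic]
    return min(xmins, default=next_xid)


s2 = Backend("S2", xmin=105)

# S1 counted: the horizon is held back to S1's xmin.
horizon_s1_counted = prune_horizon(
    [Backend("S1", xmin=100), s2], next_xid=110)

# S1 flagged: the horizon advances to S2's xmin, past what S1 still needs.
horizon_s1_ignored = prune_horizon(
    [Backend("S1", xmin=100, in_safe_ic=True), s2], next_xid=110)
```

With S1 flagged, anything deleted by xids 100 through 104 becomes prunable even though S1's snapshot may still consider it visible.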

One might think that could be safe, after all the row is invisible to all
other backends. The problem is that the validation scan won't see *newer* rows
either, since they're not visible to the snapshot either. And if the new row
version is a HOT tuple, it won't have made an index entry on its own. Boom,
corruption.

Basically:

1) S1 builds index in phase 2
2) S2 inserts tuple t1 (not in the index built in 1, since it's inserted
   after that)
3) S2 hot updates tuple t1->t2
4) S1 sets PROC_IN_SAFE_IC, builds snapshot, starts validation scan (phase 3)
5) S2 hot updates tuple t2->t3
6) Either S1 or S2 performs hot pruning, redirecting t1 to t3; this is only
   possible because PROC_IN_SAFE_IC caused S1's ->xmin to be ignored
7) S1 checks t1->t3, finds that t3 is too new for the snapshot, doesn't create
   an index entry
8) corruption
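
The sequence above can be replayed with a toy model in Python. This is only a sketch under loud assumptions: Version, visible(), prune() and validate() are hypothetical helpers, every xid is treated as committed, a snapshot simply sees all xids below snap_xid, and pruning is reduced to dropping chain members whose replacing xid is older than the horizon. None of it mirrors the actual heapam or index_validate() code.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Version:
    name: str
    xmin: int              # xid that created this row version
    xmax: Optional[int]    # xid that replaced it, None if still live


def visible(v: Version, snap_xid: int) -> bool:
    # Toy rule: the snapshot sees exactly the xids below snap_xid.
    created = v.xmin < snap_xid
    replaced = v.xmax is not None and v.xmax < snap_xid
    return created and not replaced


def prune(chain: List[Version], horizon: int) -> List[Version]:
    # Drop versions whose replacing xid is older than the horizon.
    return [v for v in chain if v.xmax is None or v.xmax >= horizon]


def validate(chain: List[Version], snap_xid: int) -> Set[str]:
    # The validation scan adds an entry for the chain root only if some
    # member of the HOT chain is visible to its snapshot.
    index: Set[str] = set()
    if any(visible(v, snap_xid) for v in chain):
        index.add("root")
    return index


# Steps 2, 3 and 5: insert t1 (xid 10), t1->t2 (xid 11), t2->t3 (xid 13).
chain = [Version("t1", 10, 11), Version("t2", 11, 13), Version("t3", 13, None)]
snap_xid = 12  # step 4: S1's snapshot is taken between the two updates

# Horizon honours S1's snapshot: t2 survives pruning and is visible.
index_s1_counted = validate(prune(chain, horizon=12), snap_xid)

# Horizon ignores S1: t2 is pruned, only the too-new t3 remains.
index_s1_ignored = validate(prune(chain, horizon=14), snap_xid)
```

In the second run the chain still holds a live row (t3), yet the index ends up with no entry for it, which is the missing-record symptom from the bug report.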


Greetings,

Andres Freund


