Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY

From: Andres Freund
Message-ID: 20220524222433.ibl6dgkc6jrriska@alap3.anarazel.de
In reply to: Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY (Greg Stark <stark@mit.edu>)
Responses: Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY (Michael Paquier <michael@paquier.xyz>)
List: pgsql-bugs

Hi,

On 2022-05-24 17:11:12 -0400, Greg Stark wrote:
> On Tue, 24 May 2022 at 15:02, Andres Freund <andres@anarazel.de> wrote:
> >
> > Basically:
> >
> > 1) S1 builds index in phase 2
> > 2) S2 inserts tuple t1 (not in the index built in 1, since it's inserted
> >    after that)
> > 3) S2 hot updates tuple t1->t2
> 
> Not that it matters but is this step even necessary?

I think it is, but there might be other recipes reproducing the problem.


> > 4) S1 sets PROC_IN_SAFE_IC, builds snapshot, starts validation scan (phase 3)
> > 5) S2 hot updates tuple t2->t3
> 
> That seems like the key observation. But I wonder if it's even the
> only flow where this could be an issue. What happens if t2 is deleted,
> can it get pruned away completely?

Yes it could, but afaics that'd be fine, because then there's no missing index
entry. And the index should only be marked valid once all older snapshots have
ended.


> > 6) Either S1 or S2 performs hot pruning, redirecting t1 to t3; this is only
> >    possible because PROC_IN_SAFE_IC caused S1's ->xmin to be ignored
> 
> Or presumably any other transaction.

Right.


> But ... does the update to t2->t3 not automatically trigger pruning anyways?

We don't prune during updates right now (but do when fetching the row to
update) - I think that's bad, but it's how it is.

When you say "automatically" - do you mean that it'd happen unconditionally,
independent of the horizon? It shouldn't...


> > 7) S1 checks t1->t3, finds that t3 is too new for the snapshot, doesn't create
> >    an index entry
> 
> Just to be clear, it would normally have created an index entry (for
> the whole HOT chain) because t2 is in the recheck snapshot and
> therefore the whole HOT chain wasn't in the initial snapshot. I'm a
> little confused here.

Hm? Why / where would we have done that? It's a HOT update, so the UPDATE
doesn't create an index entry. And the validate scan won't see the HOT chain
because t2 has been pruned away and t3 is too new.

What "recheck snapshot" are you referring to? The one passed to
validate_index()?
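
To make the sequence concrete, here is a toy Python model of steps 2-7. It is
only a sketch, not PostgreSQL code: the xids, the Version class and the helper
names are made up for illustration, every transaction is assumed to have
committed, and the horizon computation is reduced to a single comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    name: str
    xmin: int                    # xid that created this version
    xmax: Optional[int] = None   # xid that superseded it, None if still live


def visible(v: Version, snapshot_xid: int) -> bool:
    """The snapshot sees the effects of every xid below snapshot_xid."""
    created = v.xmin < snapshot_xid
    superseded = v.xmax is not None and v.xmax < snapshot_xid
    return created and not superseded


def prune(chain: List[Version], horizon: int) -> List[Version]:
    """Keep only versions that are live or were superseded at/after the horizon."""
    return [v for v in chain if v.xmax is None or v.xmax >= horizon]


def validate(chain: List[Version], snapshot_xid: int) -> bool:
    """True if the validation scan finds a visible version and adds an index entry."""
    return any(visible(v, snapshot_xid) for v in chain)


def run(ignore_s1_xmin: bool) -> bool:
    chain = [Version("t1", xmin=10)]      # step 2: S2 inserts t1 (not in the index)
    chain[-1].xmax = 11                   # step 3: S2 HOT updates t1 -> t2
    chain.append(Version("t2", xmin=11))
    s1_snapshot = 12                      # step 4: S1 builds its validation snapshot
    chain[-1].xmax = 12                   # step 5: S2 HOT updates t2 -> t3
    chain.append(Version("t3", xmin=12))
    next_xid = 13
    # step 6: with PROC_IN_SAFE_IC, S1's snapshot does not hold the horizon back
    horizon = next_xid if ignore_s1_xmin else min(s1_snapshot, next_xid)
    chain = prune(chain, horizon)
    return validate(chain, s1_snapshot)   # step 7


if __name__ == "__main__":
    print("index entry created, S1 xmin ignored:", run(True))
    print("index entry created, S1 xmin honored:", run(False))
```

In this model, ignoring S1's xmin lets t2 be pruned, the scan only finds t3
(too new for the snapshot), and no index entry is created. When S1's xmin
holds the horizon back, t2 survives and the entry is created.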


> > 8) corruption
> 
> Aside from amcheck I wonder if we can come up with any way for users
> to tell whether their index is affected or at risk. Like, is there a
> way to tell from catalog entries if an index was created with CIC?

Not reliably, afaik. indcheckxmin won't ever be set for a CIC index IIRC, but
it's not reliably set for a non-CIC index.

Greetings,

Andres Freund


