Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Дата
Msg-id CAH2-Wzm8gQYhkFPMYgoQcfA=vz7xjd85ZSfXdcHqmx9wFddjSg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-bugs
On Sun, Jul 25, 2021 at 12:08 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I'll work on a patch to add more sanity checks to the GIN code when it
> traverses the tree, to catch the case that it accidentally steps on a
> wrong kind of a page (I'm pretty busy next week, so might not get to
> that until the week after though). I don't think that will help here,
> but who knows, and at least we can rule out some kinds of bugs.

I just noticed that among call sites to ginInsertCleanup() located in
ginvacuum.c, only one caller specifies fill_fsm=true: The call made
when an autoanalyze (i.e. av worker that just does an ANALYZE) is run.
In other words, a call through the special analyze_only path [1].
Note, in particular, that a plain VACUUM, or a plain VACUUM ANALYZE
will always specify fill_fsm=false, regardless of any of the details.
This seems totally contradictory to me.

(***Thinks some more***)

Actually, that's not quite right -- it's not contradictory once you
realize that fill_fsm=true is an extra special path, originally
designed just for the analyze_only path. That's easy to see once you
look at commit dc943ad952 from 2015. Its specific goal was to allow
this special analyze_only path to recycle pages. We need fill_fsm=true
because the analyze_only path won't scan the whole index again at the
end of ginvacuumcleanup(). (Actually, why even scan it in the similar
cleanup-only VACUUM case? Ugh, that's not worth getting into now.)

In summary: Although I always suspected the fill_fsm=true ShiftList()
path here, it's even easier to suspect it now. Because I see now that
it hardly ever gets used inside autovacuum worker processes, though of
course it is possible. It's probably a matter of luck as to whether
you hit the analyze_only + fill_fsm=true ShiftList() path. That might
well have contributed to our difficulty in recreating the bugs here.

This is such a mess. I'm not sure that this insight will be news to
you, Heikki. Perhaps I'm just venting about the confused state of the
code here, following a recent unpleasant reminder (again, see [1]).

[1] https://postgr.es/m/CAH2-WzkjrK556enVtFLmyXEdw91xGuwiyZVep2kp5yQT_-3JDg@mail.gmail.com

--
Peter Geoghegan



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Inconsistent behavior of pg_dump/pg_restore on DEFAULT PRIVILEGES
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Inconsistent behavior of pg_dump/pg_restore on DEFAULT PRIVILEGES