Re: Corrupt index stopping autovacuum system wide

From Peter Geoghegan
Subject Re: Corrupt index stopping autovacuum system wide
Date
Msg-id CAH2-Wzm9boEusD8pyz8S2eXey01VC6PZmqAegOgUO+yRCQgTiA@mail.gmail.com
In response to Re: Corrupt index stopping autovacuum system wide  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Corrupt index stopping autovacuum system wide  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-general
On Wed, Jul 17, 2019 at 10:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Right, you're eventually going to get to a forced shutdown if vacuum never
> succeeds on one table; no question that that's bad.

It occurs to me that we use operator class/insertion scankey
comparisons within page deletion, to relocate a leaf page that looks
like a candidate for deletion. Despite this, README.hot claims:

"Standard vacuuming scans the indexes to ensure all such index entries
are removed, amortizing the index scan cost across as many dead tuples
as possible; this approach does not scale down well to the case of
reclaiming just a few tuples.  In principle one could recompute the
index keys and do standard index searches to find the index entries,
but this is risky in the presence of possibly-buggy user-defined
functions in functional indexes.  An allegedly immutable function that
in fact is not immutable might prevent us from re-finding an index
entry"

That probably wasn't the problem in Aaron's case, but it is worth
considering as a possibility.
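
The failure mode that README.hot describes can be illustrated with a toy model. This is only a sketch and not PostgreSQL code: ToyIndex, allegedly_immutable and the settings dictionary are hypothetical names, and a sorted Python list stands in for a B-tree built on an expression. The point it shows is that re-finding an entry by recomputing its key only works if the key function returns the same result it returned at insertion time.

```python
import bisect


class ToyIndex:
    """Sorted list of (key, tid) pairs standing in for an expression index."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.entries = []

    def insert(self, value, tid):
        # The key is computed once, at insertion time, and stored.
        bisect.insort(self.entries, (self.key_func(value), tid))

    def refind(self, value, tid):
        """Recompute the key and binary-search for the exact (key, tid) entry."""
        probe = (self.key_func(value), tid)
        pos = bisect.bisect_left(self.entries, probe)
        return pos < len(self.entries) and self.entries[pos] == probe


# Hidden state that the "immutable" function secretly depends on.
settings = {"fold_case": True}


def allegedly_immutable(value):
    # Declared immutable by its author, but the result changes with settings.
    return value.lower() if settings["fold_case"] else value


index = ToyIndex(allegedly_immutable)
index.insert("Alice", tid=1)

found_before = index.refind("Alice", tid=1)  # key recomputes to the stored key

settings["fold_case"] = False                # function behavior silently changes
found_after = index.refind("Alice", tid=1)   # recomputed key no longer matches

print(found_before, found_after)
```

In this model the entry is still physically present in the index after the setting changes; it simply cannot be located by a key-based search any more, which is the risk the README passage is pointing at.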

> My concern here is
> that if we have blinders on to the extent of only processing that one
> table or DB, we're unnecessarily allowing bloat to occur in other tables,
> and causing that missed vacuuming work to pile up so that there's more of
> it to be done once the breakage is cleared.  If the DBA doesn't notice the
> problem until getting into a forced shutdown, that is going to extend his
> outage time --- and, in a really bad worst case, maybe make the difference
> between being able to recover at all and not.

The comment about "...any db at risk of Xid wraparound..." within
do_start_worker() hints at such a problem.

Maybe nbtree VACUUM should do something more aggressive than give up
when there is a "failed to re-find parent key" or similar condition.
Perhaps it would make more sense to make the index inactive (for some
value of "inactive") instead of just complaining. That might be the
least worst option, all things considered.
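
The difference between the two policies can be sketched with a toy model. This is purely illustrative and not how nbtree or autovacuum is implemented: vacuum_table, vacuum_index, IndexCorruption and the "active" flag are all hypothetical, and "inactive" is modeled as nothing more than a boolean that causes the index to be skipped.

```python
class IndexCorruption(Exception):
    """Stands in for an error such as 'failed to re-find parent key'."""


def vacuum_index(index):
    # Hypothetical per-index vacuum step; a corrupt index raises an error.
    if index["corrupt"]:
        raise IndexCorruption(f'failed to re-find parent key in "{index["name"]}"')


def vacuum_table(table, deactivate_on_error):
    """Return True if vacuuming of the table ran to completion."""
    for index in table["indexes"]:
        if not index["active"]:
            continue  # inactive indexes are skipped entirely
        try:
            vacuum_index(index)
        except IndexCorruption:
            if not deactivate_on_error:
                return False  # give up: the table is never fully vacuumed
            index["active"] = False  # set the index aside and keep going
    return True


def make_table():
    return {
        "name": "t",
        "indexes": [
            {"name": "t_pkey", "corrupt": False, "active": True},
            {"name": "t_expr_idx", "corrupt": True, "active": True},
        ],
    }


t1 = make_table()
gave_up = vacuum_table(t1, deactivate_on_error=False)

t2 = make_table()
completed = vacuum_table(t2, deactivate_on_error=True)

print(gave_up, completed)
```

Under the first policy every attempt on the table fails at the same index, so vacuum never succeeds on it; under the second, the corrupt index is set aside and the rest of the work completes. What deactivating an index would actually mean for queries and for later repair is left open here, as it is in the message above.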

--
Peter Geoghegan


