Re: decoupling table and index vacuum

From: Robert Haas
Subject: Re: decoupling table and index vacuum
Msg-id: CA+TgmoYNbEsoW36WJONxG4xRLQmDdPQaJMG9WBDWgGSyHkg32w@mail.gmail.com
In response to: Re: decoupling table and index vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: decoupling table and index vacuum  (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Tue, Feb 8, 2022 at 12:50 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > It's not clear to me that we have enough information to make good
> > decisions about which indexes to vacuum and which indexes to skip.
>
> What if "extra vacuuming, not skipping vacuuming" was not just an
> abstract goal, but an actual first-class part of the implementation,
> and the index AM API? Then the question we're asking the index/index
> AM is no longer "Do you [an index] *not* require index vacuuming, even
> though you are entitled to it according to the conventional rules of
> autovacuum scheduling?". The question is instead more like "Could you
> use an extra, early VACUUM?".
>
> If we invert the question like this then we have something that makes
> more sense at the index AM level, but requires significant
> improvements at the level of autovacuum scheduling. On the other hand
> I think that you already need to do at least some work in that area.

Right, that's why I asked the question. If we're going to ask the
index AM whether it would like to be vacuumed right now, we're going
to have to put some logic into the index AM that knows how to answer
that question. But if we don't have any useful statistics that would
let us answer the question correctly, then we have problems.

While I basically agree with everything that you just wrote, I'm
somewhat inclined to think that the question is not best phrased as
either extra-vacuum or skip-a-vacuum. Either of those supposes a
normative amount of vacuuming from which we could deviate in one
direction or the other. I think it would be better to phrase it in a
way that doesn't make such a supposition. Maybe something like: "Hi,
we are vacuuming the heap right now and we are also going to vacuum
any indexes that would like it, and does that include you?"
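
As a rough illustration of that neutral phrasing, here is a minimal sketch of a heap vacuum asking each index in turn. Everything in it is hypothetical: SketchIndex, its should_vacuum_now callback, the dead_tups field, and the placeholder threshold are invented for this example and are not part of the existing index AM API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical index descriptor; not a PostgreSQL structure. */
typedef struct SketchIndex
{
    const char *name;
    long        dead_tups;      /* assumed statistic; may not exist today */

    /* Hypothetical callback: "we are vacuuming the heap; does that include you?" */
    bool      (*should_vacuum_now) (const struct SketchIndex *index);
} SketchIndex;

/* Placeholder policy for the example; the threshold is arbitrary. */
bool
sketch_answer(const SketchIndex *index)
{
    return index->dead_tups > 10000;
}

/*
 * Ask every index the same neutral question and record the answers in
 * selected[]. Returns the number of indexes that asked to be vacuumed.
 */
size_t
choose_indexes_to_vacuum(const SketchIndex *indexes, size_t nindexes,
                         bool *selected)
{
    size_t      nselected = 0;

    for (size_t i = 0; i < nindexes; i++)
    {
        selected[i] = indexes[i].should_vacuum_now(&indexes[i]);
        if (selected[i])
            nselected++;
    }
    return nselected;
}
```

The caller neither assumes a default of "vacuum everything" nor a default of "skip everything"; each index simply answers yes or no.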

The point is that it's a continuum. If we decide that we're asking the
index "do you want extra vacuuming?" then that phrasing suggests that
you should only say yes if you really need it. If we decide we're
asking the index "can we skip vacuuming you this time?" then the
phrasing suggests that you should not feel bad about insisting on a
vacuum right now, and only surrender your claim if you're sure you
don't need it. But in reality, no bias either way is warranted. It is
either better that this index should be vacuumed right now, or better
that it should not be vacuumed right now, and whichever is better
should be what we choose to do.

To expand on that just a bit, if I'm a btree index and someone asks me
"can we skip vacuuming you this time?" I might say "return dead_tups <
tiny_amount" and if they ask me "do you want extra vacuuming" I might
say "return dead_tups > quite_large_amount". But if they ask me
"should we vacuum you now?" then I might say "return dead_tups >
moderate_amount" which feels like the correct thing here.
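
To make the three answers concrete, here is a small sketch in C. The struct, the function names, and the threshold values are all made up for illustration; only the comparisons themselves come from the paragraph above, and it assumes a dead_tups estimate is available at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-index statistics; not an existing PostgreSQL struct. */
typedef struct IndexVacuumStats
{
    uint64_t    dead_tups;      /* assumed estimate of dead index tuples */
} IndexVacuumStats;

/* Illustrative thresholds only; real values would need actual tuning. */
#define TINY_AMOUNT         100
#define MODERATE_AMOUNT     10000
#define QUITE_LARGE_AMOUNT  1000000

/* "Can we skip vacuuming you this time?" (phrasing biased toward vacuuming) */
bool
can_skip_vacuum(const IndexVacuumStats *stats)
{
    return stats->dead_tups < TINY_AMOUNT;
}

/* "Do you want extra vacuuming?" (phrasing biased toward not vacuuming) */
bool
wants_extra_vacuum(const IndexVacuumStats *stats)
{
    return stats->dead_tups > QUITE_LARGE_AMOUNT;
}

/* "Should we vacuum you now?" (no bias in either direction) */
bool
should_vacuum_now(const IndexVacuumStats *stats)
{
    return stats->dead_tups > MODERATE_AMOUNT;
}
```

With these made-up thresholds, an index holding 50,000 dead tuples gets vacuumed under the "skip" phrasing, is left alone under the "extra" phrasing, and gets vacuumed under the neutral phrasing. The same statistics produce different outcomes purely because of how the question was worded.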

-- 
Robert Haas
EDB: http://www.enterprisedb.com


