Re: autovacuum prioritization

From: Peter Geoghegan
Subject: Re: autovacuum prioritization
Date:
Msg-id: CAH2-Wz=L_1-Tfu=_RyChXGsp3qHoJwcomOAB64uaZJFidnV78g@mail.gmail.com
In response to: Re: autovacuum prioritization (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Wed, Jan 26, 2022 at 10:55 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 25, 2022 at 3:32 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > For example, a
> > page that has 5 dead heap-only tuples is vastly different to a similar
> > page that has 5 LP_DEAD items instead -- and yet our current approach
> > makes no distinction. Chances are very high that if the only dead
> > tuples are heap-only tuples, then things are going just fine on that
> > page -- opportunistic pruning is actually keeping up.
>
> Hmm, I think that's a really good insight. Perhaps we ought to forget
> about counting updates and deletes and instead count dead line
> pointers. Or maybe we can estimate the number of dead line pointers by
> knowing how many updates and deletes there were, as long as we can
> distinguish hot updates from non-HOT updates, which I think we can.

All that we have to go on is a bunch of observations in any case,
though -- the map is not the territory. And so it seems to me that the
sensible thing to do is just embrace that we won't ever really exactly
know what's going on in a given database, at any given time.
Fortunately, we don't really have to know. We should be able to get
away with only having roughly the right idea, by focussing on the few
things that we are sure of -- things like the difference between
LP_DEAD items and dead heap-only tuples, which are invariant to
workload characteristics.
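
To make that concrete, here is a rough sketch (plain Python, nothing
to do with the actual autovacuum code -- every name, weight, and
threshold is invented for illustration) of what it might look like to
treat the two kinds of observations differently:

def needs_vacuum(dead_heap_only_tuples: int,
                 lp_dead_items: int,
                 live_tuples: int) -> bool:
    # Dead heap-only tuples can usually be cleaned up by opportunistic
    # pruning, without any help from VACUUM, so weight them lightly.
    # LP_DEAD items can only be reclaimed by VACUUM (removing them
    # requires index vacuuming first), so weight them heavily.
    HEAP_ONLY_WEIGHT = 0.1
    LP_DEAD_WEIGHT = 1.0
    THRESHOLD = 0.05  # fraction of live tuples, arbitrary

    score = (HEAP_ONLY_WEIGHT * dead_heap_only_tuples
             + LP_DEAD_WEIGHT * lp_dead_items)
    return score > THRESHOLD * max(live_tuples, 1)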

I recently said (on the ANALYZE-related thread) that we should be
thinking probabilistically here [1]. Our understanding of the amount
of bloat could very usefully be framed that way. Maybe the model we
use is a probability density function (maybe not formally, not sure).
A PDF has an exact expectation, which for us might be the most
probable number of dead tuples in total in a given table right now
(let's just assume it's still dead tuples, ignoring the problems with
that metric for now).
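
As a sketch of what I mean (plain Python, purely illustrative -- how
the distribution gets built from past ANALYZE samples is the hard
part, and is waved away here):

from dataclasses import dataclass

@dataclass
class DeadTupleEstimate:
    # Plausible dead-tuple counts, each with a probability (probs are
    # assumed to sum to 1).
    values: list[float]
    probs: list[float]

    def expectation(self) -> float:
        # A single-number summary of the distribution.
        return sum(v * p for v, p in zip(self.values, self.probs))

    def stddev(self) -> float:
        # How spread out the distribution is -- which tells us how
        # much the precise trigger point can really matter.
        mu = self.expectation()
        var = sum(p * (v - mu) ** 2
                  for v, p in zip(self.values, self.probs))
        return var ** 0.5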

This is a useful basis for making better decisions by weighing
competing considerations -- which might themselves be modeled as PDFs.
Example: For a given table that is approaching the point where the
model says "time to VACUUM", we may very well spend hours, days, or
even weeks approaching the crossover point. The exact expectation
isn't truly special here -- there is actually zero practical reason to
have special reverence for that precise point (with a good model,
within certain reasonable bounds). If our model says that there is
only a noise-level difference between doing a VACUUM on a given table
today, tomorrow, or next week, why not take advantage? For example,
why not do the VACUUM when the system appears not to be busy at all
(typically in the dead of night), just because it'll definitely be
both cheaper in absolute terms (FPIs can be avoided by spreading
things out over multiple checkpoints) and less disruptive?
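
Something like this decision rule is what I have in mind (again an
illustrative Python sketch -- the cost model itself is assumed, which
is of course the hard part):

def choose_vacuum_time(expected_cost_now: float,
                       expected_cost_quiet_window: float,
                       cost_stddev: float) -> str:
    # "Noise-level difference" is taken to mean: the gap between the
    # two options is small relative to the model's own uncertainty.
    if abs(expected_cost_quiet_window - expected_cost_now) < cost_stddev:
        return "defer to quiet window"
    # Otherwise the expected difference is real, so follow it.
    if expected_cost_now <= expected_cost_quiet_window:
        return "vacuum now"
    return "defer to quiet window"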

There are many opportunities like that, I believe. It's hard for me to
suppress the urge to blurt out 17 more ideas like that. What are the
chances that you won't have at least a few real winners among all of
the ideas that everybody will come up with, in the end?

> > if successive ANALYZE operations notice
> > a consistent pattern where pages that had a non-zero number of LP_DEAD
> > items last time now have a significantly higher number, then it's a
> > good idea to err in the direction of more aggressive vacuuming.
> > *Growing* concentrations of LP_DEAD items signal chaos. I think that
> > placing a particular emphasis on pages with non-zero LP_DEAD items as
> > a qualitatively distinct category of page might well make sense --
> > relatively few blocks with a growing number of LP_DEAD items seems
> > like it should be enough to make autovacuum run aggressively.
>
> I think measuring the change over time here might be fraught with
> peril.

I'd say that that depends on how you define the problem we're trying
to solve. If you define the problem as coming up with a significantly
improved statistical model that determines (say) how many dead tuples
there are in the table right now, given a set of observations made by
ANALYZE in the past, then yes, it's fraught with peril. But why would
you define it that way? It seems far easier to improve things by
putting model error and *actual* exposure to real known issues (e.g.
line pointer bloat) front and center.

It doesn't necessarily matter if we're *usually* wrong with a good
model; a bad model, on the other hand, would need to get the correct
answer consistently to be of any use. And so the model that is the
most accurate
quantitatively is probably *not* the best available model, all things
considered. Most of the time we shouldn't VACUUM right this second,
and so a model that consists of "return false" is very frequently
correct. But that doesn't mean it's a good model. You get the idea.
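
Put another way: judge the model by the cost of its mistakes, not by
how often it's right. A toy illustration (Python, with invented
costs):

def cost_weighted_error(decisions: list[bool],
                        truths: list[bool],
                        cost_of_missed_vacuum: float = 100.0,
                        cost_of_premature_vacuum: float = 1.0) -> float:
    # decisions[i] is "the model said VACUUM now"; truths[i] is "a
    # VACUUM really was needed". The two kinds of mistake are not
    # remotely equal in cost.
    total = 0.0
    for decided, needed in zip(decisions, truths):
        if needed and not decided:
            total += cost_of_missed_vacuum
        elif decided and not needed:
            total += cost_of_premature_vacuum
    return total

The "return false" model does great on raw accuracy, but its
cost-weighted error ends up dominated by the VACUUMs it missed.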

> If vacuum makes a single pass over the indexes, it can retire
> as many dead line pointers as we have, or as will fit in memory, and
> the effort doesn't really depend too much on exactly how many dead
> line pointers we're trying to find.

Line pointer bloat is something that displays hysteresis; once it
happens (past some significant threshold), there is no reversing the
damage. This makes the behavior very non-linear. In other words, it
makes it incredibly hard to model mathematically [2] -- once you
cross a certain hard-to-define threshold, it's total chaos, even in a
closed well-specified system (i.e. a highly constrained workload),
because you have all these feedback loops.

On top of all that, even with a perfect model we're still forced to
make a go/no-go decision for the entire table, moment to moment. So
even a mythical perfect model runs into the problem that it is
simultaneously much too early and much too late at the level of the
table. Which is even more reason to just focus on not going totally
off the rails, in any particular direction. Note that this includes
going off the rails by vacuuming in a way that's unsustainably
aggressive -- sometimes you have to cut your losses at that level as
well.

There is usually some bigger picture to consider when things do go
wrong -- there is usually some much worse fate that must be avoided.
Like with VACUUM's failsafe. Sure, controlling index bloat is
extremely important. But it's also much less important than keeping
the system online and responsive. That's another level up. (The level
up *after that* is "at least we didn't lose data", or maybe something
about limiting the amount of downtime, not going out of business,
whatever.)

> I feel like my threshold for the number of dead TIDs that ought to
> trigger a vacuum grows as the table gets bigger, capped by how much
> memory I've got.

I thought of another PDF-related idea when I read this, without even
trying: we could account for the discontinuity from multiple index
scans in a single VACUUM operation (instead of just one) by erring in
the direction of doing the VACUUM sooner rather than later, when the
model says that doing so will make very little difference in terms of
extra costs incurred (extra costs from vacuuming sooner rather than
later, conservatively assuming that our concern about TIDs not fitting
in memory is basically unfounded).
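
A rough sketch of that bias (Python; the 6 bytes per dead TID matches
the ItemPointerData entries that VACUUM accumulates in memory today,
while the function, its parameters, and the safety factor are all
made up):

def should_vacuum_early(projected_dead_tids: float,
                        dead_tid_memory_bytes: int,
                        safety_factor: float = 0.8) -> bool:
    # Roughly how many dead TIDs fit before a second round of index
    # scans becomes necessary (each TID is a 6-byte ItemPointerData).
    tids_that_fit = dead_tid_memory_bytes // 6
    # Err on the side of vacuuming sooner once we get anywhere near
    # the point where that discontinuous extra cost could kick in.
    return projected_dead_tids > safety_factor * tids_that_fit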

> But I don't feel like the rate at which it's changing
> necessarily matters. Like if I create a million dead line pointers
> really quickly, wait a month, and then create another million dead
> line pointers, I feel like I want the system to respond just as
> aggressively as if the month-long delay were omitted.
>
> Maybe my feelings are wrong here. I'm just saying that, to me, it
> doesn't feel like the rate of change is all that relevant.

It's not that they're wrong, exactly -- I wouldn't say that. It's more
like this: you as a Postgres user actually care about a great many
things, not just one thing. Some of these things might be somewhat in
tension, from time to time. And so it seems wise to find a way to live
with any tension that may crop up -- by acknowledging the tension, we
get the chance to honor the preferences of the user to the greatest
extent possible.

[1] https://postgr.es/m/CAH2-WzmvXXEKtEph7U360umZ5pN3d18RBfu=nyPg9neBLDUWdw@mail.gmail.com
[2] https://en.wikipedia.org/wiki/Hysteretic_model
--
Peter Geoghegan


