Re: do only critical work during single-user vacuum?

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: do only critical work during single-user vacuum?
Дата
Msg-id CAD21AoCj-uRMkWFaZkDpKaDWcJ4BCuoYVWPRCuGDPcNNwYVy_Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: do only critical work during single-user vacuum?  ("Bossart, Nathan" <bossartn@amazon.com>)
Ответы Re: do only critical work during single-user vacuum?  (John Naylor <john.naylor@enterprisedb.com>)
Re: do only critical work during single-user vacuum?  ("Bossart, Nathan" <bossartn@amazon.com>)
Список pgsql-hackers
On Fri, Jan 14, 2022 at 7:04 AM Bossart, Nathan <bossartn@amazon.com> wrote:
>
> On 1/13/22, 4:58 AM, "John Naylor" <john.naylor@enterprisedb.com> wrote:
> > On Wed, Jan 12, 2022 at 12:26 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> >> As I've stated upthread, Sawada-san's suggested approach was my
> >> initial reaction to this thread.  I'm not wedded to the idea of adding
> >> new options, but I think there are a couple of advantages.  For both
> >> single-user mode and normal operation (which may be in imminent
> >> wraparound danger), you could use the same command:
> >>
> >>         VACUUM (MIN_XID_AGE 1600000000, ...);
> >
> > My proposed top-level statement can also be used in normal operation,
> > so the only possible advantage is configurability. But I don't really
> > see any advantage in that -- I don't think we should be moving in the
> > direction of adding more-intricate ways to paper over the deficiencies
> > in autovacuum scheduling. (It could be argued that I'm doing exactly
> > that in this whole thread, but [imminent] shutdown situations have
> > other causes besides deficient scheduling.)
>
> The new top-level command would be configurable, right?  Your patch
> uses autovacuum_freeze_max_age/autovacuum_multixact_freeze_max_age, so
> the behavior of this new command now depends on the values of
> parameters that won't obviously be related to it.  If these parameters
> are set very low (e.g., the default values), then this command will
> end up doing far more work than is probably necessary.
>
> If we did go the route of using a parameter to determine which tables
> to vacuum, I think vacuum_failsafe_age is a much better candidate, as
> it defaults to a much higher value that is more likely to prevent
> doing extra work.  That being said, I don't know if overloading
> parameters is the right way to go.
>
> >> (As an aside, we'd need to figure out how XID and MXID options would
> >> work together.  Presumably most users would want to OR them.)
> >>
> >> This doesn't really tie in super nicely with the failsafe mechanism,
> >> but adding something like a FAILSAFE option doesn't seem right to me,
> >
> > I agree -- it would be awkward and messy as an option. However, I see
> > the same problem with xid/mxid -- I would actually argue they are not
> > even proper options; they are "selectors". Your comments above about
> > 1) needing to OR them and 2) emitting a message when a VACUUM command
> > doesn't actually do anything are evidence of that fact.
>
> That's a fair point.  But I don't think these problems are totally
> intractable.  We already emit "skipping" messages from VACUUM
> sometimes, and interactions between VACUUM options exist today, too.
> For example, FREEZE is redundant when FULL is specified, and
> INDEX_CLEANUP is totally ignored when FULL is used.
>
> >> The other advantage I see with age-related options is that it can be
> >> useful for non-imminent-wraparound situations as well.  For example,
> >> maybe a user just wants to manually vacuum everything (including
> >> indexes) with an age above 500M on the weekends.

I also think there is a use case where a user just wants to manually
vacuum tables that are older than a certain threshold. In this case,
they might want to specify VACUUM command options such as the parallel
option while selecting tables.

> >
> > There is already vaccumdb for that, and I think it's method of
> > selecting tables is sound -- I'm not convinced that pushing table
> > selection to the server command as "options" is an improvement.

I think that having the user not rely on vacuumdb by implementing it
on the server side would be an improvement.

> I guess I'm ultimately imagining the new options as replacing the
> vacuumdb implementation.  IOW vacuumdb would just use MIN_(M)XID_AGE
> behind the scenes (as would a new top-level command).

I had the same idea.

That having been said, I agree that xid/mxid options are different
things from the existing VACUUM command options; whereas the existing
VACUUM options control its behavior, xid/mxid options are selectors
for tables to vacuum (PROCESS_TOAST option could be a selector but I
think it’s slightly different from xid/mxid options).

IIUC what we want to do here are two things: (1) select only old
tables and (2) set INDEX_CLEANUP = off, TRUNCATE = off, and FREEZE =
on. VACUUM LIMIT statement does both things at the same time. Although
I’m concerned a bit about its flexibility, it’s a reasonable solution.

On the other hand, it’s probably also useful to do either one thing in
some cases. For instance, having a selector for (1) would be useful,
and having a new option like FAST_FREEZE for (2) would also be useful.
Given there is already a way for (2) (it does not default though), I
think it might also be a good start inventing something for (1). For
instance, a selector for VACUUM statement I came up with is:

VACUUM (verbose on) TABLES WITH (min_xid_age = 1600000000);
or
VACUUM (verbose on) TABLES WITH (min_age = failsafe_limit);

We can expand it in the future to select tables by, for example, dead
tuple ratio, size, etc.

It's a random thought but maybe worth considering.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: Schema variables - new implementation for Postgres 15
Следующее
От: vignesh C
Дата:
Сообщение: Re: Skipping logical replication transactions on subscriber side