Обсуждение: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options toVACUUM

Поиск
Список
Период
Сортировка

Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options toVACUUM

От
"Bossart, Nathan"
Дата:
Hi hackers,

I've attached a patch for a couple of new options for VACUUM:
MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
behind these options is to allow table owners to easily vacuum only
the TOAST table or only the main relation.  This is especially useful
for TOAST tables since roles do not have access to the pg_toast schema
by default and some users may find it difficult to discover the name
of a relation's TOAST table.  Next, I will explain a couple of the
main design decisions.

I chose to call the option SECONDARY_RELATION_CLEANUP instead of
something like TOAST_TABLE_CLEANUP for two reasons.  First, other
types of secondary relations may be added in the future, and it may be
convenient to put them under the umbrella of this option.  Second, it
seemed like it could be outside of the project's style to use the name
of internal storage mechanisms in a user-facing VACUUM option.
However, I am not wedded to the chosen name, as I am sure there are
good arguments for something like TOAST_TABLE_CLEANUP.

I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead
of expand_vacuum_rel()/get_all_vacuum_rels().  This allows us to reuse
most of the existing code with minimal changes, and it avoids adding
complexity to the lookups and ownership checks in expand_vacuum_rel()
and get_all_vacuum_rels() (especially the partition lookup logic).
The main tradeoffs of this approach are that we will still create a
transaction for the main relation and that we will still lock the main
relation.

I reused the existing VACOPT_SKIPTOAST option to implement
SECONDARY_RELATION_CLEANUP.  This option is currently only used for
autovacuum.

I chose to disallow disabling both *_RELATION_CLEANUP options
together, as this would essentially cause the VACUUM command to take
no action.  I disallowed using FULL when SECONDARY_RELATION_CLEANUP is
disabled, as the TOAST table is automatically rebuilt by
cluster_rel().  I do allow using FULL when MAIN_RELATION_CLEANUP is
disabled, which is taken to mean that cluster_rel() should be run on
the TOAST table.  Finally, I disallowed using ANALYZE when
MAIN_RELATION_CLEANUP is disabled, as it is not presently possible to
analyze TOAST tables.

I will add this patch to the next commitfest.  I look forward to your
feedback.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
Vik Fearing
Дата:
On 21/01/2020 22:21, Bossart, Nathan wrote:
> I've attached a patch for a couple of new options for VACUUM:
> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
> behind these options is to allow table owners to easily vacuum only
> the TOAST table or only the main relation.  This is especially useful
> for TOAST tables since roles do not have access to the pg_toast schema
> by default and some users may find it difficult to discover the name
> of a relation's TOAST table.


Could you explain why one would want to do this?  Autovacuum will
already deal with the tables separately as needed, but I don't see when
a manual vacuum would want to make this distinction.

-- 

Vik Fearing




Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
Michael Paquier
Дата:
On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote:
> I've attached a patch for a couple of new options for VACUUM:
> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
> behind these options is to allow table owners to easily vacuum only
> the TOAST table or only the main relation.  This is especially useful
> for TOAST tables since roles do not have access to the pg_toast schema
> by default and some users may find it difficult to discover the name
> of a relation's TOAST table.  Next, I will explain a couple of the
> main design decisions.

So that's similar to the autovacuum reloptions, but to be able to
enforce one policy or another manually.  Any issues with autovacuum
not able to keep up the bloat pace and where you need to issue manual
VACUUMs in periods of low activity, like nightly VACUUMs?

> I chose to call the option SECONDARY_RELATION_CLEANUP instead of
> something like TOAST_TABLE_CLEANUP for two reasons.  First, other
> types of secondary relations may be added in the future, and it may be
> convenient to put them under the umbrella of this option.  Second, it
> seemed like it could be outside of the project's style to use the name
> of internal storage mechanisms in a user-facing VACUUM option.
> However, I am not wedded to the chosen name, as I am sure there are
> good arguments for something like TOAST_TABLE_CLEANUP.

If other types of relations are added in the future, wouldn't it make
sense to have one switch for each one of those types then?  A relation
could have a toast relation associated to it, as much as a foo
relation or a hoge relation, in which case SECONDARY brings little
control.

> I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead
> of expand_vacuum_rel()/get_all_vacuum_rels().  This allows us to reuse
> most of the existing code with minimal changes, and it avoids adding
> complexity to the lookups and ownership checks in expand_vacuum_rel()
> and get_all_vacuum_rels() (especially the partition lookup logic).
> The main tradeoffs of this approach are that we will still create a
> transaction for the main relation and that we will still lock the main
> relation.

Yeah, likely we should not make things more confusing in this area.
This was tricky enough to deal with with the recent VACUUM
refactoring for multiple relations.

> I reused the existing VACOPT_SKIPTOAST option to implement
> SECONDARY_RELATION_CLEANUP.  This option is currently only used for
> autovacuum.

My take would be to rename this option, and reuse it for consistency.

> I chose to disallow disabling both *_RELATION_CLEANUP options
> together, as this would essentially cause the VACUUM command to take
> no action.

My first reaction is why?  Agreed that it is a bit crazy to combine
both options, but if you add the argument related to more relation
types like toast..
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
"Bossart, Nathan"
Дата:
On 1/21/20, 1:39 PM, "Vik Fearing" <vik.fearing@2ndquadrant.com> wrote:
> On 21/01/2020 22:21, Bossart, Nathan wrote:
>> I've attached a patch for a couple of new options for VACUUM:
>> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
>> behind these options is to allow table owners to easily vacuum only
>> the TOAST table or only the main relation.  This is especially useful
>> for TOAST tables since roles do not have access to the pg_toast schema
>> by default and some users may find it difficult to discover the name
>> of a relation's TOAST table.
>
>
> Could you explain why one would want to do this?  Autovacuum will
> already deal with the tables separately as needed, but I don't see when
> a manual vacuum would want to make this distinction.

The main use case I'm targeting is when the level of bloat or
transaction ages of a relation and its TOAST table have significantly
diverged.  In these scenarios, it could be beneficial to be able to
vacuum just one or the other, especially if the tables are large.

Nathan


Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
"Bossart, Nathan"
Дата:
Hi Michael,

Thanks for taking a look.

On 1/21/20, 9:02 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote:
>> I've attached a patch for a couple of new options for VACUUM:
>> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
>> behind these options is to allow table owners to easily vacuum only
>> the TOAST table or only the main relation.  This is especially useful
>> for TOAST tables since roles do not have access to the pg_toast schema
>> by default and some users may find it difficult to discover the name
>> of a relation's TOAST table.  Next, I will explain a couple of the
>> main design decisions.
>
> So that's similar to the autovacuum reloptions, but to be able to
> enforce one policy or another manually.  Any issues with autovacuum
> not able to keep up the bloat pace and where you need to issue manual
> VACUUMs in periods of low activity, like nightly VACUUMs?

There have been a couple of occasions where I have seen the TOAST
table become the most bloated part of the relation.  When this
happens, it would be handy to be able to avoid scanning the heap and
indexes.  I am not aware of any concrete problems with autovacuum
other than needing to tune the parameters for certain workloads.

>> I chose to call the option SECONDARY_RELATION_CLEANUP instead of
>> something like TOAST_TABLE_CLEANUP for two reasons.  First, other
>> types of secondary relations may be added in the future, and it may be
>> convenient to put them under the umbrella of this option.  Second, it
>> seemed like it could be outside of the project's style to use the name
>> of internal storage mechanisms in a user-facing VACUUM option.
>> However, I am not wedded to the chosen name, as I am sure there are
>> good arguments for something like TOAST_TABLE_CLEANUP.
>
> If other types of relations are added in the future, wouldn't it make
> sense to have one switch for each one of those types then?  A relation
> could have a toast relation associated to it, as much as a foo
> relation or a hoge relation, in which case SECONDARY brings little
> control.

This is a good point.  I've renamed the option to TOAST_TABLE_CLEANUP
in v2.

>> I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead
>> of expand_vacuum_rel()/get_all_vacuum_rels().  This allows us to reuse
>> most of the existing code with minimal changes, and it avoids adding
>> complexity to the lookups and ownership checks in expand_vacuum_rel()
>> and get_all_vacuum_rels() (especially the partition lookup logic).
>> The main tradeoffs of this approach are that we will still create a
>> transaction for the main relation and that we will still lock the main
>> relation.
>
> Yeah, likely we should not make things more confusing in this area.
> This was tricky enough to deal with with the recent VACUUM
> refactoring for multiple relations.

Finding a way to avoid the lock on the main relation could be a future
improvement, as that would allow you to manually vacuum both the main
relation and its TOAST table in parallel.

>> I reused the existing VACOPT_SKIPTOAST option to implement
>> SECONDARY_RELATION_CLEANUP.  This option is currently only used for
>> autovacuum.
>
> My take would be to rename this option, and reuse it for consistency.

Done.

>> I chose to disallow disabling both *_RELATION_CLEANUP options
>> together, as this would essentially cause the VACUUM command to take
>> no action.
>
> My first reaction is why?  Agreed that it is a bit crazy to combine
> both options, but if you add the argument related to more relation
> types like toast..

Yes, I suppose we have the same problem if you disable
MAIN_RELATION_CLEANUP and the relation has no TOAST table.  In any
case, allowing both options to be disabled shouldn't hurt anything.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
Michael Paquier
Дата:
On Fri, Jan 24, 2020 at 09:31:26PM +0000, Bossart, Nathan wrote:
> On 1/21/20, 9:02 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
>> On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote:
>>> I've attached a patch for a couple of new options for VACUUM:
>>> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
>>> behind these options is to allow table owners to easily vacuum only
>>> the TOAST table or only the main relation.  This is especially useful
>>> for TOAST tables since roles do not have access to the pg_toast schema
>>> by default and some users may find it difficult to discover the name
>>> of a relation's TOAST table.  Next, I will explain a couple of the
>>> main design decisions.
>>
>> So that's similar to the autovacuum reloptions, but to be able to
>> enforce one policy or another manually.  Any issues with autovacuum
>> not able to keep up the bloat pace and where you need to issue manual
>> VACUUMs in periods of low activity, like nightly VACUUMs?
>
> There have been a couple of occasions where I have seen the TOAST
> table become the most bloated part of the relation.  When this
> happens, it would be handy to be able to avoid scanning the heap and
> indexes.  I am not aware of any concrete problems with autovacuum
> other than needing to tune the parameters for certain workloads.

That's something I have faced as well.  I have some applications
around here where toast tables were the most bloated, and the
vacuuming of the main relation ate time, putting more pressure on the
vacuuming of the toast relation.  So that's a fair argument in my
opinion.

>>> I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead
>>> of expand_vacuum_rel()/get_all_vacuum_rels().  This allows us to reuse
>>> most of the existing code with minimal changes, and it avoids adding
>>> complexity to the lookups and ownership checks in expand_vacuum_rel()
>>> and get_all_vacuum_rels() (especially the partition lookup logic).
>>> The main tradeoffs of this approach are that we will still create a
>>> transaction for the main relation and that we will still lock the main
>>> relation.
>>
>> Yeah, likely we should not make things more confusing in this area.
>> This was tricky enough to deal with with the recent VACUUM
>> refactoring for multiple relations.
>
> Finding a way to avoid the lock on the main relation could be a future
> improvement, as that would allow you to manually vacuum both the main
> relation and its TOAST table in parallel.

I am not sure that we actually need that at all, any catalog changes
take a lock on the parent relation first, and that's the conflicts we
are looking at here with a share update exclusive lock.
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
"Bossart, Nathan"
Дата:
On 1/24/20, 2:14 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
>> Yes, I suppose we have the same problem if you disable
>> MAIN_RELATION_CLEANUP and the relation has no TOAST table.  In any
>> case, allowing both options to be disabled shouldn't hurt anything.
>
> I've been thinking further in this area, and I'm wondering if it also
> makes sense to remove the restriction on ANALYZE with
> MAIN_RELATION_CLEANUP disabled.  A command like
>
>         VACUUM (ANALYZE, MAIN_RELATION_CLEANUP FALSE) test;
>
> could be interpreted as meaning we should vacuum the TOAST table and
> analyze the main relation.  Since the word "cleanup" is present in the
> option name, this might not be too confusing.

I've attached v3 of the patch, which removes the restriction on
ANALYZE with MAIN_RELATION_CLEANUP disabled.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM

От
"Bossart, Nathan"
Дата:
Here is a rebased version of the patch.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Justin Pryzby
Дата:
On Sun, May 31, 2020 at 10:13:39PM +0000, Bossart, Nathan wrote:
> Here is a rebased version of the patch.

Should bin/vacuumdb support this?

Should vacuumdb have a way to pass an arbitrary option to the server, instead
of tacking on options (which are frequently forgotten on the initial commit to
the backend VACUUM command) ?  That has the advantage that vacuumdb could use
new options even when connecting to a new server version than client.  I think
it would be safe as long as it avoided characters like ')' and ';'.  Maybe
all that's needed is isdigit() || isalpha() || isspace() || c=='_'

+    MAIN_RELATION_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
+    TOAST_TABLE_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]

Maybe should be called TOAST_RELATION_CLEANUP 

See attached.

-- 
Justin

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
Hi,

Thanks for taking a look.

On 7/13/20, 11:02 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
> Should bin/vacuumdb support this?

Yes, it should.  I've added it in v5 of the patch.

> Should vacuumdb have a way to pass an arbitrary option to the server, instead
> of tacking on options (which are frequently forgotten on the initial commit to
> the backend VACUUM command) ?  That has the advantage that vacuumdb could use
> new options even when connecting to a new server version than client.  I think
> it would be safe as long as it avoided characters like ')' and ';'.  Maybe
> all that's needed is isdigit() || isalpha() || isspace() || c=='_'

I like the idea of allowing users to specify arbitrary options so that
they are not constrained to the options in the version of vacuumdb
they are using.  I suspect we will still want to keep the vacuumdb
options updated for consistency and ease-of-use, though.  IMO this
deserves its own thread.

> +    MAIN_RELATION_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
> +    TOAST_TABLE_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ]
>
> Maybe should be called TOAST_RELATION_CLEANUP

While using "relation" would be more consistent with the
MAIN_RELATION_CLEANUP option, I initially chose "table" for
consistency with most of the documentation [0].  Thinking further, I
believe this is still the right choice.  While the term "relation"
refers to any type of object tracked in pg_class [1], a TOAST table
can only ever be a TOAST table.  There are no other special TOAST
relation types (e.g. sequences, materialized views).  On the other
hand, it is possible to vacuum other types of "main relations" besides
regular tables (e.g. materialized views), so MAIN_RELATION_CLEANUP
also seems right to me.  Thoughts?

Nathan

[0] https://www.postgresql.org/docs/devel/storage-toast.html
[1] https://www.postgresql.org/docs/devel/catalog-pg-class.html


Вложения

RE: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"k.jamison@fujitsu.com"
Дата:
On Tuesday, July 14, 2020 3:01 AM (GMT+9), Bossart, Nathan wrote:

Hi Nathan,

>On 7/13/20, 11:02 AM, "Justin Pryzby" <pryzby(at)telsasoft(dot)com> wrote:
>> Should bin/vacuumdb support this?
>
>Yes, it should.  I've added it in v5 of the patch.

Thank you for the updated patch. I've joined as a reviewer.
I've also noticed that you have incorporated Justin's suggested vacuumdb support
in the recent patch, but in my opinion it'd be better to split them for better readability.
According to the cfbot, patch applies cleanly and passes all the tests.

[Use Case]
>The main use case I'm targeting is when the level of bloat or
>transaction ages of a relation and its TOAST table have significantly
>diverged.  In these scenarios, it could be beneficial to be able to
>vacuum just one or the other, especially if the tables are large.
>...
>I reused the existing VACOPT_SKIPTOAST option to implement
>SECONDARY_RELATION_CLEANUP.  This option is currently only used for
>autovacuum.

Perhaps this has not gathered much attention yet because it's not experienced
by many, but I don't see any problem with the additional options on manual
VACUUM on top of existing autovacuum cleanups. And I think this is useful
for the special use case mentioned, especially that toast table access is not
in public as per role limitation.

[Docs]
I also agree with "TOAST_TABLE_CLEANUP" and just name the options after the
respective proposed relation types in the future.

+    <term><literal>MAIN_RELATION_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to process the
+      main relation.  This is normally the desired behavior and is the default.
+      Setting this option to false may be useful when it is necessary to only
+      vacuum a relation's corresponding <literal>TOAST</literal> table.

Perhaps it's just my own opinion, but I think the word "process" is vague for
a beginner in postgres reading the documents. OTOH, I know it's also used
in the source code, so I guess it's just the convention. And "process" is
intuititve as "processing tables". Anyway, just my 2 cents & isn't a strong
opinion.

Also, there's an extra space between the 1st and 2nd sentences.


+    <term><literal>TOAST_TABLE_CLEANUP</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>VACUUM</command> should attempt to process the
+      corresponding <literal>TOAST</literal> table for each relation, if one
+      exists.  This is normally the desired behavior and is the default.
+      Setting this option to false may be useful when it is necessary to only
+      vacuum the main relation.  This option cannot be disabled when the
+      <literal>FULL</literal> option is used.

Same comments as above, & extra spaces in between the sentences. 

@@ -1841,9 +1865,16 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
     /*
      * Remember the relation's TOAST relation for later, if the caller asked
      * us to process it.  In VACUUM FULL, though, the toast table is
-     * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
+     * automatically rebuilt by cluster_rel, so we shouldn't recurse to it
+     * unless MAIN_RELATION_CLEANUP is disabled.

The additional last line is a bit confusing (and may be unnecessary/unrelated).
To clarify this thread on VACUUM FULL and my understanding of revised vacuum_rel below,
we allow MAIN_RELATION_CLEANUP option to be disabled (skip processing main relation)
and TOAST_TABLE_CLEANUP should be disabled because cluster_rel() will process the 
toast table anyway.
Is my understanding correct? If yes, then maybe "unless" should be "even if" instead,
or we can just remove the line.

 static bool
-vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params)
+vacuum_rel(Oid relid,
+           RangeVar *relation,
+           VacuumParams *params,
+           bool processing_toast_table)
{
...
+    bool        process_toast;
...

-    if (!(params->options & VACOPT_SKIPTOAST) && !(params->options & VACOPT_FULL))
+    process_toast = (params->options & VACOPT_TOAST_CLEANUP) != 0;
+
+    if ((params->options & VACOPT_FULL) != 0 &&
+        (params->options & VACOPT_MAIN_REL_CLEANUP) != 0)
+        process_toast = false;
+
+    if (process_toast)
         toast_relid = onerel->rd_rel->reltoastrelid;
     else
         toast_relid = InvalidOid;
...

      * Do the actual work --- either FULL or "lazy" vacuum
+     *
+     * We skip this part if we're processing the main relation and
+     * MAIN_RELATION_CLEANUP has been disabled.
      */
-    if (params->options & VACOPT_FULL)
+    if ((params->options & VACOPT_MAIN_REL_CLEANUP) != 0 ||
+        processing_toast_table)
...
     if (toast_relid != InvalidOid)
-        vacuum_rel(toast_relid, NULL, params);
+        vacuum_rel(toast_relid, NULL, params, true);



>I've attached v3 of the patch, which removes the restriction on
>ANALYZE with MAIN_RELATION_CLEANUP disabled.

I've also confirmed those through regression + tap test in my own env
and they've passed. I'll look into deeply again if I find problems.

I think this follows the similar course of previously added VACUUM and
vacuummdb options (for using and skipping truncate, index cleanup, etc.),
so the patch seems almost plausible enough for me.

Regards,
Kirk

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Tue, Jul 14, 2020 at 05:34:01AM +0000, k.jamison@fujitsu.com wrote:
> I've also confirmed those through regression + tap test in my own env
> and they've passed. I'll look into deeply again if I find problems.

+   VACOPT_TOAST_CLEANUP = 1 << 6,  /* process TOAST table, if any */
+   VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,  /* don't skip any pages */
+   VACOPT_MAIN_REL_CLEANUP = 1 << 8    /* process main relation */
 } VacuumOption;

Do we actually need this much complication in the option set?  It is
possible to vacuum directly a toast table by passing directly its
relation name, with pg_toast as schema, so you can already vacuum a
toast relation without the main part.  And I would guess that users
caring about the toast table specifically would know already how to do
that, even if it requires a simple script and a query on pg_class.
Now there is a second part, where we'd like to vacuum the main
relation but not its toast table.  My feeling by looking at this patch
today is that we could just make VACOPT_SKIPTOAST an option available
at user-level, and support all the cases discussed on this thread.
And we have already all the code in place to support that in the
backend for autovacuum as relations are processed individually,
without their toast tables if they have one.

> I think this follows the similar course of previously added VACUUM and
> vacuummdb options (for using and skipping truncate, index cleanup, etc.),
> so the patch seems almost plausible enough for me.

-static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
+static bool vacuum_rel(Oid relid,
+                      RangeVar *relation,
+                      VacuumParams *params,
+                      bool processing_toast_table);

Not much a fan of the addition of this parameter on this routine to
track down if the call should process a toast relation or not.
Couldn't you just prevent the call to vacuum_rel() to happen at all?
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 8/2/20, 11:47 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> +   VACOPT_TOAST_CLEANUP = 1 << 6,  /* process TOAST table, if any */
> +   VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,  /* don't skip any pages */
> +   VACOPT_MAIN_REL_CLEANUP = 1 << 8    /* process main relation */
>  } VacuumOption;
>
> Do we actually need this much complication in the option set?  It is
> possible to vacuum directly a toast table by passing directly its
> relation name, with pg_toast as schema, so you can already vacuum a
> toast relation without the main part.  And I would guess that users
> caring about the toast table specifically would know already how to do
> that, even if it requires a simple script and a query on pg_class.
> Now there is a second part, where we'd like to vacuum the main
> relation but not its toast table.  My feeling by looking at this patch
> today is that we could just make VACOPT_SKIPTOAST an option available
> at user-level, and support all the cases discussed on this thread.
> And we have already all the code in place to support that in the
> backend for autovacuum as relations are processed individually,
> without their toast tables if they have one.

My main motive for adding the MAIN_RELATION_CLEANUP option is to allow
table owners to easily vacuum only a relation's TOAST table.  Roles do
not have access to the pg_toast schema by default, so they might be
restricted from vacuuming their TOAST tables directly.

> -static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params);
> +static bool vacuum_rel(Oid relid,
> +                      RangeVar *relation,
> +                      VacuumParams *params,
> +                      bool processing_toast_table);
>
> Not much a fan of the addition of this parameter on this routine to
> track down if the call should process a toast relation or not.
> Couldn't you just prevent the call to vacuum_rel() to happen at all?

I think it would be possible to skip calling vacuum_rel() from
expand_vacuum_rel()/get_all_vacuum_rels() as appropriate, but when I
looked into that approach originally, I was concerned that it would
add complexity to the lookups and ownership checks (especially the
partition lookup logic).  The main tradeoffs of the approach I went
with are that we still create a transaction for the main relation and
that we still lock the main relation.

Nathan


Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Masahiko Sawada
Дата:
On Mon, 3 Aug 2020 at 15:47, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jul 14, 2020 at 05:34:01AM +0000, k.jamison@fujitsu.com wrote:
> > I've also confirmed those through regression + tap test in my own env
> > and they've passed. I'll look into deeply again if I find problems.
>
> +   VACOPT_TOAST_CLEANUP = 1 << 6,  /* process TOAST table, if any */
> +   VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7,  /* don't skip any pages */
> +   VACOPT_MAIN_REL_CLEANUP = 1 << 8    /* process main relation */
>  } VacuumOption;
>
> Do we actually need this much complication in the option set?  It is
> possible to vacuum directly a toast table by passing directly its
> relation name, with pg_toast as schema, so you can already vacuum a
> toast relation without the main part.  And I would guess that users
> caring about the toast table specifically would know already how to do
> that, even if it requires a simple script and a query on pg_class.

Yeah, I also doubt we really need to have this option in the core just
for the purpose of easily specifying toast relation to VACUUM command.
If the user doesn't know how to search the toast relation, I think we
can provide a script or an SQL function executes vacuum() C function
with the toast relation fetched by using the main relation. I
personally think VACUUM option basically should be present to control
the vacuum internal behavior granularly that the user cannot control
from outside, although there are some exceptions: FREEZE and ANALYZE.

Regards,


--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Wed, Aug 05, 2020 at 12:56:48AM +0000, Bossart, Nathan wrote:
> My main motive for adding the MAIN_RELATION_CLEANUP option is to allow
> table owners to easily vacuum only a relation's TOAST table.  Roles do
> not have access to the pg_toast schema by default, so they might be
> restricted from vacuuming their TOAST tables directly.

True that you need an extra GRANT USAGE ON pg_toast to achieve that
for users with no privileges, but that's not impossible now either.  I
am not sure that this use-case justifies a new option and more
complications in the code paths of vacuum though.  So let's see first
if others have an opinion to offer.
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Thu, Aug 06, 2020 at 11:50:06AM +0900, Michael Paquier wrote:
> True that you need an extra GRANT USAGE ON pg_toast to achieve that
> for users with no privileges, but that's not impossible now either.  I
> am not sure that this use-case justifies a new option and more
> complications in the code paths of vacuum though.  So let's see first
> if others have an opinion to offer.

Seeing nothing happening here, I am marking the CF entry as returned
with feedback.  FWIW, I still tend to think that we could call this
stuff a day if we had an option to skip a toast relation when willing
to vacuum the parent relation.
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Justin Pryzby
Дата:
On Fri, Jan 24, 2020 at 09:24:45PM +0000, Bossart, Nathan wrote:
> On 1/21/20, 1:39 PM, "Vik Fearing" <vik.fearing@2ndquadrant.com> wrote:
> > On 21/01/2020 22:21, Bossart, Nathan wrote:
> >> I've attached a patch for a couple of new options for VACUUM:
> >> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP.  The motive
> >> behind these options is to allow table owners to easily vacuum only
> >> the TOAST table or only the main relation.  This is especially useful
> >> for TOAST tables since roles do not have access to the pg_toast schema
> >> by default and some users may find it difficult to discover the name
> >> of a relation's TOAST table.
> >
> >
> > Could you explain why one would want to do this?  Autovacuum will
> > already deal with the tables separately as needed, but I don't see when
> > a manual vacuum would want to make this distinction.
> 
> The main use case I'm targeting is when the level of bloat or
> transaction ages of a relation and its TOAST table have significantly
> diverged.  In these scenarios, it could be beneficial to be able to
> vacuum just one or the other, especially if the tables are large.

This just came up for me:

I have a daily maintenance script which pro-actively vacuums tables: freezing
historic partitions, vacuuming current tables if the table's relfrozenxid is
old, and to encourage indexonly scan.

I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly
its toast) whenever either is getting old.

But it'd be more ideal if I could independently vacuum the main table if it's
old, but not the toast table.

-- 
Justin



Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 1/27/21, 11:07 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
> This just came up for me:
>
> I have a daily maintenance script which pro-actively vacuums tables: freezing
> historic partitions, vacuuming current tables if the table's relfrozenxid is
> old, and to encourage indexonly scan.
>
> I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly
> its toast) whenever either is getting old.
>
> But it'd be more ideal if I could independently vacuum the main table if it's
> old, but not the toast table.

Thanks for chiming in.

It looks like we were leaning towards only adding the
TOAST_TABLE_CLEANUP option, which is already implemented internally
with VACOPT_SKIPTOAST.  It's already possible to vacuum a TOAST table
directly, so we can probably do without the MAIN_RELATION_CLEANUP
option.

I've attached a new patch that only adds TOAST_TABLE_CLEANUP.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Justin Pryzby
Дата:
On Wed, Jan 27, 2021 at 11:16:26PM +0000, Bossart, Nathan wrote:
> On 1/27/21, 11:07 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
> > This just came up for me:
> >
> > I have a daily maintenance script which pro-actively vacuums tables: freezing
> > historic partitions, vacuuming current tables if the table's relfrozenxid is
> > old, and to encourage indexonly scan.
> >
> > I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly
> > its toast) whenever either is getting old.
> >
> > But it'd be more ideal if I could independently vacuum the main table if it's
> > old, but not the toast table.
> 
> Thanks for chiming in.
> 
> It looks like we were leaning towards only adding the
> TOAST_TABLE_CLEANUP option, which is already implemented internally
> with VACOPT_SKIPTOAST.  It's already possible to vacuum a TOAST table
> directly, so we can probably do without the MAIN_RELATION_CLEANUP
> option.
> 
> I've attached a new patch that only adds TOAST_TABLE_CLEANUP.

Thanks, I wrote my message after running into the issue and remembered this
thread.  I didn't necessarily mean to send another patch :)

My only comment is on the name: TOAST_TABLE_CLEANUP.  "Cleanup" suggests that
the (main or only) purpose is to "clean" dead tuples to avoid bloat.  But in my
use case, the main purpose is to avoid XID wraparound (or its warnings).

Okay, my second only comment is that this:

| This option cannot be disabled when the <literal>FULL</literal> option is
| used.

Should it instead be ignored if FULL is also specified ?  Currently only
PARALLEL and DISABLE_PAGE_SKIPPING cause an error when used with FULL.  That's
documented for PARALLEL, but I think it should also be documented for
DISABLE_PAGE_SKIPPING (which is however an advanced option).

-- 
Justin



Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 1/27/21, 5:08 PM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
> Thanks, I wrote my message after running into the issue and remembered this
> thread.  I didn't necessarily mean to send another patch :)

No worries.  I lost track of this thread, but I don't mind picking it
up again.

> My only comment is on the name: TOAST_TABLE_CLEANUP.  "Cleanup" suggests that
> the (main or only) purpose is to "clean" dead tuples to avoid bloat.  But in my
> use case, the main purpose is to avoid XID wraparound (or its warnings).

I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm
not wedded to that name.  What do you think about PROCESS_TOAST_TABLE?

> Okay, my second only comment is that this:
>
> | This option cannot be disabled when the <literal>FULL</literal> option is
> | used.
>
> Should it instead be ignored if FULL is also specified ?  Currently only
> PARALLEL and DISABLE_PAGE_SKIPPING cause an error when used with FULL.  That's
> documented for PARALLEL, but I think it should also be documented for
> DISABLE_PAGE_SKIPPING (which is however an advanced option).

IMO we should emit an ERROR in this case.  If we ignored it, we'd end
up processing the TOAST table even though the user asked us to skip
it.

Nathan


Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Thu, Jan 28, 2021 at 06:16:09PM +0000, Bossart, Nathan wrote:
> I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm
> not wedded to that name.  What do you think about PROCESS_TOAST_TABLE?

Most of the other options use a verb, so using PROCESS, or even SKIP
sounds like a good idea.  More ideas: PROCESS_TOAST, SKIP_TOAST.  I
don't like much the term CLEANUP here, as it may imply, at least to
me, that the toast relation is getting partially processed.

> IMO we should emit an ERROR in this case.  If we ignored it, we'd end
> up processing the TOAST table even though the user asked us to skip
> it.

Issuing an error makes the most sense to me per the argument based on
cluster_rel() and copy_table_data().  Silently ignoring options can be
confusing for the end-user.

+       <para>
+        Do not clean up the TOAST table.
+       </para>
Is that enough?  I would say instead: "Skip the TOAST table associated
to the table to vacuum, if any."
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 1/28/21, 11:15 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Thu, Jan 28, 2021 at 06:16:09PM +0000, Bossart, Nathan wrote:
>> I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm
>> not wedded to that name.  What do you think about PROCESS_TOAST_TABLE?
>
> Most of the other options use a verb, so using PROCESS, or even SKIP
> sounds like a good idea.  More ideas: PROCESS_TOAST, SKIP_TOAST.  I
> don't like much the term CLEANUP here, as it may imply, at least to
> me, that the toast relation is getting partially processed.

I changed it to PROCESS_TOAST.

> +       <para>
> +        Do not clean up the TOAST table.
> +       </para>
> Is that enough?  I would say instead: "Skip the TOAST table associated
> to the table to vacuum, if any."

Done.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote:
> I changed it to PROCESS_TOAST.

Thanks.  PROCESS_TOAST sounds good to me at the end for the option
name, so let's just go with that.

> Done.

While on it, I could not resist with changing VACOPT_SKIPTOAST to
VACOPT_PROCESS_TOAST on consistency grounds.  This is used only in
four places in the code, so that's not invasive.

What do you think?
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Justin Pryzby
Дата:
On Mon, Feb 08, 2021 at 04:35:19PM +0900, Michael Paquier wrote:
> On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote:
> > I changed it to PROCESS_TOAST.
> 
> Thanks.  PROCESS_TOAST sounds good to me at the end for the option
> name, so let's just go with that.
> 
> > Done.
> 
> While on it, I could not resist with changing VACOPT_SKIPTOAST to
> VACOPT_PROCESS_TOAST on consistency grounds.  This is used only in
> four places in the code, so that's not invasive.

+1

> @@ -971,6 +998,7 @@ help(const char *progname)
>      printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
>      printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
>      printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
> +    printf(_("      --no-process-toast          skip the TOAST table associated to the table to vacuum, if
any\n"));

say "associated WITH"

> +      corresponding <literal>TOAST</literal> table for each relation, if one
> +      exists. This is normally the desired behavior and is the default.
> +      Setting this option to false may be useful when it is necessary to only

Maybe it should say "when it is only necessary to"
But what you've written isn't wrong, depending on what you mean.

> @@ -244,6 +244,21 @@ PostgreSQL documentation
> +        Skip the TOAST table associated to the table to vacuum, if any.

associatd with

-- 
Justin



Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 2/8/21, 12:47 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote:
> On Mon, Feb 08, 2021 at 04:35:19PM +0900, Michael Paquier wrote:
>> On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote:
>> > I changed it to PROCESS_TOAST.
>>
>> Thanks.  PROCESS_TOAST sounds good to me at the end for the option
>> name, so let's just go with that.
>>
>> > Done.
>>
>> While on it, I could not resist with changing VACOPT_SKIPTOAST to
>> VACOPT_PROCESS_TOAST on consistency grounds.  This is used only in
>> four places in the code, so that's not invasive.
>
> +1

+1

>> @@ -971,6 +998,7 @@ help(const char *progname)
>>       printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
>>       printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
>>       printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
>> +     printf(_("      --no-process-toast          skip the TOAST table associated to the table to vacuum, if
any\n"));
>
> say "associated WITH"
>
>> +      corresponding <literal>TOAST</literal> table for each relation, if one
>> +      exists. This is normally the desired behavior and is the default.
>> +      Setting this option to false may be useful when it is necessary to only
>
> Maybe it should say "when it is only necessary to"
> But what you've written isn't wrong, depending on what you mean.
>
>> @@ -244,6 +244,21 @@ PostgreSQL documentation
>> +        Skip the TOAST table associated to the table to vacuum, if any.
>
> associatd with

These suggestions seem reasonable to me.  I've applied them in v9.

Nathan


Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
Michael Paquier
Дата:
On Mon, Feb 08, 2021 at 06:59:45PM +0000, Bossart, Nathan wrote:
> These suggestions seem reasonable to me.  I've applied them in v9.

Sounds good to me, so applied.
--
Michael

Вложения

Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

От
"Bossart, Nathan"
Дата:
On 2/8/21, 9:19 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Mon, Feb 08, 2021 at 06:59:45PM +0000, Bossart, Nathan wrote:
>> These suggestions seem reasonable to me.  I've applied them in v9.
>
> Sounds good to me, so applied.

Thanks!

Nathan