Обсуждение: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options toVACUUM
Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options toVACUUM
От
"Bossart, Nathan"
Дата:
Hi hackers, I've attached a patch for a couple of new options for VACUUM: MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive behind these options is to allow table owners to easily vacuum only the TOAST table or only the main relation. This is especially useful for TOAST tables since roles do not have access to the pg_toast schema by default and some users may find it difficult to discover the name of a relation's TOAST table. Next, I will explain a couple of the main design decisions. I chose to call the option SECONDARY_RELATION_CLEANUP instead of something like TOAST_TABLE_CLEANUP for two reasons. First, other types of secondary relations may be added in the future, and it may be convenient to put them under the umbrella of this option. Second, it seemed like it could be outside of the project's style to use the name of internal storage mechanisms in a user-facing VACUUM option. However, I am not wedded to the chosen name, as I am sure there are good arguments for something like TOAST_TABLE_CLEANUP. I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead of expand_vacuum_rel()/get_all_vacuum_rels(). This allows us to reuse most of the existing code with minimal changes, and it avoids adding complexity to the lookups and ownership checks in expand_vacuum_rel() and get_all_vacuum_rels() (especially the partition lookup logic). The main tradeoffs of this approach are that we will still create a transaction for the main relation and that we will still lock the main relation. I reused the existing VACOPT_SKIPTOAST option to implement SECONDARY_RELATION_CLEANUP. This option is currently only used for autovacuum. I chose to disallow disabling both *_RELATION_CLEANUP options together, as this would essentially cause the VACUUM command to take no action. I disallowed using FULL when SECONDARY_RELATION_CLEANUP is disabled, as the TOAST table is automatically rebuilt by cluster_rel(). I do allow using FULL when MAIN_RELATION_CLEANUP is disabled, which is taken to mean that cluster_rel() should be run on the TOAST table. Finally, I disallowed using ANALYZE when MAIN_RELATION_CLEANUP is disabled, as it is not presently possible to analyze TOAST tables. I will add this patch to the next commitfest. I look forward to your feedback. Nathan
Вложения
On 21/01/2020 22:21, Bossart, Nathan wrote: > I've attached a patch for a couple of new options for VACUUM: > MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive > behind these options is to allow table owners to easily vacuum only > the TOAST table or only the main relation. This is especially useful > for TOAST tables since roles do not have access to the pg_toast schema > by default and some users may find it difficult to discover the name > of a relation's TOAST table. Could you explain why one would want to do this? Autovacuum will already deal with the tables separately as needed, but I don't see when a manual vacuum would want to make this distinction. -- Vik Fearing
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
Michael Paquier
Дата:
On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote: > I've attached a patch for a couple of new options for VACUUM: > MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive > behind these options is to allow table owners to easily vacuum only > the TOAST table or only the main relation. This is especially useful > for TOAST tables since roles do not have access to the pg_toast schema > by default and some users may find it difficult to discover the name > of a relation's TOAST table. Next, I will explain a couple of the > main design decisions. So that's similar to the autovacuum reloptions, but to be able to enforce one policy or another manually. Any issues with autovacuum not able to keep up the bloat pace and where you need to issue manual VACUUMs in periods of low activity, like nightly VACUUMs? > I chose to call the option SECONDARY_RELATION_CLEANUP instead of > something like TOAST_TABLE_CLEANUP for two reasons. First, other > types of secondary relations may be added in the future, and it may be > convenient to put them under the umbrella of this option. Second, it > seemed like it could be outside of the project's style to use the name > of internal storage mechanisms in a user-facing VACUUM option. > However, I am not wedded to the chosen name, as I am sure there are > good arguments for something like TOAST_TABLE_CLEANUP. If other types of relations are added in the future, wouldn't it make sense to have one switch for each one of those types then? A relation could have a toast relation associated to it, as much as a foo relation or a hoge relation, in which case SECONDARY brings little control. > I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead > of expand_vacuum_rel()/get_all_vacuum_rels(). This allows us to reuse > most of the existing code with minimal changes, and it avoids adding > complexity to the lookups and ownership checks in expand_vacuum_rel() > and get_all_vacuum_rels() (especially the partition lookup logic). > The main tradeoffs of this approach are that we will still create a > transaction for the main relation and that we will still lock the main > relation. Yeah, likely we should not make things more confusing in this area. This was tricky enough to deal with with the recent VACUUM refactoring for multiple relations. > I reused the existing VACOPT_SKIPTOAST option to implement > SECONDARY_RELATION_CLEANUP. This option is currently only used for > autovacuum. My take would be to rename this option, and reuse it for consistency. > I chose to disallow disabling both *_RELATION_CLEANUP options > together, as this would essentially cause the VACUUM command to take > no action. My first reaction is why? Agreed that it is a bit crazy to combine both options, but if you add the argument related to more relation types like toast.. -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
"Bossart, Nathan"
Дата:
On 1/21/20, 1:39 PM, "Vik Fearing" <vik.fearing@2ndquadrant.com> wrote: > On 21/01/2020 22:21, Bossart, Nathan wrote: >> I've attached a patch for a couple of new options for VACUUM: >> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive >> behind these options is to allow table owners to easily vacuum only >> the TOAST table or only the main relation. This is especially useful >> for TOAST tables since roles do not have access to the pg_toast schema >> by default and some users may find it difficult to discover the name >> of a relation's TOAST table. > > > Could you explain why one would want to do this? Autovacuum will > already deal with the tables separately as needed, but I don't see when > a manual vacuum would want to make this distinction. The main use case I'm targeting is when the level of bloat or transaction ages of a relation and its TOAST table have significantly diverged. In these scenarios, it could be beneficial to be able to vacuum just one or the other, especially if the tables are large. Nathan
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
"Bossart, Nathan"
Дата:
Hi Michael, Thanks for taking a look. On 1/21/20, 9:02 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote: >> I've attached a patch for a couple of new options for VACUUM: >> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive >> behind these options is to allow table owners to easily vacuum only >> the TOAST table or only the main relation. This is especially useful >> for TOAST tables since roles do not have access to the pg_toast schema >> by default and some users may find it difficult to discover the name >> of a relation's TOAST table. Next, I will explain a couple of the >> main design decisions. > > So that's similar to the autovacuum reloptions, but to be able to > enforce one policy or another manually. Any issues with autovacuum > not able to keep up the bloat pace and where you need to issue manual > VACUUMs in periods of low activity, like nightly VACUUMs? There have been a couple of occasions where I have seen the TOAST table become the most bloated part of the relation. When this happens, it would be handy to be able to avoid scanning the heap and indexes. I am not aware of any concrete problems with autovacuum other than needing to tune the parameters for certain workloads. >> I chose to call the option SECONDARY_RELATION_CLEANUP instead of >> something like TOAST_TABLE_CLEANUP for two reasons. First, other >> types of secondary relations may be added in the future, and it may be >> convenient to put them under the umbrella of this option. Second, it >> seemed like it could be outside of the project's style to use the name >> of internal storage mechanisms in a user-facing VACUUM option. >> However, I am not wedded to the chosen name, as I am sure there are >> good arguments for something like TOAST_TABLE_CLEANUP. > > If other types of relations are added in the future, wouldn't it make > sense to have one switch for each one of those types then? A relation > could have a toast relation associated to it, as much as a foo > relation or a hoge relation, in which case SECONDARY brings little > control. This is a good point. I've renamed the option to TOAST_TABLE_CLEANUP in v2. >> I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead >> of expand_vacuum_rel()/get_all_vacuum_rels(). This allows us to reuse >> most of the existing code with minimal changes, and it avoids adding >> complexity to the lookups and ownership checks in expand_vacuum_rel() >> and get_all_vacuum_rels() (especially the partition lookup logic). >> The main tradeoffs of this approach are that we will still create a >> transaction for the main relation and that we will still lock the main >> relation. > > Yeah, likely we should not make things more confusing in this area. > This was tricky enough to deal with with the recent VACUUM > refactoring for multiple relations. Finding a way to avoid the lock on the main relation could be a future improvement, as that would allow you to manually vacuum both the main relation and its TOAST table in parallel. >> I reused the existing VACOPT_SKIPTOAST option to implement >> SECONDARY_RELATION_CLEANUP. This option is currently only used for >> autovacuum. > > My take would be to rename this option, and reuse it for consistency. Done. >> I chose to disallow disabling both *_RELATION_CLEANUP options >> together, as this would essentially cause the VACUUM command to take >> no action. > > My first reaction is why? Agreed that it is a bit crazy to combine > both options, but if you add the argument related to more relation > types like toast.. Yes, I suppose we have the same problem if you disable MAIN_RELATION_CLEANUP and the relation has no TOAST table. In any case, allowing both options to be disabled shouldn't hurt anything. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
Michael Paquier
Дата:
On Fri, Jan 24, 2020 at 09:31:26PM +0000, Bossart, Nathan wrote: > On 1/21/20, 9:02 PM, "Michael Paquier" <michael@paquier.xyz> wrote: >> On Tue, Jan 21, 2020 at 09:21:46PM +0000, Bossart, Nathan wrote: >>> I've attached a patch for a couple of new options for VACUUM: >>> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive >>> behind these options is to allow table owners to easily vacuum only >>> the TOAST table or only the main relation. This is especially useful >>> for TOAST tables since roles do not have access to the pg_toast schema >>> by default and some users may find it difficult to discover the name >>> of a relation's TOAST table. Next, I will explain a couple of the >>> main design decisions. >> >> So that's similar to the autovacuum reloptions, but to be able to >> enforce one policy or another manually. Any issues with autovacuum >> not able to keep up the bloat pace and where you need to issue manual >> VACUUMs in periods of low activity, like nightly VACUUMs? > > There have been a couple of occasions where I have seen the TOAST > table become the most bloated part of the relation. When this > happens, it would be handy to be able to avoid scanning the heap and > indexes. I am not aware of any concrete problems with autovacuum > other than needing to tune the parameters for certain workloads. That's something I have faced as well. I have some applications around here where toast tables were the most bloated, and the vacuuming of the main relation ate time, putting more pressure on the vacuuming of the toast relation. So that's a fair argument in my opinion. >>> I chose to implement MAIN_RELATION_CLEANUP within vacuum_rel() instead >>> of expand_vacuum_rel()/get_all_vacuum_rels(). This allows us to reuse >>> most of the existing code with minimal changes, and it avoids adding >>> complexity to the lookups and ownership checks in expand_vacuum_rel() >>> and get_all_vacuum_rels() (especially the partition lookup logic). >>> The main tradeoffs of this approach are that we will still create a >>> transaction for the main relation and that we will still lock the main >>> relation. >> >> Yeah, likely we should not make things more confusing in this area. >> This was tricky enough to deal with with the recent VACUUM >> refactoring for multiple relations. > > Finding a way to avoid the lock on the main relation could be a future > improvement, as that would allow you to manually vacuum both the main > relation and its TOAST table in parallel. I am not sure that we actually need that at all, any catalog changes take a lock on the parent relation first, and that's the conflicts we are looking at here with a share update exclusive lock. -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
"Bossart, Nathan"
Дата:
On 1/24/20, 2:14 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote: >> Yes, I suppose we have the same problem if you disable >> MAIN_RELATION_CLEANUP and the relation has no TOAST table. In any >> case, allowing both options to be disabled shouldn't hurt anything. > > I've been thinking further in this area, and I'm wondering if it also > makes sense to remove the restriction on ANALYZE with > MAIN_RELATION_CLEANUP disabled. A command like > > VACUUM (ANALYZE, MAIN_RELATION_CLEANUP FALSE) test; > > could be interpreted as meaning we should vacuum the TOAST table and > analyze the main relation. Since the word "cleanup" is present in the > option name, this might not be too confusing. I've attached v3 of the patch, which removes the restriction on ANALYZE with MAIN_RELATION_CLEANUP disabled. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP optionsto VACUUM
От
"Bossart, Nathan"
Дата:
Here is a rebased version of the patch. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Justin Pryzby
Дата:
On Sun, May 31, 2020 at 10:13:39PM +0000, Bossart, Nathan wrote: > Here is a rebased version of the patch. Should bin/vacuumdb support this? Should vacuumdb have a way to pass an arbitrary option to the server, instead of tacking on options (which are frequently forgotten on the initial commit to the backend VACUUM command) ? That has the advantage that vacuumdb could use new options even when connecting to a new server version than client. I think it would be safe as long as it avoided characters like ')' and ';'. Maybe all that's needed is isdigit() || isalpha() || isspace() || c=='_' + MAIN_RELATION_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ] + TOAST_TABLE_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ] Maybe should be called TOAST_RELATION_CLEANUP See attached. -- Justin
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
Hi, Thanks for taking a look. On 7/13/20, 11:02 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > Should bin/vacuumdb support this? Yes, it should. I've added it in v5 of the patch. > Should vacuumdb have a way to pass an arbitrary option to the server, instead > of tacking on options (which are frequently forgotten on the initial commit to > the backend VACUUM command) ? That has the advantage that vacuumdb could use > new options even when connecting to a new server version than client. I think > it would be safe as long as it avoided characters like ')' and ';'. Maybe > all that's needed is isdigit() || isalpha() || isspace() || c=='_' I like the idea of allowing users to specify arbitrary options so that they are not constrained to the options in the version of vacuumdb they are using. I suspect we will still want to keep the vacuumdb options updated for consistency and ease-of-use, though. IMO this deserves its own thread. > + MAIN_RELATION_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ] > + TOAST_TABLE_CLEANUP [ <replaceable class="parameter">boolean</replaceable> ] > > Maybe should be called TOAST_RELATION_CLEANUP While using "relation" would be more consistent with the MAIN_RELATION_CLEANUP option, I initially chose "table" for consistency with most of the documentation [0]. Thinking further, I believe this is still the right choice. While the term "relation" refers to any type of object tracked in pg_class [1], a TOAST table can only ever be a TOAST table. There are no other special TOAST relation types (e.g. sequences, materialized views). On the other hand, it is possible to vacuum other types of "main relations" besides regular tables (e.g. materialized views), so MAIN_RELATION_CLEANUP also seems right to me. Thoughts? Nathan [0] https://www.postgresql.org/docs/devel/storage-toast.html [1] https://www.postgresql.org/docs/devel/catalog-pg-class.html
Вложения
RE: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"k.jamison@fujitsu.com"
Дата:
On Tuesday, July 14, 2020 3:01 AM (GMT+9), Bossart, Nathan wrote: Hi Nathan, >On 7/13/20, 11:02 AM, "Justin Pryzby" <pryzby(at)telsasoft(dot)com> wrote: >> Should bin/vacuumdb support this? > >Yes, it should. I've added it in v5 of the patch. Thank you for the updated patch. I've joined as a reviewer. I've also noticed that you have incorporated Justin's suggested vacuumdb support in the recent patch, but in my opinion it'd be better to split them for better readability. According to the cfbot, patch applies cleanly and passes all the tests. [Use Case] >The main use case I'm targeting is when the level of bloat or >transaction ages of a relation and its TOAST table have significantly >diverged. In these scenarios, it could be beneficial to be able to >vacuum just one or the other, especially if the tables are large. >... >I reused the existing VACOPT_SKIPTOAST option to implement >SECONDARY_RELATION_CLEANUP. This option is currently only used for >autovacuum. Perhaps this has not gathered much attention yet because it's not experienced by many, but I don't see any problem with the additional options on manual VACUUM on top of existing autovacuum cleanups. And I think this is useful for the special use case mentioned, especially that toast table access is not in public as per role limitation. [Docs] I also agree with "TOAST_TABLE_CLEANUP" and just name the options after the respective proposed relation types in the future. + <term><literal>MAIN_RELATION_CLEANUP</literal></term> + <listitem> + <para> + Specifies that <command>VACUUM</command> should attempt to process the + main relation. This is normally the desired behavior and is the default. + Setting this option to false may be useful when it is necessary to only + vacuum a relation's corresponding <literal>TOAST</literal> table. Perhaps it's just my own opinion, but I think the word "process" is vague for a beginner in postgres reading the documents. OTOH, I know it's also used in the source code, so I guess it's just the convention. And "process" is intuititve as "processing tables". Anyway, just my 2 cents & isn't a strong opinion. Also, there's an extra space between the 1st and 2nd sentences. + <term><literal>TOAST_TABLE_CLEANUP</literal></term> + <listitem> + <para> + Specifies that <command>VACUUM</command> should attempt to process the + corresponding <literal>TOAST</literal> table for each relation, if one + exists. This is normally the desired behavior and is the default. + Setting this option to false may be useful when it is necessary to only + vacuum the main relation. This option cannot be disabled when the + <literal>FULL</literal> option is used. Same comments as above, & extra spaces in between the sentences. @@ -1841,9 +1865,16 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) /* * Remember the relation's TOAST relation for later, if the caller asked * us to process it. In VACUUM FULL, though, the toast table is - * automatically rebuilt by cluster_rel so we shouldn't recurse to it. + * automatically rebuilt by cluster_rel, so we shouldn't recurse to it + * unless MAIN_RELATION_CLEANUP is disabled. The additional last line is a bit confusing (and may be unnecessary/unrelated). To clarify this thread on VACUUM FULL and my understanding of revised vacuum_rel below, we allow MAIN_RELATION_CLEANUP option to be disabled (skip processing main relation) and TOAST_TABLE_CLEANUP should be disabled because cluster_rel() will process the toast table anyway. Is my understanding correct? If yes, then maybe "unless" should be "even if" instead, or we can just remove the line. static bool -vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params) +vacuum_rel(Oid relid, + RangeVar *relation, + VacuumParams *params, + bool processing_toast_table) { ... + bool process_toast; ... - if (!(params->options & VACOPT_SKIPTOAST) && !(params->options & VACOPT_FULL)) + process_toast = (params->options & VACOPT_TOAST_CLEANUP) != 0; + + if ((params->options & VACOPT_FULL) != 0 && + (params->options & VACOPT_MAIN_REL_CLEANUP) != 0) + process_toast = false; + + if (process_toast) toast_relid = onerel->rd_rel->reltoastrelid; else toast_relid = InvalidOid; ... * Do the actual work --- either FULL or "lazy" vacuum + * + * We skip this part if we're processing the main relation and + * MAIN_RELATION_CLEANUP has been disabled. */ - if (params->options & VACOPT_FULL) + if ((params->options & VACOPT_MAIN_REL_CLEANUP) != 0 || + processing_toast_table) ... if (toast_relid != InvalidOid) - vacuum_rel(toast_relid, NULL, params); + vacuum_rel(toast_relid, NULL, params, true); >I've attached v3 of the patch, which removes the restriction on >ANALYZE with MAIN_RELATION_CLEANUP disabled. I've also confirmed those through regression + tap test in my own env and they've passed. I'll look into deeply again if I find problems. I think this follows the similar course of previously added VACUUM and vacuummdb options (for using and skipping truncate, index cleanup, etc.), so the patch seems almost plausible enough for me. Regards, Kirk
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Tue, Jul 14, 2020 at 05:34:01AM +0000, k.jamison@fujitsu.com wrote: > I've also confirmed those through regression + tap test in my own env > and they've passed. I'll look into deeply again if I find problems. + VACOPT_TOAST_CLEANUP = 1 << 6, /* process TOAST table, if any */ + VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7, /* don't skip any pages */ + VACOPT_MAIN_REL_CLEANUP = 1 << 8 /* process main relation */ } VacuumOption; Do we actually need this much complication in the option set? It is possible to vacuum directly a toast table by passing directly its relation name, with pg_toast as schema, so you can already vacuum a toast relation without the main part. And I would guess that users caring about the toast table specifically would know already how to do that, even if it requires a simple script and a query on pg_class. Now there is a second part, where we'd like to vacuum the main relation but not its toast table. My feeling by looking at this patch today is that we could just make VACOPT_SKIPTOAST an option available at user-level, and support all the cases discussed on this thread. And we have already all the code in place to support that in the backend for autovacuum as relations are processed individually, without their toast tables if they have one. > I think this follows the similar course of previously added VACUUM and > vacuummdb options (for using and skipping truncate, index cleanup, etc.), > so the patch seems almost plausible enough for me. -static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params); +static bool vacuum_rel(Oid relid, + RangeVar *relation, + VacuumParams *params, + bool processing_toast_table); Not much a fan of the addition of this parameter on this routine to track down if the call should process a toast relation or not. Couldn't you just prevent the call to vacuum_rel() to happen at all? -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 8/2/20, 11:47 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > + VACOPT_TOAST_CLEANUP = 1 << 6, /* process TOAST table, if any */ > + VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7, /* don't skip any pages */ > + VACOPT_MAIN_REL_CLEANUP = 1 << 8 /* process main relation */ > } VacuumOption; > > Do we actually need this much complication in the option set? It is > possible to vacuum directly a toast table by passing directly its > relation name, with pg_toast as schema, so you can already vacuum a > toast relation without the main part. And I would guess that users > caring about the toast table specifically would know already how to do > that, even if it requires a simple script and a query on pg_class. > Now there is a second part, where we'd like to vacuum the main > relation but not its toast table. My feeling by looking at this patch > today is that we could just make VACOPT_SKIPTOAST an option available > at user-level, and support all the cases discussed on this thread. > And we have already all the code in place to support that in the > backend for autovacuum as relations are processed individually, > without their toast tables if they have one. My main motive for adding the MAIN_RELATION_CLEANUP option is to allow table owners to easily vacuum only a relation's TOAST table. Roles do not have access to the pg_toast schema by default, so they might be restricted from vacuuming their TOAST tables directly. > -static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params); > +static bool vacuum_rel(Oid relid, > + RangeVar *relation, > + VacuumParams *params, > + bool processing_toast_table); > > Not much a fan of the addition of this parameter on this routine to > track down if the call should process a toast relation or not. > Couldn't you just prevent the call to vacuum_rel() to happen at all? I think it would be possible to skip calling vacuum_rel() from expand_vacuum_rel()/get_all_vacuum_rels() as appropriate, but when I looked into that approach originally, I was concerned that it would add complexity to the lookups and ownership checks (especially the partition lookup logic). The main tradeoffs of the approach I went with are that we still create a transaction for the main relation and that we still lock the main relation. Nathan
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Masahiko Sawada
Дата:
On Mon, 3 Aug 2020 at 15:47, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jul 14, 2020 at 05:34:01AM +0000, k.jamison@fujitsu.com wrote: > > I've also confirmed those through regression + tap test in my own env > > and they've passed. I'll look into deeply again if I find problems. > > + VACOPT_TOAST_CLEANUP = 1 << 6, /* process TOAST table, if any */ > + VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7, /* don't skip any pages */ > + VACOPT_MAIN_REL_CLEANUP = 1 << 8 /* process main relation */ > } VacuumOption; > > Do we actually need this much complication in the option set? It is > possible to vacuum directly a toast table by passing directly its > relation name, with pg_toast as schema, so you can already vacuum a > toast relation without the main part. And I would guess that users > caring about the toast table specifically would know already how to do > that, even if it requires a simple script and a query on pg_class. Yeah, I also doubt we really need to have this option in the core just for the purpose of easily specifying toast relation to VACUUM command. If the user doesn't know how to search the toast relation, I think we can provide a script or an SQL function executes vacuum() C function with the toast relation fetched by using the main relation. I personally think VACUUM option basically should be present to control the vacuum internal behavior granularly that the user cannot control from outside, although there are some exceptions: FREEZE and ANALYZE. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Wed, Aug 05, 2020 at 12:56:48AM +0000, Bossart, Nathan wrote: > My main motive for adding the MAIN_RELATION_CLEANUP option is to allow > table owners to easily vacuum only a relation's TOAST table. Roles do > not have access to the pg_toast schema by default, so they might be > restricted from vacuuming their TOAST tables directly. True that you need an extra GRANT USAGE ON pg_toast to achieve that for users with no privileges, but that's not impossible now either. I am not sure that this use-case justifies a new option and more complications in the code paths of vacuum though. So let's see first if others have an opinion to offer. -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Thu, Aug 06, 2020 at 11:50:06AM +0900, Michael Paquier wrote: > True that you need an extra GRANT USAGE ON pg_toast to achieve that > for users with no privileges, but that's not impossible now either. I > am not sure that this use-case justifies a new option and more > complications in the code paths of vacuum though. So let's see first > if others have an opinion to offer. Seeing nothing happening here, I am marking the CF entry as returned with feedback. FWIW, I still tend to think that we could call this stuff a day if we had an option to skip a toast relation when willing to vacuum the parent relation. -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Justin Pryzby
Дата:
On Fri, Jan 24, 2020 at 09:24:45PM +0000, Bossart, Nathan wrote: > On 1/21/20, 1:39 PM, "Vik Fearing" <vik.fearing@2ndquadrant.com> wrote: > > On 21/01/2020 22:21, Bossart, Nathan wrote: > >> I've attached a patch for a couple of new options for VACUUM: > >> MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP. The motive > >> behind these options is to allow table owners to easily vacuum only > >> the TOAST table or only the main relation. This is especially useful > >> for TOAST tables since roles do not have access to the pg_toast schema > >> by default and some users may find it difficult to discover the name > >> of a relation's TOAST table. > > > > > > Could you explain why one would want to do this? Autovacuum will > > already deal with the tables separately as needed, but I don't see when > > a manual vacuum would want to make this distinction. > > The main use case I'm targeting is when the level of bloat or > transaction ages of a relation and its TOAST table have significantly > diverged. In these scenarios, it could be beneficial to be able to > vacuum just one or the other, especially if the tables are large. This just came up for me: I have a daily maintenance script which pro-actively vacuums tables: freezing historic partitions, vacuuming current tables if the table's relfrozenxid is old, and to encourage indexonly scan. I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly its toast) whenever either is getting old. But it'd be more ideal if I could independently vacuum the main table if it's old, but not the toast table. -- Justin
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 1/27/21, 11:07 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > This just came up for me: > > I have a daily maintenance script which pro-actively vacuums tables: freezing > historic partitions, vacuuming current tables if the table's relfrozenxid is > old, and to encourage indexonly scan. > > I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly > its toast) whenever either is getting old. > > But it'd be more ideal if I could independently vacuum the main table if it's > old, but not the toast table. Thanks for chiming in. It looks like we were leaning towards only adding the TOAST_TABLE_CLEANUP option, which is already implemented internally with VACOPT_SKIPTOAST. It's already possible to vacuum a TOAST table directly, so we can probably do without the MAIN_RELATION_CLEANUP option. I've attached a new patch that only adds TOAST_TABLE_CLEANUP. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Justin Pryzby
Дата:
On Wed, Jan 27, 2021 at 11:16:26PM +0000, Bossart, Nathan wrote: > On 1/27/21, 11:07 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > > This just came up for me: > > > > I have a daily maintenance script which pro-actively vacuums tables: freezing > > historic partitions, vacuuming current tables if the table's relfrozenxid is > > old, and to encourage indexonly scan. > > > > I'm checking the greatest(age(toast,main)) and vacuum the table (and implicitly > > its toast) whenever either is getting old. > > > > But it'd be more ideal if I could independently vacuum the main table if it's > > old, but not the toast table. > > Thanks for chiming in. > > It looks like we were leaning towards only adding the > TOAST_TABLE_CLEANUP option, which is already implemented internally > with VACOPT_SKIPTOAST. It's already possible to vacuum a TOAST table > directly, so we can probably do without the MAIN_RELATION_CLEANUP > option. > > I've attached a new patch that only adds TOAST_TABLE_CLEANUP. Thanks, I wrote my message after running into the issue and remembered this thread. I didn't necessarily mean to send another patch :) My only comment is on the name: TOAST_TABLE_CLEANUP. "Cleanup" suggests that the (main or only) purpose is to "clean" dead tuples to avoid bloat. But in my use case, the main purpose is to avoid XID wraparound (or its warnings). Okay, my second only comment is that this: | This option cannot be disabled when the <literal>FULL</literal> option is | used. Should it instead be ignored if FULL is also specified ? Currently only PARALLEL and DISABLE_PAGE_SKIPPING cause an error when used with FULL. That's documented for PARALLEL, but I think it should also be documented for DISABLE_PAGE_SKIPPING (which is however an advanced option). -- Justin
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 1/27/21, 5:08 PM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > Thanks, I wrote my message after running into the issue and remembered this > thread. I didn't necessarily mean to send another patch :) No worries. I lost track of this thread, but I don't mind picking it up again. > My only comment is on the name: TOAST_TABLE_CLEANUP. "Cleanup" suggests that > the (main or only) purpose is to "clean" dead tuples to avoid bloat. But in my > use case, the main purpose is to avoid XID wraparound (or its warnings). I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm not wedded to that name. What do you think about PROCESS_TOAST_TABLE? > Okay, my second only comment is that this: > > | This option cannot be disabled when the <literal>FULL</literal> option is > | used. > > Should it instead be ignored if FULL is also specified ? Currently only > PARALLEL and DISABLE_PAGE_SKIPPING cause an error when used with FULL. That's > documented for PARALLEL, but I think it should also be documented for > DISABLE_PAGE_SKIPPING (which is however an advanced option). IMO we should emit an ERROR in this case. If we ignored it, we'd end up processing the TOAST table even though the user asked us to skip it. Nathan
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Thu, Jan 28, 2021 at 06:16:09PM +0000, Bossart, Nathan wrote: > I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm > not wedded to that name. What do you think about PROCESS_TOAST_TABLE? Most of the other options use a verb, so using PROCESS, or even SKIP sounds like a good idea. More ideas: PROCESS_TOAST, SKIP_TOAST. I don't like much the term CLEANUP here, as it may imply, at least to me, that the toast relation is getting partially processed. > IMO we should emit an ERROR in this case. If we ignored it, we'd end > up processing the TOAST table even though the user asked us to skip > it. Issuing an error makes the most sense to me per the argument based on cluster_rel() and copy_table_data(). Silently ignoring options can be confusing for the end-user. + <para> + Do not clean up the TOAST table. + </para> Is that enough? I would say instead: "Skip the TOAST table associated to the table to vacuum, if any." -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 1/28/21, 11:15 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > On Thu, Jan 28, 2021 at 06:16:09PM +0000, Bossart, Nathan wrote: >> I chose TOAST_TABLE_CLEANUP to match the INDEX_CLEANUP option, but I'm >> not wedded to that name. What do you think about PROCESS_TOAST_TABLE? > > Most of the other options use a verb, so using PROCESS, or even SKIP > sounds like a good idea. More ideas: PROCESS_TOAST, SKIP_TOAST. I > don't like much the term CLEANUP here, as it may imply, at least to > me, that the toast relation is getting partially processed. I changed it to PROCESS_TOAST. > + <para> > + Do not clean up the TOAST table. > + </para> > Is that enough? I would say instead: "Skip the TOAST table associated > to the table to vacuum, if any." Done. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote: > I changed it to PROCESS_TOAST. Thanks. PROCESS_TOAST sounds good to me at the end for the option name, so let's just go with that. > Done. While on it, I could not resist with changing VACOPT_SKIPTOAST to VACOPT_PROCESS_TOAST on consistency grounds. This is used only in four places in the code, so that's not invasive. What do you think? -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Justin Pryzby
Дата:
On Mon, Feb 08, 2021 at 04:35:19PM +0900, Michael Paquier wrote: > On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote: > > I changed it to PROCESS_TOAST. > > Thanks. PROCESS_TOAST sounds good to me at the end for the option > name, so let's just go with that. > > > Done. > > While on it, I could not resist with changing VACOPT_SKIPTOAST to > VACOPT_PROCESS_TOAST on consistency grounds. This is used only in > four places in the code, so that's not invasive. +1 > @@ -971,6 +998,7 @@ help(const char *progname) > printf(_(" --min-mxid-age=MXID_AGE minimum multixact ID age of tables to vacuum\n")); > printf(_(" --min-xid-age=XID_AGE minimum transaction ID age of tables to vacuum\n")); > printf(_(" --no-index-cleanup don't remove index entries that point to dead tuples\n")); > + printf(_(" --no-process-toast skip the TOAST table associated to the table to vacuum, if any\n")); say "associated WITH" > + corresponding <literal>TOAST</literal> table for each relation, if one > + exists. This is normally the desired behavior and is the default. > + Setting this option to false may be useful when it is necessary to only Maybe it should say "when it is only necessary to" But what you've written isn't wrong, depending on what you mean. > @@ -244,6 +244,21 @@ PostgreSQL documentation > + Skip the TOAST table associated to the table to vacuum, if any. associatd with -- Justin
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 2/8/21, 12:47 AM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > On Mon, Feb 08, 2021 at 04:35:19PM +0900, Michael Paquier wrote: >> On Fri, Jan 29, 2021 at 06:43:44PM +0000, Bossart, Nathan wrote: >> > I changed it to PROCESS_TOAST. >> >> Thanks. PROCESS_TOAST sounds good to me at the end for the option >> name, so let's just go with that. >> >> > Done. >> >> While on it, I could not resist with changing VACOPT_SKIPTOAST to >> VACOPT_PROCESS_TOAST on consistency grounds. This is used only in >> four places in the code, so that's not invasive. > > +1 +1 >> @@ -971,6 +998,7 @@ help(const char *progname) >> printf(_(" --min-mxid-age=MXID_AGE minimum multixact ID age of tables to vacuum\n")); >> printf(_(" --min-xid-age=XID_AGE minimum transaction ID age of tables to vacuum\n")); >> printf(_(" --no-index-cleanup don't remove index entries that point to dead tuples\n")); >> + printf(_(" --no-process-toast skip the TOAST table associated to the table to vacuum, if any\n")); > > say "associated WITH" > >> + corresponding <literal>TOAST</literal> table for each relation, if one >> + exists. This is normally the desired behavior and is the default. >> + Setting this option to false may be useful when it is necessary to only > > Maybe it should say "when it is only necessary to" > But what you've written isn't wrong, depending on what you mean. > >> @@ -244,6 +244,21 @@ PostgreSQL documentation >> + Skip the TOAST table associated to the table to vacuum, if any. > > associatd with These suggestions seem reasonable to me. I've applied them in v9. Nathan
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
Michael Paquier
Дата:
On Mon, Feb 08, 2021 at 06:59:45PM +0000, Bossart, Nathan wrote: > These suggestions seem reasonable to me. I've applied them in v9. Sounds good to me, so applied. -- Michael
Вложения
Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
От
"Bossart, Nathan"
Дата:
On 2/8/21, 9:19 PM, "Michael Paquier" <michael@paquier.xyz> wrote: > On Mon, Feb 08, 2021 at 06:59:45PM +0000, Bossart, Nathan wrote: >> These suggestions seem reasonable to me. I've applied them in v9. > > Sounds good to me, so applied. Thanks! Nathan