Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

Поиск
Список
Период
Сортировка
От Alexey Kondratov
Тема Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Дата
Msg-id bac6739d47f053fe04fbd8cf05283137@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Ответы Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly  (Alexey Kondratov <a.kondratov@postgrespro.ru>)
Список pgsql-hackers
On 2021-01-21 04:41, Michael Paquier wrote:
> On Wed, Jan 20, 2021 at 03:34:39PM -0300, Alvaro Herrera wrote:
>> On 2021-Jan-20, Alexey Kondratov wrote:
>>> Ugh, forgot to attach the patches. Here they are.
>> 
>> Yeah, looks reasonable.
> 
>>> +
>>> +    if (changed)
>>> +        /* Record dependency on tablespace */
>>> +        changeDependencyOnTablespace(RelationRelationId,
>>> +                                     reloid, rd_rel->reltablespace);
>> 
>> Why have a separate "if (changed)" block here instead of merging with
>> the above?
> 
> Yep.
> 

Sure, this is a refactoring artifact.

> +       if (SetRelTablespace(reloid, newTableSpace))
> +               /* Make sure the reltablespace change is visible */
> +               CommandCounterIncrement();
> At quick glance, I am wondering why you just don't do a CCI within
> SetRelTablespace().
> 

I did it that way for a better readability at first, since it looks more 
natural when you do some change (SetRelTablespace) and then make them 
visible with CCI. Second argument was that in the case of 
reindex_index() we have to also call RelationAssumeNewRelfilenode() and 
RelationDropStorage() before doing CCI and making the new tablespace 
visible. And this part is critical, I guess.

> 
> +      This specifies that indexes will be rebuilt on a new tablespace.
> +      Cannot be used with "mapped" relations. If 
> <literal>SCHEMA</literal>,
> +      <literal>DATABASE</literal> or <literal>SYSTEM</literal> is
> specified, then
> +      all unsuitable relations will be skipped and a single
> <literal>WARNING</literal>
> +      will be generated.
> What is an unsuitable relation?  How can the end user know that?
> 

This was referring to mapped relations mentioned in the previous 
sentence. I have tried to rewrite this part and make it more specific in 
my current version. Also added Justin's changes to the docs and comment.

> This is missing ACL checks when moving the index into a new location,
> so this requires some pg_tablespace_aclcheck() calls, and the other
> patches share the same issue.
> 

I added proper pg_tablespace_aclcheck()'s into the reindex_index() and 
ReindexPartitions().

> +       else if (partkind == RELKIND_PARTITIONED_TABLE)
> +       {
> +           Relation rel = table_open(partoid, ShareLock);
> +           List    *indexIds = RelationGetIndexList(rel);
> +           ListCell *lc;
> +
> +           table_close(rel, NoLock);
> +           foreach (lc, indexIds)
> +           {
> +               Oid indexid = lfirst_oid(lc);
> +               (void) set_rel_tablespace(indexid, 
> params->tablespaceOid);
> +           }
> +       }
> This is really a good question.  ReindexPartitions() would trigger one
> transaction per leaf to work on.  Changing the tablespace of the
> partitioned table(s) before doing any work has the advantage to tell
> any new partition to use the new tablespace.  Now, I see a struggling
> point here: what should we do if the processing fails in the middle of
> the move, leaving a portion of the leaves in the previous tablespace?
> On a follow-up reindex with the same command, should the command force
> a reindex even on the partitions that have been moved?  Or could there
> be a point in skipping the partitions that are already on the new
> tablespace and only process the ones on the previous tablespace?  It
> seems to me that the first scenario makes the most sense as currently
> a REINDEX works on all the relations defined, though there could be
> use cases for the second case.  This should be documented, I think.
> 

I agree that follow-up REINDEX should also reindex moved partitions, 
since REINDEX (TABLESPACE ...) is still reindex at first. I will try to 
put something about this part into the docs. Also I think that we cannot 
be sure that nothing happened with already reindexed partitions between 
two consequent REINDEX calls.

> There are no tests for partitioned tables, aka we'd want to make sure
> that the new partitioned index is on the correct tablespace, as well
> as all its leaves.  It may be better to have at least two levels of
> partitioned tables, as well as a partitioned table with no leaves in
> the cases dealt with.
> 

Yes, sure, it makes sense.

> +        *
> +        * Even if a table's indexes were moved to a new tablespace, 
> the index
> +        * on its toast table is not normally moved.
>          */
> Still, REINDEX (TABLESPACE) TABLE should move all of them to be
> consistent with ALTER TABLE SET TABLESPACE, but that's not the case
> with this code, no?  This requires proper test coverage, but there is
> nothing of the kind in this patch.

You are right, we do not move TOAST indexes now, since 
IsSystemRelation() is true for TOAST indexes, so I thought that we 
should not allow moving them without allow_system_table_mods=true. Now I 
wonder why ALTER TABLE does that.

I am going to attach the new version of patch set today or tomorrow.


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: patch: reduce overhead of execution of CALL statement in no atomic mode from PL/pgSQL