Re: Parallel INSERT (INTO ... SELECT ...)

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel INSERT (INTO ... SELECT ...)
Дата
Msg-id CAA4eK1LYM-h33KYv3JrGX+1OnxJ1Hxo-_NXpjYidO-bduOjzpQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Ответы Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Список pgsql-hackers
On Thu, Oct 22, 2020 at 9:47 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Oct 16, 2020 at 9:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> Posting an update to the smaller patch (Parallel SELECT for INSERT
> INTO...SELECT...).
>
> Most of this patch feeds into the larger Parallel INSERT patch, for
> which I'll also be posting an update soon.
>
> Patch updates include:
> - Removed explicit trigger-type checks (instead rely on declared
> trigger parallel safety)
> - Restored parallel-related XID checks that previous patch altered;
> now assign XID prior to entering parallel-mode
> - Now considers parallel-SELECT for parallel RESTRICTED cases (not
> just parallel SAFE cases)
> - Added parallel-safety checks for partition key expressions and
> support functions
> - Workaround added for test failure in "partition-concurrent-attach"
> test;
>

IIUC, below is code for this workaround:

+MaxRelParallelHazardForModify(Oid relid,
+ CmdType commandType,
+ max_parallel_hazard_context *context)
+{
+ Relation        rel;
+ TupleDesc tupdesc;
+ int attnum;
+
+ LOCKMODE lockmode = AccessShareLock;
+
+ /*
+ * It's possible that this relation is locked for exclusive access
+ * in another concurrent transaction (e.g. as a result of a
+ * ALTER TABLE ... operation) until that transaction completes.
+ * If a share-lock can't be acquired on it now, we have to assume this
+ * could be the worst-case, so to avoid blocking here until that
+ * transaction completes, conditionally try to acquire the lock and
+ * assume and return UNSAFE on failure.
+ */
+ if (ConditionalLockRelationOid(relid, lockmode))
+ {
+ rel = table_open(relid, NoLock);
+ }
+ else
+ {
+ context->max_hazard = PROPARALLEL_UNSAFE;
+ return context->max_hazard;
+ }

Do we need this workaround if we lock just the parent table instead of
locking all the tables? Basically, can we safely identify the
parallel-safety of partitioned relation if we just have a lock on
parent relation? One more thing I have noticed is that for scan
relations (Select query), we do such checks much later based on
RelOptInfo (see set_rel_consider_parallel) which seems to have most of
the information required to perform parallel-safety checks but I guess
for ModifyTable (aka the Insert table) the equivalent doesn't seem
feasible but have you thought of doing at the later stage in planner?

Few other comments on latest patch:
===============================
1.
MaxRelParallelHazardForModify()
{
..
+ if (commandType == CMD_INSERT || commandType == CMD_UPDATE)
+ {
+ /*
..

Why to check CMD_UPDATE here?

2.
+void PrepareParallelModeForModify(CmdType commandType, bool
isParallelModifyLeader)
+{
+ Assert(!IsInParallelMode());
+
+ if (isParallelModifyLeader)
+ (void)GetCurrentCommandId(true);
+
+ (void)GetCurrentFullTransactionId();

Here, we should use GetCurrentTransactionId() similar to heap_insert
or other heap operations. I am not sure why you have used
GetCurrentFullTransactionId?

3. Can we have a test to show why we need to check all the partitions
for parallel-safety? I think it would be possible when there is a
trigger on only one of the partitions and that trigger has
corresponding parallel_unsafe function. But it is good to verify that
once.

4. Have you checked the overhead of this on the planner for different
kinds of statements like inserts into tables having 100 or 500
partitions? Similarly, it is good to check the overhead of domain
related checks added in the patch.

5. Can we have a separate patch for parallel-selects for Insert? It
will make review easier.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: Re: parallel distinct union and aggregate support patch
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: pg_upgrade analyze script