RE: Parallel INSERT (INTO ... SELECT ...)

From: tsunakawa.takay@fujitsu.com
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Date:
Msg-id: OSBPR01MB298274BCB8959932191EA9A6FECB0@OSBPR01MB2982.jpnprd01.prod.outlook.com
In response to: Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Responses: Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
List: pgsql-hackers
From: Greg Nancarrow <gregn4422@gmail.com>
> Firstly, in order to perform parallel-safety checks in the case of partitions, the
> patch currently recursively locks/unlocks
> (AccessShareLock) each partition during such checks (as each partition may
> itself be a partitioned table). Is there a better way of performing the
> parallel-safety checks and reducing the locking requirements?

First of all, as you demonstrated the planning time and execution time of parallel insert, I think the increased
planning time is negligible when parallel insert is intentionally used for loading a large amount of data.  However,
it's a problem if the overhead is imposed on OLTP transactions.  Does the overhead occur with the default values of
max_parallel_workers_per_gather = 2 and max_parallel_workers = 8?
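For reference, those are the stock defaults, and they can be checked in a session like this (a sketch; the table name is hypothetical):

```sql
-- Confirm the parallel-query settings in effect (defaults: 2 and 8)
SHOW max_parallel_workers_per_gather;
SHOW max_parallel_workers;

-- To isolate the planning-time safety-check overhead from worker
-- startup cost, parallelism can be ruled out for one session:
SET max_parallel_workers_per_gather = 0;
EXPLAIN INSERT INTO target_table SELECT * FROM source_table;  -- hypothetical tables
```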

To avoid this heavy checking during planning, I'm wondering if we can have an attribute in pg_class, something like
relhasindex and relhastriggers.  The concerning point is that we have to maintain the accuracy of the value when
dropping ancillary objects around the table/partition.
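To illustrate the idea, the existing per-relation flags of that kind can be seen in the catalog; the proposal would add a similar cached boolean for parallel-insert safety (the column and table names below are hypothetical, shown only as a sketch of the pattern):

```sql
-- Existing pg_class flags that are kept up to date as ancillary
-- objects (indexes, triggers) are created and dropped
SELECT relname, relhasindex, relhastriggers
FROM pg_class
WHERE relname = 'my_partitioned_table';  -- hypothetical table name

-- The proposal: an analogous cached flag, maintained with the same
-- invalidation discipline (hypothetical column, not an existing one)
-- SELECT relname, relparallelinsertsafe FROM pg_class ...
```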


> Secondly, I found that when running "make check-world", the
> "partition-concurrent-attach" test fails, because it is expecting a partition
> constraint to be violated on insert, while an "alter table attach partition ..." is
> concurrently being executed in another transaction. Because of the partition
> locking done by the patch's parallel-safety checking code, the insert blocks on
> the exclusive lock held by the "alter table" in the other transaction until the
> transaction ends, so the insert ends up successfully completing (and thus fails
> the test) when the other transaction ends. To overcome this test failure, the
> patch code was updated to instead perform a conditional lock on the partition,
> and on failure (i.e. because of an exclusive lock held somewhere else), just
> assume it's parallel-unsafe because the parallel-safety can't be determined
> without blocking on the lock. This is not ideal, but I'm not sure of what other
> approach could be used and I am somewhat reluctant to change that test. If
> anybody is familiar with the "partition-concurrent-attach" test, any ideas or
> insights would be appreciated.

That test looks sane.  I think what we should do is to disable parallel operation during that test.  It looks like some
other existing test cases disable parallel query by setting max_parallel_workers_per_gather to 0.  It's not strange
that some tests fail with some configurations.  autovacuum is disabled in many places of the regression test.
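Following that existing convention, the fix could be a one-line addition to the test's setup (a sketch, not the actual contents of the partition-concurrent-attach spec):

```sql
-- In the test's setup step, force serial planning so the
-- parallel-safety lock acquisition never happens in this test,
-- and the expected constraint violation is observed as before
SET max_parallel_workers_per_gather = 0;
```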

Rather, I don't think we should introduce the trick of using ConditionalLockAcquire().  Otherwise, the insert would be
executed in a serial fashion without the user knowing it -- "What?  The insert suddenly slowed down multiple times
today, and it didn't finish within the planned maintenance window.  What's wrong?"


Regards
Takayuki Tsunakawa

