Re: Parallel copy


From	Ashutosh Sharma
Subject	Re: Parallel copy
Date	23 July 2020, 07:51:12
Msg-id	CAE9k0PkY1cT2Ax9B4TrYHCPw_YNibWJQ0wBNiPDTXpQ0_aXS0Q@mail.gmail.com
In reply to	Re: Parallel copy (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List	pgsql-hackers


I think that when doing performance testing for a partitioned table, it would be good to also mention the distribution of data in the input file. One possible distribution: the input file has, say, 100 tuples, and every consecutive tuple belongs to a different partition.
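To make that distribution concrete, here is a minimal sketch of a generator for such an input file. Everything in it is an assumption for illustration: the two-column layout (an integer key plus a text payload), the partition width of 1000 keys, and the helper names round_robin_rows and write_csv. None of this comes from the patch or from the tests already run.

```python
import csv
import io


def round_robin_rows(num_rows, num_partitions, partition_width):
    """Yield (key, payload) rows whose keys cycle through the partitions.

    Assumes partition p covers the integer key range
    [p * partition_width, (p + 1) * partition_width).
    """
    if num_rows > num_partitions * partition_width:
        raise ValueError("not enough distinct keys for the requested rows")
    for i in range(num_rows):
        partition = i % num_partitions   # next tuple goes to the next partition
        offset = i // num_partitions     # position inside that partition's range
        yield partition * partition_width + offset, f"row-{i}"


def write_csv(rows, out):
    """Write the rows to a file-like object in CSV format."""
    writer = csv.writer(out)
    for row in rows:
        writer.writerow(row)


# Hypothetical layout: 100 tuples, 4 range partitions, 1000 keys per partition.
buf = io.StringIO()
write_csv(round_robin_rows(100, 4, 1000), buf)
```

Sorting the same rows by key before writing would give the opposite extreme, where long runs of consecutive tuples go to the same partition, so the two files could be compared.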

On Thu, Jul 23, 2020 at 8:51 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Jul 22, 2020 at 7:56 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for reviewing and providing the comments Ashutosh.
> Please find my thoughts below:
>
> On Fri, Jul 17, 2020 at 7:18 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Some review comments (mostly) from the leader side code changes:
> >
> > 3) Should we allow Parallel Copy when the insert method is CIM_MULTI_CONDITIONAL?
> >
> > + /* Check if the insertion mode is single. */
> > + if (FindInsertMethod(cstate) == CIM_SINGLE)
> > + return false;
> >
> > I know we have added checks in CopyFrom() to ensure that if any trigger (before row or instead of) is found on any of the partitions being loaded with data, then the COPY FROM operation would fail, but does that mean we are okay to perform parallel copy on a partitioned table? Have we done some performance testing with a partitioned table where the data in the input file needs to be routed to different partitions?
> >
>
> Partition data is handled as Amit described in one of the earlier mails [1]. My colleague Bharath has run performance tests with a partitioned table; he will be sharing the results.
>

I ran tests for partitioned use cases. The results are similar to those of the non-partitioned cases [1].

Test case 1: copy from csv file, 5.1GB, 10 million tuples, 4 range partitions, 3 indexes on integer columns, unique data
Test case 2: copy from csv file, 5.1GB, 10 million tuples, 4 range partitions, unique data

Exec time in sec (speedup relative to 0 workers in parentheses):

parallel workers | test case 1      | test case 2
0                | 205.403 (1X)     | 135 (1X)
2                | 114.724 (1.79X)  | 59.388 (2.27X)
4                | 99.017 (2.07X)   | 56.742 (2.34X)
8                | 99.722 (2.06X)   | 66.323 (2.03X)
16               | 98.147 (2.09X)   | 66.054 (2.04X)
20               | 97.723 (2.1X)    | 66.389 (2.03X)
30               | 97.048 (2.11X)   | 70.568 (1.91X)
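
For anyone reading the figures in parentheses: they appear to be the 0-worker execution time divided by the execution time at the given worker count. A small sketch of that arithmetic, using the 0-worker and 2-worker timings of test case 1 (the function name speedup is only for illustration):

```python
def speedup(baseline_sec, parallel_sec):
    """Return how many times faster the parallel run is than the baseline."""
    return baseline_sec / parallel_sec


# Timings taken from test case 1: 0 workers vs. 2 workers.
print(f"{speedup(205.403, 114.724):.2f}X")
```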

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


