Re: Parallel copy

Поиск

Список

Период

Сортировка

От	Bharath Rupireddy
Тема	Re: Parallel copy
Дата	20 октября 2020 г. 12:55:14
Msg-id	CALj2ACWrQz-=PWc0e5QOwetVNoBOaOTKvTWyz4=2y0=NVOcOcg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Parallel copy (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Fri, Oct 9, 2020 at 2:52 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Sep 29, 2020 at 6:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 2. Do we have tests for toast tables? I think if you implement the
> > previous point some existing tests might cover it but I feel we should
> > have at least one or two tests for the same.
> >
> Toast table use case 1: 10000 tuples, 9.6GB data, 3 indexes 2 on integer columns, 1 on text column(not the toast column), csv file, each row is > 1320KB:
> (222.767, 0, 1X), (134.171, 1, 1.66X), (93.749, 2, 2.38X), (93.672, 4, 2.38X), (94.827, 8, 2.35X), (93.766, 16, 2.37X), (98.153, 20, 2.27X), (122.721, 30, 1.81X)
>
> Toast table use case 2: 100000 tuples, 96GB data, 3 indexes 2 on integer columns, 1 on text column(not the toast column), csv file, each row is > 1320KB:
> (2255.032, 0, 1X), (1358.628, 1, 1.66X), (901.170, 2, 2.5X), (912.743, 4, 2.47X), (988.718, 8, 2.28X), (938.000, 16, 2.4X), (997.556, 20, 2.26X), (1000.586, 30, 2.25X)
>
> Toast table use case3: 10000 tuples, 9.6GB, no indexes, binary file, each row is > 1320KB:
> (136.983, 0, 1X), (136.418, 1, 1X), (81.896, 2, 1.66X), (62.929, 4, 2.16X), (52.311, 8, 2.6X), (40.032, 16, 3.49X), (44.097, 20, 3.09X), (62.310, 30, 2.18X)
>
> In the case of a Toast table, we could achieve upto 2.5X for csv files, and 3.5X for binary files. We are analyzing this point and will post an update on our findings soon.
>

I analyzed the above point of getting only upto 2.5X performance improvement for csv files with a toast table with 3 indexers - 2 on integer columns and 1 on text column(not the toast column). Reason is that workers are fast enough to do the work and they are waiting for the leader to fill in the data blocks and in this case the leader is able to serve the workers at its maximum possible speed. Hence most of the time the workers are waiting not doing any beneficial work.

Having observed the above point, I tried to make workers perform more work to avoid waiting time. For this, I added a gist index on the toasted text column. The use and results are as follows.

Toast table use case4: 10000 tuples, 9.6GB, 4 indexes - 2 on integer columns, 1 on non-toasted text column and 1 gist index on toasted text column, csv file, each row is ~ 12.2KB:

(1322.839, 0, 1X), (1261.176, 1, 1.05X), (632.296, 2, 2.09X), (321.941, 4, 4.11X), (181.796, 8, 7.27X), (105.750, 16, 12.51X), (107.099, 20, 12.35X), (123.262, 30, 10.73X)

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее

От: John Naylor
Дата: 20 октября 2020 г., 12:33:43
Сообщение: Re: speed up unicode normalization quick check

Следующее

От: Peter Eisentraut
Дата: 20 октября 2020 г., 12:57:25
Сообщение: Re: dynamic result sets support in extended query protocol

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Parallel copy

Предыдущее

Следующее