Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

Поиск

Список

Период

Сортировка

От	Rushabh Lathia
Тема	Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Дата	19 сентября 2017 г. 13:21:29
Msg-id	CAGPqQf3a79OOVX=N3eqY37aqVVw3kQdKs+RQxv8D1ga8=XM2mg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы	Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Список	pgsql-hackers

Дерево обсуждения

On Wed, Mar 22, 2017 at 3:19 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Wed, Mar 22, 2017 at 10:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 3:50 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> I disagree with that. It is a
>> trade-off, I suppose. I have now run out of time to work through it
>> with you or Thomas, though.
>
> Bummer.

I'm going to experiment with refactoring the v10 parallel CREATE INDEX
patch to use the SharedBufFileSet interface from
hj-shared-buf-file-v8.patch today and see what problems I run into.

As per the earlier discussion in the thread, I did experiment using
BufFileSet interface from parallel-hash-v18.patchset. I took the reference
of parallel-hash other patches to understand the BufFileSet APIs, and
incorporate the changes to parallel create index.

In order to achieve the same:

- Applied 0007-Remove-BufFile-s-isTemp-flag.patch and
0008-Add-BufFileSet-for-sharing-temporary-files-between-b.patch from the
parallel-hash-v18.patchset.
- Removed the buffile.c/logtap.c/fd.c changes from the parallel CREATE
INDEX v10 patch.
- incorporate the BufFileSet API to the parallel tuple sort for CREATE INDEX.
- Changes into few existing functions as well as added few to support the
BufFileSet changes.

To check the performance, I used the similar test which Peter posted in
his earlier thread. which is:

Machine: power2 machine with 512GB of RAM

Setup:

CREATE TABLE parallel_sort_test AS
    SELECT hashint8(i) randint,
    md5(i::text) collate "C" padding1,
    md5(i::text || '2') collate "C" padding2
    FROM generate_series(0, 1e9::bigint) i;

vacuum ANALYZE parallel_sort_test;

postgres=# show max_parallel_workers_per_gather;
max_parallel_workers_per_gather
---------------------------------
8
(1 row)

postgres=# show maintenance_work_mem;
maintenance_work_mem
----------------------
8GB
(1 row)

postgres=# show max_wal_size ;
max_wal_size
--------------
4GB
(1 row)

CREATE INDEX serial_idx ON parallel_sort_test (randint);

Without patch:

Time: 3430054.220 ms (57:10.054)

With patch (max_parallel_workers_maintenance = 8):

Time: 1163445.271 ms (19:23.445)

Thanks to my colleague Thomas Munro for his help and off-line discussion
for the patch.

Attaching v11 patch and trace_sort output for the test.

Thanks,

Rushabh Lathia

www.EnterpriseDB.com

Вложения

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

Вложения