Re: asynchronous and vectorized execution

From: Robert Haas
Subject: Re: asynchronous and vectorized execution
Msg-id: CA+TgmobwSEndyr669qpyN_u4XhkPo_2C1BAs2ydb22T4niv_aQ@mail.gmail.com
In reply to: Re: asynchronous and vectorized execution (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Tue, May 10, 2016 at 8:23 PM, Andres Freund <andres@anarazel.de> wrote:
>> c. Modify some nodes (perhaps start with nodeAgg.c) to allow them to
>> process a batch TupleTableSlot. This will require some tight loop to
>> aggregate the entire TupleTableSlot at once before returning.
>> d. Add function in execAmi.c which returns true or false depending on
>> if the node supports batch TupleTableSlots or not.
>> e. At executor startup determine if the entire plan tree supports
>> batch TupleTableSlots, if so enable batch scan mode.
>
> It doesn't really need to be the entire tree. Even if you have a subtree
> (say a parametrized index nested loop join) which doesn't support batch
> mode, you'll likely still see performance benefits by building a batch
> one layer above the non-batch-supporting node.

+1.
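
To make (d) and (e) a bit more concrete -- and taking the point that the
decision can be made per subtree rather than for the whole plan -- I'm
imagining something roughly like the sketch below.  None of this exists
today; ExecSupportsBatchSlots() and ps_batch_mode are made-up names.

static bool
ExecSupportsBatchSlots(Plan *plan)
{
    if (plan == NULL)
        return true;

    switch (nodeTag(plan))
    {
            /* nodes that have been taught to produce/consume batches */
        case T_SeqScan:
        case T_Agg:
            return ExecSupportsBatchSlots(plan->lefttree) &&
                ExecSupportsBatchSlots(plan->righttree);

        default:
            /* everything else stays one-tuple-at-a-time */
            return false;
    }
}

Then at executor startup each node could do something like

    planstate->ps_batch_mode = ExecSupportsBatchSlots(planstate->plan);

so that a batch-capable subtree still runs in batch mode even when some
ancestor -- say a parametrized index nestloop -- can't.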

I've also wondered about building a new executor node that is sort of
a combination of Nested Loop and Hash Join, but capable of performing
multiple joins in a single operation. (Merge Join is different,
because it's actually matching up the two sides, not just doing
probing once per outer tuple.) So the plan tree would look something
like this:

Multiway Join
-> Seq Scan on driving_table
-> Index Scan on something
-> Index Scan on something_else
-> Hash
   -> Seq Scan on other_thing
-> Hash
   -> Seq Scan on other_thing_2
-> Index Scan on another_one

With the current structure, every level of the plan tree has its own
TupleTableSlot and we have to project into each new slot.  Every level
has to go through ExecProcNode.  So it seems to me that this sort of
structure might save quite a few cycles on deep join nests.  I haven't
tried it, though.
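
Very roughly, I'd expect the guts of such a node to be one loop over the
join steps instead of a stack of ExecProcNode() calls.  Sketch only --
the MultiwayJoin types and the probe callback are invented, and I'm
hand-waving over ExecProject()'s exact signature:

typedef struct MultiwayJoinStep
{
    /* probes a hash table or an index, depending on the step */
    TupleTableSlot *(*probe) (struct MultiwayJoinStep *step,
                              TupleTableSlot *driving);
    TupleTableSlot *match;
} MultiwayJoinStep;

typedef struct MultiwayJoinState
{
    PlanState   ps;             /* standard executor node header */
    int         nsteps;
    MultiwayJoinStep *steps;
} MultiwayJoinState;

static TupleTableSlot *
MultiwayJoinOne(MultiwayJoinState *mjstate, TupleTableSlot *driving)
{
    int     i;

    for (i = 0; i < mjstate->nsteps; i++)
    {
        MultiwayJoinStep *step = &mjstate->steps[i];

        step->match = step->probe(step, driving);

        /* semi-join style: no match means no output for this tuple */
        if (step->match == NULL)
            return NULL;
    }

    /* project once at the top, not once per join level */
    return ExecProject(mjstate->ps.ps_ProjInfo);
}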

Batching would make this sort of node even more attractive.
Assuming the joins are all basically semi-joins, either because they
were written that way or because they are probing unique indexes or
whatever, you can fetch a batch of tuples from the driving table, do
the first join for each tuple to create a matching batch of tuples,
and repeat for each join step.  Then at the end you project.
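
In batched form the loop inverts: instead of pushing one driving tuple
through every step, you push the whole batch through one step at a time,
so each probe runs in a tight loop.  Same caveat as above -- all names
are invented, and I'm ignoring the inner-side columns you'd need to
carry along for the final projection:

static int
MultiwayJoinBatch(MultiwayJoinState *mjstate,
                  TupleTableSlot **batch, int ntuples)
{
    int     i;

    for (i = 0; i < mjstate->nsteps; i++)
    {
        MultiwayJoinStep *step = &mjstate->steps[i];
        int     nkept = 0;
        int     j;

        /* tight loop: probe this one step for the whole batch */
        for (j = 0; j < ntuples; j++)
        {
            if (step->probe(step, batch[j]) != NULL)
                batch[nkept++] = batch[j];      /* survivor */
        }

        ntuples = nkept;
        if (ntuples == 0)
            break;              /* whole batch eliminated early */
    }

    /* caller projects the survivors once, at the very end */
    return ntuples;
}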

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


