Re: [HACKERS] parallelize queries containing subplans

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] parallelize queries containing subplans
Дата
Msg-id CAA4eK1LQA-avGD=H3nNfVC8nMahozgRzEHb3KPpeppCTQ-gayg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] parallelize queries containing subplans  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] parallelize queries containing subplans  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Tue, Jan 3, 2017 at 4:19 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Dec 28, 2016 at 11:47 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Now, we can further extend this to parallelize queries containing
>> correlated subplans like below:
>>
>> explain select * from t1 where t1.i in (select t2.i from t2 where t2.i=t1.i);
>>                          QUERY PLAN
>> -------------------------------------------------------------
>>  Seq Scan on t1  (cost=0.00..831049.09 rows=8395 width=12)
>>    Filter: (SubPlan 1)
>>    SubPlan 1
>>      ->  Seq Scan on t2  (cost=0.00..97.73 rows=493 width=4)
>>            Filter: (i = t1.i)
>> (5 rows)
>>
>> In the above query, Filter on t2 (i = t1.i) generates Param node which
>> is a parallel-restricted node, so such queries won't be able to use
>> parallelism even with the patch.  I think we can mark Params which
>> refer to same level as parallel-safe and I think we have this
>> information (node-> varlevelsup/ phlevelsup/ agglevelsup) available
>> when we replace correlation vars (SS_replace_correlation_vars).
>>
>
> I have implemented the above idea which will allow same or immediate
> outer level PARAMS as parallel_safe.  The results of above query after
> patch:
>
> postgres=# explain select * from t1 where t1.i in (select t2.i from t2
> where t2.i=t1.i);
>                                 QUERY PLAN
> --------------------------------------------------------------------------
>  Gather  (cost=0.00..488889.88 rows=8395 width=12)
>    Workers Planned: 1
>    ->  Parallel Seq Scan on t1  (cost=0.00..488889.88 rows=4938 width=12)
>          Filter: (SubPlan 1)
>          SubPlan 1
>            ->  Seq Scan on t2  (cost=0.00..97.73 rows=493 width=4)
>                  Filter: (i = t1.i)
> (7 rows)
>

On further evaluation, it seems this patch has one big problem which
is that it will allow forming parallel plans which can't be supported
with current infrastructure.  For ex. marking immediate level params
as parallel safe can generate below type of plan:

Seq Scan on t1  Filter: (SubPlan 1)  SubPlan 1    ->  Gather          Workers Planned: 1          ->  Result
   One-Time Filter: (t1.k = 0)                ->  Parallel Seq Scan on t2
 


In this plan, we can't evaluate one-time filter (that contains
correlated param) unless we have the capability to pass all kind of
PARAM_EXEC param to workers.   I don't want to invest too much time in
this patch unless somebody can see some way using current parallel
infrastructure to implement correlated subplans.

Note that still, the other patch [1] in this thread which implements
parallelism for uncorrelated subplan holds good.


[1] - https://www.postgresql.org/message-id/CAA4eK1J9mDZLcp-OskkdzAf_yT8W4dBSGL9E%3DkoEiJkdpVZsEA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Seki, Eiji"
Дата:
Сообщение: Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitraryvacuum flags
Следующее
От: Corey Huinker
Дата:
Сообщение: Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands:\quit_if, \quit_unless)