Re: [HACKERS] parallelize queries containing initplans

From Amit Kapila
Subject Re: [HACKERS] parallelize queries containing initplans
Date
Msg-id CAA4eK1+16-HBmY4N0feB1CmyG=R3rB0DiT2_h3SgdMypS4SG1A@mail.gmail.com
In reply to Re: [HACKERS] parallelize queries containing initplans  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] parallelize queries containing initplans  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Jan 31, 2017 at 4:16 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Dec 28, 2016 at 5:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>   The drawback of the second approach is
>> that we need to evaluate the initplan before it is actually required
>> which means that we might evaluate it even when it is not required.  I
>> am not sure if it is always safe to assume that we can evaluate the
>> initplan before pushing it to workers especially for the cases when it
>> is far enough down in the plan tree which we are parallelizing,
>>
>
> I think we can always pull up un-correlated initplans at Gather node,
> however, if there is a correlated initplan, then it is better not to
> allow such initplans for being pushed below gather.  Ex. of correlated
> initplans:
>
> postgres=# explain (costs off) select * from t1 where t1.i in (select
> t2.i from t2 where t1.k = (select max(k) from t3 where t3.i=t1.i));
>                   QUERY PLAN
> ----------------------------------------------
>  Seq Scan on t1
>    Filter: (SubPlan 2)
>    SubPlan 2
>      ->  Gather
>            Workers Planned: 1
>            Params Evaluated: $1
>            InitPlan 1 (returns $1)
>              ->  Aggregate
>                    ->  Seq Scan on t3
>                          Filter: (i = t1.i)
>            ->  Result
>                  One-Time Filter: (t1.k = $1)
>                  ->  Parallel Seq Scan on t2
> (13 rows)
>
> It might be safe to allow above plan, but in general, such plans
> should not be allowed, because it might not be feasible to compute
> such initplan references at Gather node.  I am still thinking on the
> best way to deal with such initplans.
>

I can see two ways to determine whether the plan for which we are
going to generate an initplan contains a reference to a correlated
var param node.  One is to write a plan or path walker that looks for
any such reference; the other is to keep the information about the
correlated param in the path node.  The drawback of the first approach
is that traversing the path tree during initplan generation can be
costly, so for now I have kept the information in the path node and
use it to prohibit generating parallel initplans that contain a
reference to correlated vars.  We can switch to the first approach,
using a path walker, if people feel that is better than maintaining a
reference in the path.  The attached patch
prohibit_parallel_correl_params_v1.patch implements the second
approach of keeping the correlated var param reference in the path
node, and pq_pushdown_initplan_v2.patch uses that to generate parallel
initplans.
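
To make the trade-off between the two approaches concrete, here is a
minimal standalone sketch.  It is not taken from either patch: ToyPath
and the toy_* functions are hypothetical names, and the struct is a
deliberately simplified stand-in for a path node, not PostgreSQL's
actual Path.  The walker re-traverses the subtree on every check,
while the flag is computed once when the node is built.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified path node; NOT PostgreSQL's real Path struct. */
typedef struct ToyPath
{
    bool        refs_correlated_param;  /* this node references an outer var */
    bool        subtree_has_correlated; /* cached: this node or any child does */
    struct ToyPath *left;
    struct ToyPath *right;
} ToyPath;

/*
 * Approach 1 (walker): traverse the whole subtree each time we ask.
 * Cost is proportional to the size of the subtree on every call.
 */
bool
toy_path_walker(const ToyPath *path)
{
    if (path == NULL)
        return false;
    if (path->refs_correlated_param)
        return true;
    return toy_path_walker(path->left) || toy_path_walker(path->right);
}

/*
 * Approach 2 (flag in node): propagate the information upward once,
 * at the time the node is constructed from already-built children.
 */
void
toy_path_init(ToyPath *path, bool refs_correlated,
              ToyPath *left, ToyPath *right)
{
    path->refs_correlated_param = refs_correlated;
    path->left = left;
    path->right = right;
    path->subtree_has_correlated =
        refs_correlated ||
        (left != NULL && left->subtree_has_correlated) ||
        (right != NULL && right->subtree_has_correlated);
}

/* With the cached flag, the check during initplan generation is O(1). */
bool
toy_initplan_parallel_ok(const ToyPath *path)
{
    return !path->subtree_has_correlated;
}
```

In this toy model both checks give the same answer; the difference is
only where the cost is paid.  The flag costs one extra field per node
and has to be maintained wherever nodes are created, which is the
maintenance burden mentioned above.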

Thoughts?

These patches build on top of the parallel subplan patch [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1KYQjQzQMpEz+QRA2fmim386gQLQBEf+p2Wmtqjh1rjwg@mail.gmail.com


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

