Re: [HACKERS] parallelize queries containing initplans

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] parallelize queries containing initplans
Дата
Msg-id CAA4eK1KThoT0choY6QYaNyP+oCRS27MRA+0r8rQt69Oauz_OtQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] parallelize queries containing initplans  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] parallelize queries containing initplans  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [HACKERS] parallelize queries containing initplans  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Wed, Oct 4, 2017 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 3, 2017 at 7:33 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Having said all that, I think that this patch only wants to handle the
> subset of cases (2) and (4) where the relevant InitPlan is attached
> ABOVE the Gather node -- which seems very reasonable, because
> evaluating an InitPlan at a level of the plan tree above the level at
> which it is defined sounds like it might be complex.  But I still
> don't quite see why we need these tests.  I mean, if we only allow
> Param references from a set of safe parameter IDs, and the safe
> parameter IDs include only those IDs that can be generated in a
> worker, then won't other InitPlans in the workers anyway be ruled out?

It doesn't happen always.  There are cases when the part of required
conditions are pushed one query level below, so when we check during
max_parallel_hazard_walker, they look safe, but actually, they are
not.  I will try to explain by example:

postgres=# explain (costs off, verbose) select * from t1 where t1.i in
( select 1 + (select max(j) from t3));                             QUERY PLAN
----------------------------------------------------------------------Hash Semi Join  Output: t1.i, t1.j, t1.k  Hash
Cond:(t1.i = ((1 + $1)))  ->  Seq Scan on public.t1        Output: t1.i, t1.j, t1.k  ->  Hash        Output: ((1 + $1))
      ->  Result              Output: (1 + $1)              InitPlan 1 (returns $1)                ->  Finalize
Aggregate                     Output: max(t3.j)                      ->  Gather                            Output:
(PARTIALmax(t3.j))                            Workers Planned: 2                            ->  Partial Aggregate
                          Output: PARTIAL max(t3.j)                                  ->  Parallel Seq Scan on public.t3
                                      Output: t3.j
 
(19 rows)

In the above example, you can see that the condition referring to
initplan (1 + $1) is pushed one level below.  So when it tries to
check parallel safety for the join condition, it won't see Param node.
Now, consider if we don't check contains_parallel_unsafe_param during
generate_gather_paths, then it will lead to below plan.


postgres=# explain (costs off, verbose) select * from t1 where t1.i in
( select 1 + (select max(j) from t3));                                QUERY PLAN
----------------------------------------------------------------------------Gather  Output: t1.i, t1.j, t1.k  Workers
Planned:2  ->  Hash Semi Join        Output: t1.i, t1.j, t1.k        Hash Cond: (t1.i = ((1 + $1)))        ->  Parallel
SeqScan on public.t1              Output: t1.i, t1.j, t1.k        ->  Hash              Output: ((1 + $1))
-> Result                    Output: (1 + $1)                    InitPlan 1 (returns $1)                      ->
FinalizeAggregate                            Output: max(t3.j)                            ->  Gather
             Output: (PARTIAL max(t3.j))                                  Workers Planned: 2
     ->  Partial Aggregate                                        Output: PARTIAL max(t3.j)
          ->  Parallel Seq Scan on public.t3                                              Output: t3.j
 
(22 rows)

This is wrong because when we will try to evaluate params that are
required at gather node, we won't get the required param as there is
no initplan at that level.

>
> If I am all mixed up, please help straighten me out.
>

I think whatever you said is right and very clear.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Laurenz Albe
Дата:
Сообщение: Re: [HACKERS] list of credits for release notes
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: [HACKERS] parallelize queries containing initplans