Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

From Amit Kapila
Subject Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)
Date
Msg-id CAA4eK1Ky2=HsTsT4hmfL=EAL5rv0_t59tvWzVK9HQKvN6Dovkw@mail.gmail.com
In reply to Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, May 6, 2016 at 8:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andreas Seltenreich <seltenreich@gmx.de> writes:
> > when fuzz testing master as of c1543a8, parallel workers trigger the
> > following assertion in ExecInitSubPlan every couple hours.
> >     TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", Line: 390)
> > Sample backtraces of a worker and leader below, plan of leader attached.
> > The collected queries don't seem to reproduce it.
>
> Odd.  My understanding of the restrictions on parallel query is that
> anything involving a SubPlan ought not be parallelized;
>

Subplan references are considered parallel-restricted, so a parallel plan can still be generated for a query that contains subplans, but the subplans must not be pushed down to the workers.  I tried a somewhat simpler example to see whether we push anything parallel-restricted down to a worker in the case of joins, and it turns out there are cases where that can happen.  Consider the example below:

create or replace function parallel_func_select() returns integer
as $$
declare
    ret_val int;
begin
     ret_val := 1000;
     return ret_val;
end;
$$ language plpgsql Parallel Restricted;

CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 10000000) g;

CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM
generate_series(1, 1000000) g;

Explain Verbose SELECT t1.c1 + parallel_func_select(), t2.c1 FROM t1 JOIN t2 ON t1.c1 = t2.c1;

                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Gather  (cost=32813.00..537284.53 rows=1000000 width=8)
   Output: ((t1.c1 + parallel_func_select())), t2.c1
   Workers Planned: 2
   ->  Hash Join  (cost=31813.00..436284.53 rows=1000000 width=8)
         Output: (t1.c1 + parallel_func_select()), t2.c1
         Hash Cond: (t1.c1 = t2.c1)
         ->  Parallel Seq Scan on public.t1  (cost=0.00..95721.08 rows=4166608 width=4)
               Output: t1.c1, t1.c2
         ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=4)
               Output: t2.c1
               ->  Seq Scan on public.t2  (cost=0.00..15406.00 rows=1000000 width=4)
                     Output: t2.c1
(12 rows)


From the above output it is clear that the parallel-restricted function is pushed down below the Gather node.  I found that although we have carefully avoided pushing the pathtarget below GatherPath in apply_projection_to_path() when the pathtarget contains any parallel-unsafe or parallel-restricted clause, we separately also try to apply the pathtarget to the partial path list, which doesn't seem to be the correct way even if it is required.  I think this was added in the parallel aggregate patch, and it seems to me that it is no longer required after the changes related to GatherPath in apply_projection_to_path().
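To make the intended rule concrete, here is a toy C model of the decision described above.  It is only a sketch and not the planner code: the types and the names ParallelLevel, target_is_parallel_safe() and choose_projection_site() are made up for illustration, and the real check works on expression trees rather than on a flat array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an expression's parallel marking (hypothetical type). */
typedef enum { PAR_SAFE, PAR_RESTRICTED, PAR_UNSAFE } ParallelLevel;

/* Where the target list ends up being evaluated. */
typedef enum { PROJECT_BELOW_GATHER, PROJECT_AT_GATHER } ProjectionSite;

/* True only if every target-list entry is parallel safe. */
static bool
target_is_parallel_safe(const ParallelLevel *exprs, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (exprs[i] != PAR_SAFE)
            return false;   /* restricted or unsafe entry found */
    }
    return true;
}

/*
 * A target list may be evaluated in the workers (below Gather) only when
 * it is entirely parallel safe; otherwise it has to stay in the leader.
 */
static ProjectionSite
choose_projection_site(const ParallelLevel *exprs, size_t n)
{
    return target_is_parallel_safe(exprs, n)
        ? PROJECT_BELOW_GATHER
        : PROJECT_AT_GATHER;
}
```

For the query above, the target list would be modelled as { PAR_RESTRICTED, PAR_SAFE } (t1.c1 + parallel_func_select() and t2.c1), so the projection should stay at the Gather node rather than appear in the Hash Join's output.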

After applying the attached patch, parallel-restricted clauses are no longer added below the Gather path.

Now back to the original bug: if you look at the plan file attached to the original bug report, the subplan is pushed below the Gather node in the target list, though not to the immediate join but one level further down, to the SeqScan path.  I am still not sure how it managed to push the restricted clauses that far down.
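As an aid for spotting this kind of plan, the following toy C sketch walks a simplified plan tree and reports how many levels below a Gather node the first parallel-restricted target list appears.  The PlanNode struct and find_restricted_below_gather() are hypothetical and exist only for this illustration; they do not correspond to the executor's plan node types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified plan node (hypothetical; not the real Plan struct). */
typedef struct PlanNode
{
    const char *name;                /* label, e.g. "Hash Join" */
    bool        is_gather;           /* node is a Gather */
    bool        has_restricted_target; /* target list has a restricted expr */
    const struct PlanNode *left;
    const struct PlanNode *right;
} PlanNode;

/*
 * Pre-order walk.  "depth" is 0 while we are not under any Gather and
 * otherwise the number of levels below the nearest Gather.  Returns the
 * depth of the first offending node, or -1 if there is none.
 */
static int
find_restricted_below_gather(const PlanNode *node, int depth)
{
    int child_depth;
    int found;

    if (node == NULL)
        return -1;

    /* A restricted target list is only a problem below a Gather. */
    if (depth > 0 && node->has_restricted_target)
        return depth;

    if (node->is_gather)
        child_depth = 1;
    else
        child_depth = (depth > 0) ? depth + 1 : 0;

    found = find_restricted_below_gather(node->left, child_depth);
    if (found < 0)
        found = find_restricted_below_gather(node->right, child_depth);
    return found;
}
```

Called on the root with depth 0, a plan shaped like the one in this mail (restricted expression in the Hash Join output) would report 1, while a plan shaped like the one in the original report (restricted expression at the scan under the join) would report 2.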

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
