Re: Runtime pruning problem

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Runtime pruning problem
Дата
Msg-id CA+HiwqHwGnUuC6-4Lx+xS7Qr6hnqJnRHpapFwpSiDuY-s3Hhxg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Runtime pruning problem  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Wed, Jul 31, 2019 at 8:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > On Wed, 31 Jul 2019 at 10:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> The portion of this below the Append is fine, but I argue that
> >> the Vars above the Append should say "part", not "part_p1".
> >> In that way they'd look the same regardless of which partitions
> >> have been pruned or not.
>
> > That seems perfectly reasonable for Append / MergeAppend that are for
> > scanning partitioned tables. What do you propose we do for inheritance
> > and UNION ALLs?
>
> For inheritance, I don't believe there would be any change, precisely
> because we've historically used the parent rel as reference.

I may be missing something, but Vars above an Append/MergeAppend,
whether it's scanning a partitioned table or a regular inheritance
table, always refer to the first child subplan, which may or may not
be for the inheritance parent in its role as a child, not the Append
parent.

create table parent (a int);
alter table only parent add check (a = 1) no inherit;
create table child1 (a int check (a = 2)) inherits (parent);
create table child2 (a int check (a = 3)) inherits (parent);

explain (costs off, verbose) select * from parent where a > 1 order by 1;
              QUERY PLAN
───────────────────────────────────────
 Sort
   Output: child1.a
   Sort Key: child1.a
   ->  Append
         ->  Seq Scan on public.child1
               Output: child1.a
               Filter: (child1.a > 1)
         ->  Seq Scan on public.child2
               Output: child2.a
               Filter: (child2.a > 1)
(10 rows)

I think this is because we replace the original targetlist of such
nodes by a dummy one using set_dummy_tlist_references(), where all the
parent Vars are re-stamped with OUTER_VAR as varno.  When actually
printing the EXPLAIN VERBOSE output, ruleutils.c considers the first
child of Append as the OUTER referent, as set_deparse_planstate()
states:

    /*
     * We special-case Append and MergeAppend to pretend that the first child
     * plan is the OUTER referent; we have to interpret OUTER Vars in their
     * tlists according to one of the children, and the first one is the most
     * natural choice.

If I change set_append_references() to comment out the
set_dummy_tlist_references() call, I get this output:

explain (costs off, verbose) select * from parent where a > 1 order by 1;
              QUERY PLAN
───────────────────────────────────────
 Sort
   Output: a
   Sort Key: a
   ->  Append
         ->  Seq Scan on public.child1
               Output: child1.a
               Filter: (child1.a > 1)
         ->  Seq Scan on public.child2
               Output: child2.a
               Filter: (child2.a > 1)
(10 rows)

Not parent.a as I had expected.  That seems to be because parent's RTE
is considered unused in the plan.  One might say that the plan's
Append node belongs to that RTE, but then Append doesn't have any RT
index attached to it, so it escapes ExplainPreScanNode()'s walk of the
plan tree to collect the indexes of "used RTEs".  I changed
set_rtable_names() to get around that as follows:

@@ -3458,7 +3458,7 @@ set_rtable_names(deparse_namespace *dpns, List
*parent_namespaces,
         /* Just in case this takes an unreasonable amount of time ... */
         CHECK_FOR_INTERRUPTS();

-        if (rels_used && !bms_is_member(rtindex, rels_used))
+        if (rels_used && !bms_is_member(rtindex, rels_used) && !rte->inh)

and I get:

explain (costs off, verbose) select * from parent where a > 1 order by 1;
              QUERY PLAN
───────────────────────────────────────
 Sort
   Output: parent.a
   Sort Key: parent.a
   ->  Append
         ->  Seq Scan on public.child1
               Output: child1.a
               Filter: (child1.a > 1)
         ->  Seq Scan on public.child2
               Output: child2.a
               Filter: (child2.a > 1)
(10 rows)

> For setops we've traditionally used the left input as reference.
> Maybe we could do better, but I'm not very sure how, since SQL
> doesn't actually provide any explicit names for the setop result.
> Making up a name with no basis in the query probably isn't an
> improvement, or at least not enough of one to justify a change.

I too am not sure what we should about Appends of setops, but with the
above hacks, I get this:

explain (costs off, verbose) select * from child1 union all select *
from child2 order by 1;
              QUERY PLAN
───────────────────────────────────────
 Sort
   Output: "*SELECT* 1".a
   Sort Key: "*SELECT* 1".a
   ->  Append
         ->  Seq Scan on public.child1
               Output: child1.a
         ->  Seq Scan on public.child2
               Output: child2.a
(8 rows)

whereas currently it prints:

explain (costs off, verbose) select * from child1 union all select *
from child2 order by 1;
              QUERY PLAN
───────────────────────────────────────
 Sort
   Output: child1.a
   Sort Key: child1.a
   ->  Append
         ->  Seq Scan on public.child1
               Output: child1.a
         ->  Seq Scan on public.child2
               Output: child2.a
(8 rows)

Thanks,
Amit



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: tap tests driving the database via psql
Следующее
От: Melanie Plageman
Дата:
Сообщение: Re: Avoiding hash join batch explosions with extreme skew and weird stats