Re: feature request: explain "with details" option

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: feature request: explain "with details" option
Дата
Msg-id CA+TgmoYhG2ULEv6D2gAZwPuEnCCw-xDTScqhD1hrR-beK1wwDg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: feature request: explain "with details" option  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: feature request: explain "with details" option  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Fri, Sep 9, 2016 at 12:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> On 9 September 2016 at 01:40, Roger Pack <rogerdpack2@gmail.com> wrote:
>>> Today's explain tells us what loops and scans were used, and relative
>>> costs, etc.  It doesn't seem to tell *why* the planner elected to use
>>> what it did.
>
>> One thing that's been discussed here is to have a way to see which
>> potential plans are rejected and compare their costs.
>
>> This isn't simple because there are often *lots* of variants. You
>> don't just want to see the "top 10" candidate plans, because they're
>> probably a bunch of small variants on the same plan; the ones you'll
>> be interested in will probably be very different plans with very bad
>> relative estimates.
>
> The other big problem here is that the planner tries *very* hard to reject
> losing paths early; so it does not even form an explainable plan for a
> large fraction of the search space.  (And if it did, you'd be dead before
> you got your EXPLAIN result back.)
>
> People have experimented with making the planner log every candidate path
> before the path enters the comparison tournament (and, typically, doesn't
> survive the first round).  But I've never seen any version of that that
> I thought would be intelligible to non-experts.  It's exceedingly verbose
> and it certainly doesn't look anything like what we know as EXPLAIN output.

What I've observed when troubleshooting plan selection is that you
often want to change the planner's choice in one particular part of
the plan and see what happens - e.g. force a {sequential scan, index
scan using index X, bitmap index scan using index X, index-only scan
using index X} on a particular base relation, or force a {hash, merge,
nested loop} join between X and Y, possibly constraining which is the
inner side and which is the outer side.  Or, also fairly commonly, I
just want to know what other paths were generated at a given level of
the plan tree and what their estimate costs were.

Of course, at the risk of starting a flame war, query hints would be
rather useful here.  You'd be able to force the plan you want not
because forcing a plan choice is a great way to run a production
system (though somebody might want to do that, of course) but to debug
why you're not getting that plan.  Right now, the only tools we have
for this sort of thing are the enable_* GUCs and twiddling the cost
values, and that's OK for simple plans but for complex plans involving
multiple tables it's a pretty blunt instrument and it's often tedious
to understand exactly what made the planner do what it did.  I for one
would welcome a better way to elicit EXPLAIN (THE_PLAN_I_WANTED).

Another thing that would be useful is, for each base relation, save
all of the access paths and the costs of each; and for each join
relation, save the cheapest cost for each join method.  So if you have
a join between A, B, and C, you can see all of the possible access
methods and their costs for A, B, and C; plus, for the join order
actually chosen (but not any alternative join order), you can see
whether other join methods were judged feasible and if so what their
cost would have been given the actually-surviving paths for the
underlying relations.  So for a two-way join you might get something
like:

Hash Join Considered:   Merge Join (cost=...)   Nested Loop (cost=...) -> Seq Scan     Considered:         Index Scan
on... (cost=...) -> Hash   -> Seq Scan      Considered:         Index Scan on .... (cost=...)
 

Of course, this wouldn't tell you everything you could possibly want
to know, but it would let you answer a lot of common questions like
"how much slower would it have been to use the index" (or "the other
index") and "were other join methods considered too expensive or were
they not even considered because the planner thinks they're not usable
here?" and "how much more expensive does the planner think that a hash
join would have been than the nested loop actually chosen?".

I think the only part of this that would be really brutal to try to
represent is alternative join orders.  I see no reasonable way for
EXPLAIN to output useful information about what other join orders were
considered and why they were not chosen; the only thing that seems
like it would help in that case would be an easy way to force the
exact join order you want and then see what the plan looks like.  Even
that's not totally perfect because sometimes there are a bunch of join
orders that are essentially interchangeable and what you really want
to know is whether the planner considered a join order that's
materially different, but the planner makes no such distinction
internally.  At any rate, I don't think the fact that it's difficult
or impossible to provide information about join orders should deter us
from having a way to display the stuff we can get our hands around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: Logical Replication WIP
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Logical Replication WIP