Re: Why could GEQO produce plans with lower costs than the standard_join_search?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Why could GEQO produce plans with lower costs than the standard_join_search?
Дата
Msg-id 16892.1558633391@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Why could GEQO produce plans with lower costs than thestandard_join_search?  (Donald Dong <xdong@csumb.edu>)
Ответы Re: Why could GEQO produce plans with lower costs than thestandard_join_search?  (Donald Dong <xdong@csumb.edu>)
Список pgsql-hackers
Donald Dong <xdong@csumb.edu> writes:
> On May 23, 2019, at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (2) the paths you show do not correspond to the finally selected
>> plans --- they aren't even the same shape.  (The Gathers are in
>> different places, to start with.)  I'm not sure where you were
>> capturing the path data, but it looks like you missed top-level
>> parallel-aggregation planning, and that managed to find some
>> plans that were marginally cheaper than the ones you captured.
>> Keep in mind that GEQO only considers join planning, not
>> grouping/aggregation.

> By looking at the captured costs, I thought GEQO found a better join
> order than the standard_join_search. However, the final plan using
> the join order produced by GEQO turns out to be more expansive. Would
> that imply if GEQO sees a join order which is identical to the one
> produced by standard_join_search, it will discard it since the
> cheapest_total_path has a higher cost, though the final plan may be
> cheaper?

I suspect what's really going on is that you're looking at the wrong
paths.  The planner remembers more paths for each rel than just the
cheapest-total-cost one, the reason being that total cost is not the
only figure of merit.  The plan that is winning in the end, it looks
like, is parallelized aggregation on top of a non-parallel join plan,
but the cheapest_total_path uses up the opportunity for a Gather on
a parallelized scan/join.  If we were just doing a scan/join and
no aggregation, that path would have been the basis for the final
plan, but it's evidently not being chosen here; the planner is going
to some other scan/join path that is not parallelized.

I haven't looked closely at whether the parallel-query hacking has
paid any attention to GEQO.  It's entirely likely that GEQO is still
choosing its join order on the basis of cheapest-total scan/join cost
without regard to parallelizability, which would lead to an apparently
better cost for the cheapest_total_path even though the path that
will end up being used is some other one.

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Khandekar
Дата:
Сообщение: Re: Minimal logical decoding on standbys
Следующее
От: Mark Dilger
Дата:
Сообщение: fsync failure in durable_unlink ignored in xlog.c?