Re: Why could GEQO produce plans with lower costs than thestandard_join_search?

Поиск
Список
Период
Сортировка
От Donald Dong
Тема Re: Why could GEQO produce plans with lower costs than thestandard_join_search?
Дата
Msg-id 5DB4E603-09E2-4EEE-A769-411367C8E32D@csumb.edu
обсуждение исходный текст
Ответ на Re: Why could GEQO produce plans with lower costs than the standard_join_search?  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Why could GEQO produce plans with lower costs than the standard_join_search?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On May 23, 2019, at 10:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Donald Dong <xdong@csumb.edu> writes:
>> On May 23, 2019, at 9:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> (2) the paths you show do not correspond to the finally selected
>>> plans --- they aren't even the same shape.  (The Gathers are in
>>> different places, to start with.)  I'm not sure where you were
>>> capturing the path data, but it looks like you missed top-level
>>> parallel-aggregation planning, and that managed to find some
>>> plans that were marginally cheaper than the ones you captured.
>>> Keep in mind that GEQO only considers join planning, not
>>> grouping/aggregation.
> 
>> By looking at the captured costs, I thought GEQO found a better join
>> order than the standard_join_search. However, the final plan using
>> the join order produced by GEQO turns out to be more expansive. Would
>> that imply if GEQO sees a join order which is identical to the one
>> produced by standard_join_search, it will discard it since the
>> cheapest_total_path has a higher cost, though the final plan may be
>> cheaper?
> 
> I suspect what's really going on is that you're looking at the wrong
> paths.  The planner remembers more paths for each rel than just the
> cheapest-total-cost one, the reason being that total cost is not the
> only figure of merit.  The plan that is winning in the end, it looks
> like, is parallelized aggregation on top of a non-parallel join plan,
> but the cheapest_total_path uses up the opportunity for a Gather on
> a parallelized scan/join.  If we were just doing a scan/join and
> no aggregation, that path would have been the basis for the final
> plan, but it's evidently not being chosen here; the planner is going
> to some other scan/join path that is not parallelized.

Seems the paths in the final rel (path list, cheapest parameterized
paths, cheapest startup path, and cheapest total path)  are the same
identical path for this particular query (JOB/1a.sql). Am I missing
anything?

Since the total cost of the cheapest-total-path is what GEQO is
currently using to evaluate the fitness (minimizing), I'm expecting
the cheapest-total-cost to measure how good is a join order. So a
join order from standard_join_search, with higher
cheapest-total-cost, ends up to be better is pretty surprising to me.

Perhaps the cheapest-total-cost should not be the best/only choice
for fitness?

Regards,
Donald Dong



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Top-N sorts in EXPLAIN, row count estimates, and parallelism
Следующее
От: Tom Lane
Дата:
Сообщение: Re: SQL-spec incompatibilities in similar_escape() and related stuff