Re: Improve planner cost estimations for alternative subplans

From: Tomas Vondra
Subject: Re: Improve planner cost estimations for alternative subplans
Msg-id: 20200620233030.jcxdjd6njwaajzrr@development
In response to: Re: Improve planner cost estimations for alternative subplans (Melanie Plageman <melanieplageman@gmail.com>)
Responses: Re: Improve planner cost estimations for alternative subplans (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Wed, Jun 17, 2020 at 06:21:58PM -0700, Melanie Plageman wrote:
>On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:
>
>>
>> In [1] we found a situation where it leads to a suboptimal plan,
>> as it bloats the overall cost into large figures,
>> a decision related to an outer part of the plan look negligible to the
>> planner,
>> and as a result it doesn't elaborate on choosing the optimal one.
>>
>>
>Did this geometric average method result in choosing the desired plan for
>this case?
>
>
>> The patch is to fix it. Our linear model for costs cannot quite accommodate
>> the piecewise linear matter of alternative subplans,
>> so it is based on ugly heuristics and still cannot be very precise,
>> but I think it's better than the current one.
>>
>> Thoughts?
>>
>>
>Is there another place in planner where two alternatives are averaged
>together and that cost is used?
>
>To me, it feels a little bit weird that we are averaging together the
>startup cost of a plan which will always have a 0 startup cost and a
>plan that will always have a non-zero startup cost and the per tuple
>cost of a plan that will always have a negligible per tuple cost and one
>that might have a very large per tuple cost.
>
>I guess it feels different because instead of comparing alternatives you
>are blending them.
>
>I don't have any academic basis for saying that the alternatives costs
>shouldn't be averaged together for use in the rest of the plan, so I
>could definitely be wrong.
>

I agree it feels weird. Even if it actually improved the problematic
case, I think it'll be quite hard to convince ourselves this helps in
general. For example, for cases that actually end up using the first
plan, this is bound to make the estimates worse. I find it hard to
believe it won't cause regressions in at least some cases.

Maybe this heuristic really is better than the old one, but I think we
need to understand why - a single query probably is not enough.

I think the crucial limitation here is that we don't know which of the
alternative plans will be used. Is there a chance to improve this,
perhaps by making some sort of guess?

I'm not particularly familiar with AlternativeSubPlans, but I see we're
picking the one in nodeSubplan.c based on plan_rows. Can't we do the
same thing in cost_qual_eval_walker?
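
To make the idea concrete, here is a rough standalone sketch of that
kind of choice: estimate the number of calls, cost each alternative as
startup cost plus calls times per-call cost, and keep the cheaper one.
The SubPlanCost struct and both function names are made up for
illustration; this is not the actual nodeSubplan.c or costsize.c code,
and where num_calls would come from at planning time is left open.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the per-subplan cost fields; not the real
 * SubPlan node definition.
 */
typedef struct SubPlanCost
{
    double      startup_cost;   /* one-time cost, e.g. building a hash table */
    double      per_call_cost;  /* cost of each evaluation */
} SubPlanCost;

/* Total cost of evaluating one alternative num_calls times (linear model). */
static double
subplan_total_cost(const SubPlanCost *sp, double num_calls)
{
    assert(num_calls >= 0.0);
    return sp->startup_cost + num_calls * sp->per_call_cost;
}

/*
 * Return 0 if the first alternative is strictly cheaper for the estimated
 * number of calls, otherwise 1.  The costs are compared, not blended.
 */
static int
choose_alternative(const SubPlanCost *first, const SubPlanCost *second,
                   double num_calls)
{
    double      cost1 = subplan_total_cost(first, num_calls);
    double      cost2 = subplan_total_cost(second, num_calls);

    return (cost1 < cost2) ? 0 : 1;
}
```

With a first alternative of {0.0, 10.0} and a second of {100.0, 0.5},
an estimate of 5 calls selects the first and an estimate of 1000 calls
selects the second, so the outcome depends entirely on how good the
call-count guess is.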


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


