Re: Using multiple extended statistics for estimates

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Using multiple extended statistics for estimates
Дата
Msg-id 20191114215125.awaa77vsihszguqk@development
обсуждение исходный текст
Ответ на Re: Using multiple extended statistics for estimates  (Mark Dilger <hornschnorter@gmail.com>)
Список pgsql-hackers
On Thu, Nov 14, 2019 at 01:17:02PM -0800, Mark Dilger wrote:
>
>
>On 11/14/19 12:04 PM, Tomas Vondra wrote:
>>Aha, I think I understand now - thanks for the explanation. You're right
>>the comment is trying to explain why just taking the last clause for a
>>given attnum is fine. I'll try to make the comment clearer.
>>
>>For the case with equal Const values that should be mostly obvious, i.e.
>>"a=1 AND a=1 AND a=1" has the same selectivity as "a=1".
>>
>>The case with different Const values is harder, unfortunately. It might
>>seem obvious that "a=1 AND a=2" means there are no matching rows, but
>>that heavily relies on the semantics of the equality operator. And we
>>can't simply compare the Const values either, I'm afraid, because there
>>are cases with cross-type operators like
>>
>>  a = 1::int AND a = 1.0::numeric
>>
>>where the Consts are of different type, yet both conditions can be true.
>>
>>So it would be pretty tricky to do this, and the current code does not
>>even try to do that.
>>
>>Instead, it just assumes that it's mostly fine to overestimate, because
>>then at runtime we'll simply end up with 0 rows here.
>
>I'm unsure whether that could be a performance problem at runtime.
>
>I could imagine the planner short-circuiting additional planning when
>it finds a plan with zero rows, and so we'd save planner time if we
>avoid overestimating.  I don't recall if the planner does anything like
>that, or if there are plans to implement such logic, but it might be
>good not to rule it out.  Tom's suggestion elsewhere in this thread to
>use code in predtest.c sounds good to me.
>

No, AFAIK the planner does not do anything like that - it might probaly
do that if it could prove there are no such rows, but that's hardly the
case for estimates based on approximate information (i.e. statistics).

If could do that based on the predicate analysis in predtest.c mentioned
by Tom, although I don't think it does anything beyond tweaking the row
estimate to ~1 row.

>I don't know if you want to expand the scope of this particular patch to
>include that, though.
>

Certainly not. It's an interesting but surprisingly complicated problem,
and this patch simply aims to add different improvement.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Using multiple extended statistics for estimates
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: format of pg_upgrade loadable_libraries warning