Re: PATCH: add support for IN and @> in functional-dependencystatistics use

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: PATCH: add support for IN and @> in functional-dependencystatistics use
Дата
Msg-id 20200318002946.6dvblukm3cfmgir2@development
обсуждение исходный текст
Ответ на Re: PATCH: add support for IN and @> in functional-dependencystatistics use  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: PATCH: add support for IN and @> in functional-dependencystatistics use
Список pgsql-hackers
On Tue, Mar 17, 2020 at 06:05:17PM +0100, Tomas Vondra wrote:
>On Tue, Mar 17, 2020 at 04:14:26PM +0000, Dean Rasheed wrote:
>>On Tue, 17 Mar 2020 at 15:37, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>>On Tue, Mar 17, 2020 at 12:42:52PM +0000, Dean Rasheed wrote:
>>>
>>>>The other thing that I'm still concerned about is the possibility of
>>>>returning estimates with P(a,b) > P(a) or P(b). I think that such a
>>>>thing becomes much more likely with the new types of clause supported
>>>>here, because they now allow multiple values from each column, where
>>>>before we only allowed one. I took another look at the patch I posted
>>>>on the other thread, and I've convinced myself that it's correct.
>>>>Attached is an updated version, with some cosmetic tidying up and now
>>>>with some additional regression tests.
>>>
>>>Yeah, I agree that's something we need to fix. Do you plan to push the
>>>fix, or do you want me to do it?
>>>
>>
>>I can push it. Have you had a chance to review it?
>>
>
>Not yet, I'll take a look today.
>

OK, I took a look. I think from the correctness POV the patch is OK, but
I think the dependencies_clauselist_selectivity() function now got a bit
too complex. I've been able to parse it now, but I'm sure I'll have
trouble in the future :-(

Can we refactor / split it somehow and move bits of the logic to smaller
functions, or something like that?

Another thing I'd like to suggest is keeping the "old" formula, and
instead of just replacing it with

    P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)

but explaining how the old formula may produce nonsensical selectivity,
and how the new formula addresses that issue.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nikita Glukhov
Дата:
Сообщение: Re: [PATCH] Opclass parameters
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Small docs bugfix: make it clear what can be used in UPDATE FROMand DELETE USING