Re: [HACKERS] PATCH: multivariate histograms and MCV lists

Поиск
Список
Период
Сортировка
От Dean Rasheed
Тема Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Дата
Msg-id CAEZATCWORz=bXEPVJNxK49Ws9zeBHqsEdd3JsrrZWMvQ8oXuSA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On 16 July 2018 at 13:23, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>> The top-level clauses allow us to make such deductions, with deeper
>>> clauses it's much more difficult (perhaps impossible). Because for
>>> example with (a=1 AND b=1) there can be just a single match, so if we
>>> find it in MCV we're done. With clauses like ((a=1 OR a=2) AND (b=1 OR
>>> b=2)) it's not that simple, because there may be multiple combinations
>>> and so a match in MCV does not guarantee anything.
>>
>> Actually, it guarantees a lower bound on the overall selectivity, and
>> maybe that's the best that we can do in the absence of any other
>> stats.
>>
> Hmmm, is that actually true? Let's consider a simple example, with two
> columns, each with just 2 values, and a "perfect" MCV list:
>
>     a | b | frequency
>    -------------------
>     1 | 1 | 0.5
>     2 | 2 | 0.5
>
> And let's estimate sel(a=1 & b=2).

OK.In this case, there are no MCV matches, so there is no lower bound (it's 0).

What we could do though is also impose an upper bound, based on the
sum of non-matching MCV frequencies. In this case, the upper bound is
also 0, so we could actually say the resulting selectivity is 0.

Regards,
Dean


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Finding database for pg_upgrade missing library
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: cursors with prepared statements