Re: Collect frequency statistics for arrays

From Alexander Korotkov
Subject Re: Collect frequency statistics for arrays
Date
Msg-id CAPpHfdvm1z0dQ-v0=_+QF_Ws8LXfE_75xQ-n4dzR6eyffh213Q@mail.gmail.com
In reply to Re: Collect frequency statistics for arrays  (Noah Misch <noah@leadboat.com>)
Responses Re: Collect frequency statistics for arrays  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Hi!

An updated patch is attached. I've updated the comment of mcelem_array_contained_selec with a more detailed description of the probability distribution assumption. Also, I found that the "rest" behavior is better described by a Poisson distribution; the relevant changes were made.

On Tue, Jan 17, 2012 at 2:33 PM, Noah Misch <noah@leadboat.com> wrote:
> By "summary frequency of elements", do you mean literally P_0 + P_1 ... + P_N?
> If so, I can follow the above argument for "column && const" and "column <@
> const", but not for "column @> const".  For "column @> const", selectivity
> cannot exceed the smallest frequency among const elements.  A number of
> high-frequency elements will drive up the sum of the frequencies without
> changing the true selectivity much at all.
Referring to the summary frequency is not really correct. It would be more correct to refer to the number of elements in "const". When there are many elements in "const", the selectivity of "column @> const" tends to be close to 0 and the selectivity of "column <@ const" tends to be close to 1. Of course, this holds when the elements have moderate frequencies (not very close to 0 and not very close to 1). I've replaced "summary frequency of elements" with "number of elements".
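
A minimal sketch of this tendency, not the code from the patch. It assumes that elements occur in an array independently of each other, that every element of a small universe has the same moderate frequency, and that "const" contains k of those elements. The function names and the constants UNIVERSE and FREQ are hypothetical and chosen only for illustration.

```python
from math import prod


def contains_selectivity(const_freqs):
    """Estimate P(column @> const): every element of const is present.

    Assumes independent occurrence, so the estimate is the product of the
    per-element frequencies. It can never exceed the smallest frequency.
    """
    return prod(const_freqs)


def contained_selectivity(other_freqs):
    """Estimate P(column <@ const): no element outside const is present.

    other_freqs holds the frequencies of the elements NOT in const;
    under independence each of them must be absent.
    """
    return prod(1.0 - f for f in other_freqs)


# Hypothetical toy model: 10 distinct elements, each present in 30% of arrays.
UNIVERSE = 10
FREQ = 0.3


def estimates(k):
    """Return both estimates when const holds k of the UNIVERSE elements."""
    const = [FREQ] * k
    others = [FREQ] * (UNIVERSE - k)
    return contains_selectivity(const), contained_selectivity(others)


for k in (1, 5, 10):
    contains, contained = estimates(k)
    print(f"k={k:2d}  @> estimate={contains:.6f}  <@ estimate={contained:.6f}")
```

In this toy model the "@>" estimate shrinks and the "<@" estimate grows as k increases, which is why the number of elements in "const" is the more useful reference. The product form also respects the bound that "column @> const" selectivity cannot exceed the smallest frequency among the const elements.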

------
With best regards,
Alexander Korotkov.
