Re: proposal : cross-column stats

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: proposal : cross-column stats
Дата
Msg-id AANLkTimWyk6qso=_BethgFSYN4FBZa-syAVjtUxjssuo@mail.gmail.com
обсуждение исходный текст
Ответ на Re: proposal : cross-column stats  (Tomas Vondra <tv@fuzzy.cz>)
Ответы Re: proposal : cross-column stats  (Tomas Vondra <tv@fuzzy.cz>)
Список pgsql-hackers
On Sun, Dec 12, 2010 at 8:46 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> Dne 13.12.2010 01:05, Robert Haas napsal(a):
>> This is a good idea, but I guess the question is what you do next.  If
>> you know that the "applicability" is 100%, you can disregard the
>> restriction clause on the implied column.  And if it has no
>> implicatory power, then you just do what we do now.  But what if it
>> has some intermediate degree of implicability?
>
> Well, I think you've missed the e-mail from Florian Pflug - he actually
> pointed out that the 'implicativeness' Heikki mentioned is called
> conditional probability. And conditional probability can be used to
> express the "AND" probability we are looking for (selectiveness).
>
> For two columns, this is actually pretty straighforward - as Florian
> wrote, the equation is
>
>   P(A and B) = P(A|B) * P(B) = P(B|A) * P(A)

Well, the question is what data you are actually storing.  It's
appealing to store a measure of the extent to which a constraint on
column X constrains column Y, because you'd only need to store
O(ncolumns^2) values, which would be reasonably compact and would
potentially handle the zip code problem - a classic "hard case" rather
neatly.  But that wouldn't be sufficient to use the above equation,
because there A and B need to be things like "column X has value x",
and it's not going to be practical to store a complete set of MCVs for
column X for each possible value that could appear in column Y.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: proposal : cross-column stats
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: proposal : cross-column stats