Re: Cross-column statistics revisited

From	Greg Stark
Subject	Re: Cross-column statistics revisited
Date	16 October 2008, 14:32:44
Msg-id	B71B9E9E-3F8D-48B2-9D99-A342AB043322@enterprisedb.com
In reply to	Re: Cross-column statistics revisited (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

[sorry for top posting - damn phone]

It's pretty straightforward to do a chi-squared test on all the pairs.
But that only tells you that the product of the individual selectivities
is more likely to be wrong. It doesn't tell you whether it's going to be
too high or too low...
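
A minimal Python sketch of that point, not anything from the planner
itself. It assumes two made-up 100-row tables and the predicates
a < 5 and b > 10 from the quoted example; the function names and the
data are hypothetical. It computes the Pearson chi-squared statistic on
the 2x2 table of predicate outcomes, and compares the product-of-
fractions estimate with the actual joint fraction.

```python
from itertools import product


def chi_squared_2x2(rows, pred_a, pred_b):
    """Pearson chi-squared statistic for the 2x2 table of predicate outcomes."""
    n = len(rows)
    # observed[i][j]: number of rows where pred_a is i and pred_b is j
    observed = [[0, 0], [0, 0]]
    for a, b in rows:
        observed[int(pred_a(a))][int(pred_b(b))] += 1
    row_tot = [sum(observed[i]) for i in range(2)]
    col_tot = [observed[0][j] + observed[1][j] for j in range(2)]
    stat = 0.0
    for i, j in product(range(2), repeat=2):
        # Count expected in this cell if the two predicates were independent
        expected = row_tot[i] * col_tot[j] / n
        stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def selectivities(rows, pred_a, pred_b):
    """Return (independence estimate, actual joint fraction)."""
    n = len(rows)
    sel_a = sum(pred_a(a) for a, _ in rows) / n
    sel_b = sum(pred_b(b) for _, b in rows) / n
    actual = sum(pred_a(a) and pred_b(b) for a, b in rows) / n
    return sel_a * sel_b, actual


def a_lt_5(a):
    return a < 5


def b_gt_10(b):
    return b > 10


# Hypothetical data: rows are (a, b) pairs.
# "positive": a < 5 and b > 10 tend to hold together.
positive = [(1, 20)] * 40 + [(1, 0)] * 10 + [(9, 20)] * 10 + [(9, 0)] * 40
# "negative": a < 5 and b > 10 tend to exclude each other.
negative = [(1, 20)] * 10 + [(1, 0)] * 40 + [(9, 20)] * 40 + [(9, 0)] * 10

for name, rows in (("positive", positive), ("negative", negative)):
    stat = chi_squared_2x2(rows, a_lt_5, b_gt_10)
    estimate, actual = selectivities(rows, a_lt_5, b_gt_10)
    print(name, "chi2:", stat, "estimate:", estimate, "actual:", actual)
```

Both tables give the same chi-squared statistic, so the test flags the
dependence equally in each. The product estimate is 0.25 in both cases,
which is too low for the first table (actual 0.40) and too high for the
second (actual 0.10).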

greg

On 16 Oct 2008, at 07:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Martijn van Oosterhout <kleptog@svana.org> writes:
>> I think you need to go a step back: how are you going to use this  
>> data?
>
> The fundamental issue as the planner sees it is not having to assume
> independence of WHERE clauses.  For instance, given
>
>    WHERE a < 5 AND b > 10
>
> our current approach is to estimate the fraction of rows with a < 5
> (using stats for a), likewise estimate the fraction with b > 10
> (using stats for b), and then multiply these fractions together.
> This is correct if a and b are independent, but can be very bad if
> they aren't.  So if we had joint statistics on a and b, we'd want to
> somehow match that up to clauses for a and b and properly derive
> the joint probability.
>
> (I'm not certain of how to do that efficiently, even if we had the
> right stats :-()
>
>            regards, tom lane
