Re: WIP: multivariate statistics / proof of concept

From Tomas Vondra
Subject Re: WIP: multivariate statistics / proof of concept
Msg-id 8106e11197849725375a933e1cc1409f.squirrel@2.emaily.eu
In response to Re: WIP: multivariate statistics / proof of concept  (Katharina Büchse <katharina.buechse@uni-jena.de>)
Responses Re: WIP: multivariate statistics / proof of concept  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
On 13 November 2014, 16:51, Katharina Büchse wrote:
> On 13.11.2014 14:11, Tomas Vondra wrote:
>
>> The only place where I think this might work are the associative rules.
>> It's simple to specify rules like ("ZIP code" implies "city") and we could
>> even do some simple check against the data to see if it actually makes
>> sense (and 'disable' the rule if not).
>
> and even this simple example has its limits: at least in Germany, ZIP
> codes are not unique in rural areas, where several villages share the
> same ZIP code.
>
> I guess there are just a few examples where columns are completely
> functionally dependent without any exceptions.
> But of course, if the user gives this information just for optimizing
> the statistics, some exceptions don't matter.
> If this information should be used for creating different execution
> plans (e.g. there is an index on column A and column B is functionally
> dependent on A, so one could think about using the index on A and the
> dependency instead of scanning the whole table to find all tuples
> that match the query on column B), exceptions are a very important issue.

Yes, exactly. The aim of this patch is "only" to improve estimates, not to
remove conditions from the plan (e.g. checking only the ZIP code and not
the city name). That certainly can't be done based solely on approximate
statistics, and as you point out, most real-world data either contain bugs
or are inherently imperfect (we have the same kind of ZIP/city
inconsistencies in the Czech Republic). That's not a big issue for
estimates, though, assuming only a small fraction of rows violate the rule.
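
To make the "small fraction of rows" idea concrete, here is a minimal sketch (plain Python, not code from the patch) of how one might measure how often a rule like "ZIP code implies city" is violated in a sample. The function names, the sample rows and the 0.2 threshold are all made up for illustration; the patch may check this quite differently.

```python
from collections import Counter, defaultdict


def fd_violation_fraction(rows):
    """Fraction of (a, b) rows that contradict the rule "a implies b".

    For each value of a, the most frequent b is treated as the implied
    value; every other row with that a counts as a violation.
    """
    by_a = defaultdict(Counter)
    for a, b in rows:
        by_a[a][b] += 1
    total = sum(sum(counts.values()) for counts in by_a.values())
    if total == 0:
        return 0.0
    consistent = sum(max(counts.values()) for counts in by_a.values())
    return (total - consistent) / total


def rule_usable_for_estimates(rows, max_violations=0.2):
    """Keep the rule only if few rows violate it.

    max_violations is an arbitrary, hypothetical threshold.
    """
    return fd_violation_fraction(rows) <= max_violations


# Made-up sample: the ZIP code "99999" is shared by two villages.
rows = [
    ("07743", "Jena"),
    ("07743", "Jena"),
    ("07743", "Jena"),
    ("99999", "Village A"),
    ("99999", "Village A"),
    ("99999", "Village A"),
    ("99999", "Village B"),
    ("10115", "Berlin"),
]

fraction = fd_violation_fraction(rows)
print(fraction)                         # 1 of 8 rows violates the rule: 0.125
print(rule_usable_for_estimates(rows))  # True with the assumed threshold
```

A check like this only says whether the rule is good enough for estimation. It says nothing about whether a condition could safely be dropped from a plan, which would require the rule to hold without exceptions.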

Tomas



