cross column correlation revisted

From: PostgreSQL - Hans-Jürgen Schönig
Subject: cross column correlation revisted
Date:
Msg-id D0F6E707-701C-40C4-9F4B-D7D282AA0187@cybertec.at
Responses: Re: cross column correlation revisted  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List: pgsql-hackers
hello everybody,

we are currently facing some serious issues caused by cross-column correlation.
consider: 10% of all people have breast cancer. we have 2 genders (50:50).
if i select all the men with breast cancer, i will get basically nobody - but the planner treats the two conditions as independent and will overestimate the output.
this is the commonly known problem ...
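
to make the arithmetic concrete, here is a minimal sketch of the independence assumption, under which the selectivities of AND-ed conditions are simply multiplied. the table size, the "true" joint selectivity and the function name independent_estimate are assumptions made up for illustration; only the 50:50 and 10% figures come from the example above.

```python
import math


def independent_estimate(total_rows, *selectivities):
    """Row estimate when every condition is assumed independent of the others."""
    return total_rows * math.prod(selectivities)


TOTAL_ROWS = 1_000_000      # assumed table size, not from the mail
SEL_MALE = 0.5              # 2 genders, 50:50
SEL_CANCER = 0.10           # 10% of all people

# assumed "real" fraction of rows that are male AND have the diagnosis
TRUE_JOINT_SEL = 0.0005

estimated = independent_estimate(TOTAL_ROWS, SEL_MALE, SEL_CANCER)
actual = TOTAL_ROWS * TRUE_JOINT_SEL

print(f"estimated rows: {estimated:.0f}")
print(f"actual rows:    {actual:.0f}")
print(f"overestimate:   {estimated / actual:.0f}x")
```

with these made-up numbers the estimate is off by two orders of magnitude, which is the kind of error that can push the planner towards a bad plan.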

this cross correlation problem can be quite nasty in many many cases.
a nested loop chosen because of an underestimated row count can turn a join into a never ending nightmare and so on and so on.

my idea is the following:
what if we allow users to specify cross-column combinations where we keep separate stats?
maybe somehow like this ...
ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)

or ...
ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2)

clearly we cannot store correlation for all combinations of all columns so we somehow have to limit it.
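
as a rough illustration of what "separate stats" for one user-chosen column combination could look like, the sketch below counts how often each value combination occurs in a sample and uses that frequency instead of multiplying per-column selectivities. this is a toy model, not how PostgreSQL stores statistics; the names build_joint_stats and joint_selectivity and the synthetic sample are hypothetical.

```python
from collections import Counter


def build_joint_stats(rows, columns):
    """Map each observed value combination of `columns` to its frequency."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    total = len(rows)
    return {combo: n / total for combo, n in counts.items()}


def joint_selectivity(stats, combo):
    # toy behaviour: a combination never seen in the sample gets 0.0
    return stats.get(combo, 0.0)


# synthetic sample: 50:50 genders, 10% diagnosed, all of them women
sample = (
    [{"gender": "f", "cancer": True}] * 100
    + [{"gender": "f", "cancer": False}] * 400
    + [{"gender": "m", "cancer": False}] * 500
)

# stats are kept only for the combination the user asked for
stats = build_joint_stats(sample, ("gender", "cancer"))

print("independence assumption:", 0.5 * 0.10)
print("joint stats:            ", joint_selectivity(stats, ("m", True)))
```

the number of stored entries grows with the number of distinct value combinations, which is one reason the set of tracked column combinations has to be limited.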

what is the general feeling about something like that?
many thanks,
    hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


