Re: Hypothetical suggestions for planner, indexing improvement

Поиск
Список
Период
Сортировка
От Zeugswetter Andreas SB SD
Тема Re: Hypothetical suggestions for planner, indexing improvement
Дата
Msg-id 46C15C39FEB2C44BA555E356FBCD6FA4961F99@m0114.s-mxs.net
обсуждение исходный текст
Ответ на Hypothetical suggestions for planner, indexing improvement  (Josh Berkus <josh@agliodbs.com>)
Ответы Re: Hypothetical suggestions for planner, indexing improvement  (Josh Berkus <josh@agliodbs.com>)
Список pgsql-hackers
> > The reason that I mention EXISTS is because that's where the lack of
> > cross-column corellation is most dramatic; the planner seems to estimate a
> > flat 50% for EXISTS clauses regardless of the content.
>
> No "seems to" about that one: see
> src/backend/optimizer/path/clausesel.c
>
>     else if (is_subplan(clause))
>     {
>         /*
>          * Just for the moment! FIX ME! - vadim 02/04/98
>          */
>         s1 = (Selectivity) 0.5;
>     }

I think the main issue cannot be the correlation in this case.
In this FK PK case an events row exists for each row in the subselect,
so the correlation must be above 1, how much above is irrelevant for the
selectivity estimate (it is only relevant in the below 1 cases).
The selectivity in this case can be estimated by estimating the number of rows
returned from the subselect where the (in the example missing) PK-FK join condition
is removed (here: select distinct event_id from event_day where event_day BETWEEN
'2003-04-08' AND '2003-05-18'). (selectivity = min (1, e. rows of subselect / e. rows of
main select))

The "event_day BETWEEN '2003-04-08' AND '2003-05-18'" is what really reduces the
result set here, and that is not used.

Andreas

PS: in the example the subselect join clause event_id=events.event_id is missing



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Dave Page"
Дата:
Сообщение: Re: 7.4 features list
Следующее
От: Andreas Pflug
Дата:
Сообщение: contribute pg_get_viewdef2 et al