Re: BETWEEN optimizer problems with single-value

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BETWEEN optimizer problems with single-value
Дата
Msg-id 11428.1142474724@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BETWEEN optimizer problems with single-value  (Simon Riggs <simon@2ndquadrant.com>)
Ответы Re: BETWEEN optimizer problems with single-value  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-performance
Simon Riggs <simon@2ndquadrant.com> writes:
>> ISTM that when the BETWEEN constants match we end up in this part of
>> clauselist_selectivity()...

Yeah, I think you are right.

> so that the planner underestimates the cost of using "Cal_CalDate" so
> that it ends up the same as "Cal_CtofcNo", and then we pick
> "Cal_CalDate" because it was created first.

No, it doesn't end up the same --- but the difference is small enough to
be in the roundoff-error regime.  The real issue here is that we're
effectively assuming that one row will be fetched from the index in both
cases, and this is clearly not the case for the Cal_CalDate index.  So
we need a more accurate estimate for the boundary case.

> Using 1.0e-10 isn't very useful... the selectivity for a range should
> never be less than the selectivity for an equality, so we should simply
> put in a test against one of the pseudo constants and use that as the
> minimal value.

That's easier said than done, because you'd first have to find the
appropriate equality operator to use (ie, one having semantics that
agree with the inequality operators).  Another point is that the above
statement is simply wrong, consider
    calDate BETWEEN '2006-03-15' AND '2006-03-14'
for which an estimate of zero really is correct.

Possibly we could drop this code's reliance on seeing
SCALARLTSEL/SCALARGTSEL as the estimators, and instead try to locate a
common btree opclass for the operators --- which would then let us
identify the right equality operator to use, and also let us distinguish
> from >= etc.  If we're trying to get the boundary cases right I
suspect we have to account for that.  I could see such an approach being
tremendously slow though :-(, because we'd go looking for btree
opclasses even for operators that have nothing to do with < or >.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jan de Visser
Дата:
Сообщение: Re: Slow SELECTS after large update cycle
Следующее
От: "Guillaume Smet"
Дата:
Сообщение: PostgreSQL and Xeon MP