Re: [HACKERS] BETWEEN optimizer problems with single-value

Поиск
Список
Период
Сортировка
От Kevin Grittner
Тема Re: [HACKERS] BETWEEN optimizer problems with single-value
Дата
Msg-id 44185029.EE98.0025.0@wicourts.gov
обсуждение исходный текст
Ответ на Re: [HACKERS] BETWEEN optimizer problems with single-value range  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-performance
>>> On Wed, Mar 15, 2006 at  5:05 pm, in message
<1142463908.3859.188.camel@localhost.localdomain>, Simon Riggs
<simon@2ndquadrant.com> wrote:
> On Wed, 2006- 03- 15 at 11:56 - 0600, Kevin Grittner wrote:
>
>> (One obvious way to fix it would be to
>> rewrite "BETWEEN a AND b" as "= a" when a is equal to b, but it
seems
>> like there is some underlying problem which should be fixed instead
(or
>> in addition to) this.
>
> That might work, but I'm not sure if that is in itself the problem
and
> it would be mostly wasted overhead in 99% of cases.

It sounds like we agree.

> The main issue appears to be that the planner chooses "Cal_CalDate"
> index rather than "Cal_CtofcNo" index when the BETWEEN values match.


Agreed.

> It seems that the cost of the first and third EXPLAINs is equal, yet
for
> some reason it chooses different indexes in each case. My
understanding
> was that it would pick the first index created if plan costs were
equal.
> Is that behaviour repeatable with each query?

It seems to be a consistent pattern, although strictly speaking our
evidence is anecdotal.  We've got hundreds of known failures with the
BETWEEN variant on equal dates and no known successes.  We have a few
dozen tests of the equality variant with 100% success in those tests.

> ISTM that if we have equal plan costs then we should be choosing the
> index for which we have more leading columns, since that is more
likely
> to lead to a more selective answer. But the plan selection is a
simple
> "pick the best, or if they're equal pick the best sort order".

> The selectivity seems the same in both -  clamped to a minimum of 1
row,
> so changing that doesn't look like it would help.

The fact that it costs these as equivalent is surprising in itself, and
might be worth examining.  This might be an example of something I
suggested a while ago -- that the rounding a row estimate to an integer
on the basis that "you can't read half a row" is not necessarily wise,
because you can have a 50% chance of reading a row versus a higher or
lower percentage.

-Kevin



В списке pgsql-performance по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Slow SELECTS after large update cycle
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: BETWEEN optimizer problems with single-value