Обсуждение: Partial index where clause not filtering through

Поиск
Список
Период
Сортировка

Partial index where clause not filtering through

От
"Jim C. Nasby"
Дата:
I have the following index:
street_range__street_locality_high_low_v btree (street_name_id,
    locality_id, addr_high_v, addr_low_v) WHERE (addr_high_v IS NOT NULL)

The query has a where clause like this:
        FROM street_range s, input i
        WHERE 1=1
            AND i.address_v IS NOT NULL

            AND s.locality_id = i.locality_id
            AND s.street_name_id = i.street_name_id

            AND s.addr_low_v <= i.address_v
            AND s.addr_high_v >= i.address_v

As-is, it won't use the index. i.address_v IS NOT NULL AND s.addr_high_v
 >= i.address_v should mandate that s.addr_high_v must be not-null, if
I'm remembering how nulls work correctly. (Actually, having any kind of
comparison on s.addr_high_v should mandate NOT NULL since NULL != NULL,
right?) Therefore the optimizer should be able to deduce that it can use
the index.

Adding AND s.addr_high_v IS NOT NULL to the where clause makes
everything work fine, so there is a work-around. Just seems like a minor
item to add to the TODO.
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Partial index where clause not filtering through

От
Tom Lane
Дата:
"Jim C. Nasby" <jim@nasby.net> writes:
> As-is, it won't use the index. i.address_v IS NOT NULL AND s.addr_high_v
> >= i.address_v should mandate that s.addr_high_v must be not-null,

Actually, if the >= operator is strict then it implies both NOT NULL
conditions.  But I am not excited about putting some kind of theorem
prover into the partial-index logic.  That is a recipe for chewing up
huge numbers of cycles trying (and, likely, failing) to prove that
a partial index is safe to use with the current query.

Inference rules that are limited to strict operators and NOT NULL
clauses wouldn't cost as much as a general theorem prover, but they'd
not find useful improvements as often, either.  So the question is
still whether the game is worth the candle.  How often do you think
this would win, and is that worth the planner cycles expended on every
query to find out if it wins?

            regards, tom lane

Re: Partial index where clause not filtering through

От
"Jim C. Nasby"
Дата:
On Mon, Jun 16, 2003 at 01:43:34AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
> > As-is, it won't use the index. i.address_v IS NOT NULL AND s.addr_high_v
> > >= i.address_v should mandate that s.addr_high_v must be not-null,
>
> Actually, if the >= operator is strict then it implies both NOT NULL
> conditions.  But I am not excited about putting some kind of theorem
> prover into the partial-index logic.  That is a recipe for chewing up
> huge numbers of cycles trying (and, likely, failing) to prove that
> a partial index is safe to use with the current query.
>
> Inference rules that are limited to strict operators and NOT NULL
> clauses wouldn't cost as much as a general theorem prover, but they'd
> not find useful improvements as often, either.  So the question is
> still whether the game is worth the candle.  How often do you think
> this would win, and is that worth the planner cycles expended on every
> query to find out if it wins?

Well, it would only need to make the checks if the table had partial
indexes. Even then, it probably makes sense to only do the check if
other query planning steps decide it would be useful to use the partial
index. So that means that for a lot of general use cases, performance
won't be impacted.

When you get to the cases that would be impacted, the planner should
probably look for key clauses first; so if you were worried about
planning time, you would put an explicit clause in the query (I'm in the
habit of doing this for joins when joining three tables on the same
key... FROM a, b, c WHERE a.f1=b.f1 and b.f1=c.f1 and a.f1=c.f1. I would
hope the planner would figure out that a.f1 must = c.f1, but some
don't). In many cases, planning time isn't a big deal; either the query
is run often enough that it should stay in the plan cache (pgsql does
cache plans, right?), or it's run infrequently enough that it's not a
big deal.

Of course, this might extend well beyond just partial indexes, as my a,
b, c example shows.
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Partial index where clause not filtering through

От
Tom Lane
Дата:
"Jim C. Nasby" <jim@nasby.net> writes:
> Well, it would only need to make the checks if the table had partial
> indexes. Even then, it probably makes sense to only do the check if
> other query planning steps decide it would be useful to use the partial
> index.

You have that backwards.  Planning is bottom-up, so we have to determine
the relevant indexes *first*.  Accordingly, a partial index is a
performance drag on every query that uses its table, as we check to
see if the partial index qual is satisfied by the query's WHERE clause.
That's why I don't want it to be any slower than it is ...

            regards, tom lane

Re: Partial index where clause not filtering through

От
"Jim C. Nasby"
Дата:
On Mon, Jun 16, 2003 at 10:11:00AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jim@nasby.net> writes:
> > Well, it would only need to make the checks if the table had partial
> > indexes. Even then, it probably makes sense to only do the check if
> > other query planning steps decide it would be useful to use the partial
> > index.
>
> You have that backwards.  Planning is bottom-up, so we have to determine
> the relevant indexes *first*.  Accordingly, a partial index is a
> performance drag on every query that uses its table, as we check to
> see if the partial index qual is satisfied by the query's WHERE clause.
> That's why I don't want it to be any slower than it is ...

Well, could it assume the index was valid until we got to the point
where we had to decide what index to use? In other words, don't do the
test unless the index appears to be the most attractive one. Also, as I
mentioned, if query parsing performance is that important, you can
explicitly add whatever clause will show the planner that the index is
valid.

Also, I just read that there's no statement plan caching, which makes me
a bit confused by this todo:

Flush cached query plans when their underlying catalog data changes

Does that only apply to pl/pgsql? Are there plans to add a statement
cache?
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Partial index where clause not filtering through

От
Tom Lane
Дата:
"Jim C. Nasby" <jim@nasby.net> writes:
> Also, I just read that there's no statement plan caching, which makes me
> a bit confused by this todo:
> Flush cached query plans when their underlying catalog data changes
> Does that only apply to pl/pgsql?

That and PREPARE.

            regards, tom lane