Обсуждение: Query planner isn't using my indices

Поиск
Список
Период
Сортировка

Query planner isn't using my indices

От
"Alaric B. Snell"
Дата:
To cut a long story short, my largish development database was running the
query I was tinkering with very slowly.

Looking a little deeper, I found that it was always doing a full table
scan.

Which is odd, seeing as we're selecting on a uniquely indexed field...

frontwire=# \d stakeholder_pk
Index "stakeholder_pk"
 Attribute |  Type
-----------+--------
 id        | bigint
unique btree

frontwire=# explain select * from stakeholder where id = 1;
NOTICE:  QUERY PLAN:

Seq Scan on stakeholder  (cost=0.00..602.81 rows=1 width=336)

EXPLAIN
frontwire=# select count(*) from stakeholder;
 count
-------
  9170
(1 row)

...why is this happening? It... shouldn't!

ABS

--
Alaric B. Snell, Developer
abs@frontwire.com


Re: Query planner isn't using my indices

От
Tom Lane
Дата:
"Alaric B. Snell" <abs@frontwire.com> writes:
> frontwire=# \d stakeholder_pk
> Index "stakeholder_pk"
>  Attribute |  Type
> -----------+--------
>  id        | bigint
> unique btree

> frontwire=# explain select * from stakeholder where id = 1;
> NOTICE:  QUERY PLAN:

> Seq Scan on stakeholder  (cost=0.00..602.81 rows=1 width=336)

Try "where id = 1::bigint" or "where id = '1'".

Unadorned 1 is an int4, not an int8 ... see archives for previous
discussions.

            regards, tom lane

Re: Query planner isn't using my indices

От
Stephan Szabo
Дата:
On Tue, 8 Jan 2002, Alaric B. Snell wrote:

> frontwire=# \d stakeholder_pk
> Index "stakeholder_pk"
>  Attribute |  Type
> -----------+--------
>  id        | bigint
> unique btree
>
> frontwire=# explain select * from stakeholder where id = 1;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on stakeholder  (cost=0.00..602.81 rows=1 width=336)
>

See past discussions, you'll need to either explicitly cast the
constant to bigint or quote it, because otherwise it prematurely
casts the constant into an int4.


Re: Query planner isn't using my indices

От
Doug McNaught
Дата:
"Alaric B. Snell" <abs@frontwire.com> writes:

> To cut a long story short, my largish development database was running the
> query I was tinkering with very slowly.
>
> Looking a little deeper, I found that it was always doing a full table
> scan.
>
> Which is odd, seeing as we're selecting on a uniquely indexed field...
>
> frontwire=# \d stakeholder_pk
> Index "stakeholder_pk"
>  Attribute |  Type
> -----------+--------
>  id        | bigint
> unique btree
>
> frontwire=# explain select * from stakeholder where id = 1;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on stakeholder  (cost=0.00..602.81 rows=1 width=336)

The query planner isn't *quite* smart enough currently to realize that
it can use the index in this case, because an unadorned "1" is an int
(not bigint) constant.  Try:

EXPLAIN SELECT * FROM stakeholder WHERE id = 1::bigint;

or

EXPLAIN SELECT * FROM stakeholder WHERE id = '1';

[I think either will work]

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863

Re: Query planner isn't using my indices

От
Jason Earl
Дата:
If you cast the 1 to type bigint then PostgreSQL will use the index,
otherwise it won't.  You can do one of four things:

1) An explicit cast (btw what's the SQL92 way of doing this):

        SELECT * FROM stakeholder WHERE id = 1::bigint;

2) Put the constant in "'" and let PostgreSQL work out what to do:

        SELECT * FROM stakeholder WHERE id = '1';

3) Turn off sequential scans (see the manual).

4) Live with sequential scans :).

Jason


"Alaric B. Snell" <abs@frontwire.com> writes:

> To cut a long story short, my largish development database was running the
> query I was tinkering with very slowly.
>
> Looking a little deeper, I found that it was always doing a full table
> scan.
>
> Which is odd, seeing as we're selecting on a uniquely indexed field...
>
> frontwire=# \d stakeholder_pk
> Index "stakeholder_pk"
>  Attribute |  Type
> -----------+--------
>  id        | bigint
> unique btree
>
> frontwire=# explain select * from stakeholder where id = 1;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on stakeholder  (cost=0.00..602.81 rows=1 width=336)
>
> EXPLAIN
> frontwire=# select count(*) from stakeholder;
>  count
> -------
>   9170
> (1 row)
>
> ...why is this happening? It... shouldn't!
>
> ABS
>
> --
> Alaric B. Snell, Developer
> abs@frontwire.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

Re: Query planner isn't using my indices

От
"Alaric B. Snell"
Дата:
On 8 Jan 2002, Jason Earl wrote:

> If you cast the 1 to type bigint then PostgreSQL will use the index,
> otherwise it won't.  You can do one of four things:

Ahhh, suddenly it all makes sense!

Thanks, people.

Now I need to figure out how to make JBoss write its queries differently
:-)

ABS

--
Alaric B. Snell, Developer
abs@frontwire.com


Re: Query planner isn't using my indices

От
Shaun Thomas
Дата:
On Tue, 8 Jan 2002, Stephan Szabo wrote:

> See past discussions, you'll need to either explicitly cast the
> constant to bigint or quote it, because otherwise it prematurely
> casts the constant into an int4.

But isn't that the point?  The fact that it casts it to int4 isn't the
problem.  The fact that int4 and int8 *should* be compatible fields *is*.
You'd think the query optimizer would do this:

"Hmmm, I have a field composed of numbers.  Are there any indexes
corresponding to this field?  Are the types compatible?  They're pretty
close, let me just recast this... there."

Oracle does this, Mysql does this, Sybase does this... why must Postgres's
query optimizer assume all types are spot-on identical?  Similarly, why
do I have to cast my char column to text when doing uncasted quoted-string
concatenation?  Quoted strings are character data, and char columns are
character data, right?  Wouldn't it logically follow that the
two types are compatible?  Unless for some reason the database engine is
mapping the database column types directly to C types, there is no
reason for this.

Postgres's type-checking abilities are sorely lacking to such a degree
that it not only hurts database performance unless the DBA writes every
damn application, but it forces any application to be postgres specific
due to all of the unnecessary casting.

Doesn't this bother anyone?  Isn't the query optimizer important enough to
be made as "smart" as possible, since it affects the performance of the
entire database?  Should a company hosting postgresql services really be
held ransom by customers who write "bad" queries because they don't know
every little quirk of the database?  Bueller?  McFly?

Can anyone explain this to me?  Anyone at all?  I mean, what logical
reason is there to leave the database so crippled?

--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| Shaun M. Thomas                INN Database Programmer              |
| Phone: (309) 743-0812          Fax  : (309) 743-0830                |
| Email: sthomas@townnews.com    AIM  : trifthen                      |
| Web  : www.townnews.com                                             |
|                                                                     |
|     "Most of our lives are about proving something, either to       |
|      ourselves or to someone else."                                 |
|                                           -- Anonymous              |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+



Re: Query planner isn't using my indices

От
Doug McNaught
Дата:
Shaun Thomas <sthomas@townnews.com> writes:

> On Tue, 8 Jan 2002, Stephan Szabo wrote:
>
> > See past discussions, you'll need to either explicitly cast the
> > constant to bigint or quote it, because otherwise it prematurely
> > casts the constant into an int4.
>
> But isn't that the point?  The fact that it casts it to int4 isn't the
> problem.  The fact that int4 and int8 *should* be compatible fields *is*.

[rant...]

> Can anyone explain this to me?  Anyone at all?  I mean, what logical
> reason is there to leave the database so crippled?

Finite developer resources?  Feel free to pitch in once the 7.3 cycle
starts.

FWIW, the following is on Bruce's TODO list at
http://developer.postgresql.org/todo.php:

  * Allow better handling of numeric constants, type conversion

There is a link there to a discussion about ways to improve the
situation.

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863

Re: Query planner isn't using my indices

От
Tom Lane
Дата:
Shaun Thomas <sthomas@townnews.com> writes:
> Can anyone explain this to me?  Anyone at all?  I mean, what logical
> reason is there to leave the database so crippled?

Feel free to step right up and fix it.  This is an open-source project.

Whining aside, the reason why these restrictions still exist is that
we have an extensible type system and so we don't really want to put
hard-wired knowledge about different types into the parser or optimizer.
The difficult part is to design a general-purpose, extensible framework
in which these desired conversions can be specified.

There have been prior discussions of how to do this (see the pghackers
archives); but no one's put forward a really satisfactory proposal yet.
If anyone had, it'd probably have been implemented by now --- there
isn't anyone here who likes the current behavior any more than you do.

            regards, tom lane

Re: Query planner isn't using my indices

От
Stephan Szabo
Дата:
On Wed, 9 Jan 2002, Shaun Thomas wrote:

> On Tue, 8 Jan 2002, Stephan Szabo wrote:
>
> > See past discussions, you'll need to either explicitly cast the
> > constant to bigint or quote it, because otherwise it prematurely
> > casts the constant into an int4.
>
> Can anyone explain this to me?  Anyone at all?  I mean, what logical
> reason is there to leave the database so crippled?

Perhaps you should see the past discussions (particularly on hackers)
of this topic.



Re: Query planner isn't using my indices

От
"Alaric B. Snell"
Дата:
On 9 Jan 2002, Doug McNaught wrote:

> Finite developer resources?  Feel free to pitch in once the 7.3 cycle
> starts.

Smile, everyone!

I'm going to pitch to my employer to let me fix it on company time, since
it's causing problems for us with JBoss (which generates most of the
queries for us, out of our control).

ABS

--
Alaric B. Snell, Developer
abs@frontwire.com



Re: Query planner isn't using my indices

От
Martijn van Oosterhout
Дата:
On Thu, Jan 10, 2002 at 12:18:33PM +0000, Alaric B. Snell wrote:
> On 9 Jan 2002, Doug McNaught wrote:
>
> > Finite developer resources?  Feel free to pitch in once the 7.3 cycle
> > starts.
>
> Smile, everyone!
>
> I'm going to pitch to my employer to let me fix it on company time, since
> it's causing problems for us with JBoss (which generates most of the
> queries for us, out of our control).

Interestingly, I find the easiest fix is: never use unquoted constants.

If it didn't break other peoples queries I'd suggest modifying the parser so
that numbers are parsed as strings of type unknown instead of numbers. Then
let the normal type matching system go to work on it.

That said, a good system for type promotion would be cool.
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> Terrorists can only take my life. Only my government can take my freedom.

Re: Query planner isn't using my indices

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Interestingly, I find the easiest fix is: never use unquoted constants.

> If it didn't break other peoples queries I'd suggest modifying the parser so
> that numbers are parsed as strings of type unknown instead of numbers.

Well, we don't want to lose the information that the input looked like a
number and not a string; that's a reasonable thing for the type
heuristics to use.  There was a proposal on the table to initially treat
numeric literals as type UNKNOWNNUMERIC and then have a resolution
mechanism for that, much like UNKNOWN but with the possible target types
restricted to the numeric category.  It still had some problems though.
One that I recall was that OIDs weren't accounted for; another was that
it's not real clear how to let the size of the value affect resolution
(eg, it's reasonable to resolve 123456 as int4 or int8, but not int2).

I think we had also talked about having the scanner initially resolve
a literal as the "smallest" possible containing type and then make sure
there are useful automatic up-conversions in place.  The trouble here
is that the numeric hierarchy isn't strictly ordered (think about float8
vs numeric; OID is also a fly in the ointment) so "smallest" isn't very
well defined.

Something I have been thinking about is that we may actually have too
many entries in pg_proc and pg_operator.  Earlier in this thread,
someone complained about this:

regression=# select 'foo'::char(3) || 'bar'::varchar;
ERROR:  Unable to identify an operator '||' for types 'character' and 'character varying'
        You will have to retype this query using an explicit cast

But it works fine if one of the operands is text:

regression=# select 'foo'::char(3) || 'bar'::text;
 ?column?
----------
 foobar
(1 row)

Why is that?  It's because we have too many || operators:

regression=# \do ||
                                    List of operators
 Name |   Left arg type   |  Right arg type   |    Result type    |      Description
------+-------------------+-------------------+-------------------+-------------
----------
 ||   | bit               | bit               | bit               | bitwise concatenation
 ||   | bytea             | bytea             | bytea             | concatenate
 ||   | character         | character         | character         | concatenate
 ||   | character varying | character varying | character varying | concatenate
 ||   | text              | text              | text              | concatenate
(5 rows)

There's no exact match for char || varchar, and the resolver can't figure
out whether to prefer char || char or varchar || varchar, so it punts.
OTOH text is the "preferred type" in the string category, so it will
select text || text if either operand is text.

IMHO the correct fix is to remove the entries for char || char and
varchar || varchar, so that there's only one textual || operator,
namely text || text.  This would mean the result was always considered
text and never char or varchar, but so what?  Those types are all
assignment-compatible anyway.

A related thought is that our numeric-hierarchy problems might be
greatly simplified if we eliminated most or all of the cross-datatype
numeric operators (eg, int4 plus int2) and insisted that these be
handled as a type coercion step plus a single-datatype operator.
In that way the resolver would be much less likely to be faced with
multiple alternatives that it couldn't choose among.

See the pghackers archives for past discussions if you want to
pursue this.

            regards, tom lane