Обсуждение: Index usage with functions in where condition

Поиск
Список
Период
Сортировка

Index usage with functions in where condition

От
Jeremy Palmer
Дата:
I'm having trouble getting the query planner to use indexes. The situation occurs when writing a query that uses
functionsfor defining the parameters for the conditions on the indexed columns. The system I'm running is Windows
Server2003, using version 8.4.2 of PostgreSQL. 

This is the following table that I'm running my query against:

CREATE TABLE crs_coordinate
(
  id integer NOT NULL,
  nod_id integer NOT NULL,
  value1 numeric(22,12),
  value2 numeric(22,12),
  CONSTRAINT crs_coordinate_pkey PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX coo_value1 ON crs_coordinate USING btree (value1);
CREATE INDEX coo_value2 ON crs_coordinate USING btree (value2);

This table has 23 million rows in it and was analysed just before planning my queries.

This is the query that does not use the indexes:

SELECT
  coo.nod_id,
  6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance
FROM
  crs_coordinate coo
WHERE
  coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and
  coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and
175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618)));

Seq Scan on crs_coordinate coo  (cost=0.00..1039607.49 rows=592 width=28)
  Filter: (((value1)::double precision >= (-41.0725429586587)::double precision) AND ((value1)::double precision <=
(-41.0510570413413)::doubleprecision) AND ((value2)::double precision >= 175.570362072701::double precision) AND
((value2)::doubleprecision <= 175.598857927299::double precision)) 

However if I pre-evaluated the parameters for the where condition on the value1 and value2 columns, the planner chooses
touse the indexes: 

SELECT
  coo.nod_id,
  6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance
FROM
  crs_coordinate coo
WHERE
  coo.value1 BETWEEN -41.07254296 AND -41.05105704 AND
  coo.value2 BETWEEN 175.57036207 AND 175.59885792;

Bitmap Heap Scan on crs_coordinate coo  (cost=5299.61..6705.41 rows=356 width=28)
  Recheck Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704)) AND (value2 >= 175.57036207) AND (value2 <=
175.59885792))
  ->  BitmapAnd  (cost=5299.61..5299.61 rows=356 width=0)
        ->  Bitmap Index Scan on coo_value1  (cost=0.00..1401.12 rows=54923 width=0)
              Index Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704)))
        ->  Bitmap Index Scan on coo_value2  (cost=0.00..3898.06 rows=153417 width=0)
              Index Cond: ((value2 >= 175.57036207) AND (value2 <= 175.59885792))

So why is the first query not using the indexes on the value1 and value2 columns? I'm assuming that both the COS and
RAIDIANSfunctions are STRICT IMMUTABLE, so logically the evaluation of these functions in the where clause should be
inlined.Looking at the query plan this inlining does seem to be happening... 

At this stage I have a work around by putting the query into a plpgsql function and using dynamic SQL. But it is still
frustratingwhy the planner seems to be working in a far from optimal fashion. Can anyone shed some light on this for
me?

Thanks,
Jeremy

______________________________________________________________________________________________________

This message contains information, which is confidential and may be subject to legal privilege.
If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 0800 665 463 or info@linz.govt.nz) and
destroythe original message. 
LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ.

Thank you.
______________________________________________________________________________________________________

Re: Index usage with functions in where condition

От
Tom Lane
Дата:
Jeremy Palmer <JPalmer@linz.govt.nz> writes:
> This is the query that does not use the indexes:

> SELECT
>   coo.nod_id,
>   6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance
> FROM
>   crs_coordinate coo
> WHERE
>   coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and
>   coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and
175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618)));

Those expressions yield float8, not numeric, and numeric vs float8 isn't
an indexable operator for reasons we needn't get into here.  You should
probably rethink whether numeric is really the best choice of datatype
for your columns, if this is the sort of value you expect to work with
--- you're paying a considerable price in speed and space for
perhaps-illusory precision gains.  But if you insist on using numeric
then the solution is to cast the expression results to numeric
explicitly.

BTW I wonder whether you ought not be looking into postgis rather than
rolling-your-own coordinate arithmetic ...

            regards, tom lane

Re: Index usage with functions in where condition

От
Jeremy Palmer
Дата:
Hi Tom,

Thanks for the help - much appreciated.

Yes I'm using PostGIS, and with a simply join to a relating table I could get access to the geometry for these point
positions.Is using the GIST r-tree index faster than using the 2 b-tree indexes on the lat and long values? I guess
thisis a question for the PostGIS guys and a quick test could tell me anyway! My memory is that the GIST r-tree index
isslow for points at the moment, and that a good implementation of a kd-tree index over GIST is required for better
speed.

Regards,

Jeremy Palmer
Geodetic Surveyor
National Geodetic Office

Land Information New Zealand | Toitu te whenua
160 Lambton Quay | Private Box 5501 | Wellington 6145

DDI: 64 (0)4 498 3537 | Fax: 64 (0)4 498 3837 | www.linz.govt.nz


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Saturday, 10 July 2010 11:20 a.m.
To: Jeremy Palmer
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Index usage with functions in where condition

Jeremy Palmer <JPalmer@linz.govt.nz> writes:
> This is the query that does not use the indexes:

> SELECT
>   coo.nod_id,
>   6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance
> FROM
>   crs_coordinate coo
> WHERE
>   coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and
>   coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and
175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618)));

Those expressions yield float8, not numeric, and numeric vs float8 isn't
an indexable operator for reasons we needn't get into here.  You should
probably rethink whether numeric is really the best choice of datatype
for your columns, if this is the sort of value you expect to work with
--- you're paying a considerable price in speed and space for
perhaps-illusory precision gains.  But if you insist on using numeric
then the solution is to cast the expression results to numeric
explicitly.

BTW I wonder whether you ought not be looking into postgis rather than
rolling-your-own coordinate arithmetic ...

            regards, tom lane
______________________________________________________________________________________________________

This message contains information, which is confidential and may be subject to legal privilege.
If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 0800 665 463 or info@linz.govt.nz) and
destroythe original message. 
LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ.

Thank you.
______________________________________________________________________________________________________