Обсуждение: Index usage with functions in where condition
I'm having trouble getting the query planner to use indexes. The situation occurs when writing a query that uses functionsfor defining the parameters for the conditions on the indexed columns. The system I'm running is Windows Server2003, using version 8.4.2 of PostgreSQL. This is the following table that I'm running my query against: CREATE TABLE crs_coordinate ( id integer NOT NULL, nod_id integer NOT NULL, value1 numeric(22,12), value2 numeric(22,12), CONSTRAINT crs_coordinate_pkey PRIMARY KEY (id) ) WITH ( OIDS=FALSE ); CREATE INDEX coo_value1 ON crs_coordinate USING btree (value1); CREATE INDEX coo_value2 ON crs_coordinate USING btree (value2); This table has 23 million rows in it and was analysed just before planning my queries. This is the query that does not use the indexes: SELECT coo.nod_id, 6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and 175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618))); Seq Scan on crs_coordinate coo (cost=0.00..1039607.49 rows=592 width=28) Filter: (((value1)::double precision >= (-41.0725429586587)::double precision) AND ((value1)::double precision <= (-41.0510570413413)::doubleprecision) AND ((value2)::double precision >= 175.570362072701::double precision) AND ((value2)::doubleprecision <= 175.598857927299::double precision)) However if I pre-evaluated the parameters for the where condition on the value1 and value2 columns, the planner chooses touse the indexes: SELECT coo.nod_id, 6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance FROM crs_coordinate coo WHERE coo.value1 BETWEEN -41.07254296 AND -41.05105704 AND coo.value2 BETWEEN 175.57036207 AND 175.59885792; Bitmap Heap Scan on crs_coordinate coo (cost=5299.61..6705.41 rows=356 width=28) Recheck Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704)) AND (value2 >= 175.57036207) AND (value2 <= 175.59885792)) -> BitmapAnd (cost=5299.61..5299.61 rows=356 width=0) -> Bitmap Index Scan on coo_value1 (cost=0.00..1401.12 rows=54923 width=0) Index Cond: ((value1 >= (-41.07254296)) AND (value1 <= (-41.05105704))) -> Bitmap Index Scan on coo_value2 (cost=0.00..3898.06 rows=153417 width=0) Index Cond: ((value2 >= 175.57036207) AND (value2 <= 175.59885792)) So why is the first query not using the indexes on the value1 and value2 columns? I'm assuming that both the COS and RAIDIANSfunctions are STRICT IMMUTABLE, so logically the evaluation of these functions in the where clause should be inlined.Looking at the query plan this inlining does seem to be happening... At this stage I have a work around by putting the query into a plpgsql function and using dynamic SQL. But it is still frustratingwhy the planner seems to be working in a far from optimal fashion. Can anyone shed some light on this for me? Thanks, Jeremy ______________________________________________________________________________________________________ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or info@linz.govt.nz) and destroythe original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. ______________________________________________________________________________________________________
Jeremy Palmer <JPalmer@linz.govt.nz> writes: > This is the query that does not use the indexes: > SELECT > coo.nod_id, > 6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance > FROM > crs_coordinate coo > WHERE > coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and > coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and 175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618))); Those expressions yield float8, not numeric, and numeric vs float8 isn't an indexable operator for reasons we needn't get into here. You should probably rethink whether numeric is really the best choice of datatype for your columns, if this is the sort of value you expect to work with --- you're paying a considerable price in speed and space for perhaps-illusory precision gains. But if you insist on using numeric then the solution is to cast the expression results to numeric explicitly. BTW I wonder whether you ought not be looking into postgis rather than rolling-your-own coordinate arithmetic ... regards, tom lane
Hi Tom, Thanks for the help - much appreciated. Yes I'm using PostGIS, and with a simply join to a relating table I could get access to the geometry for these point positions.Is using the GIST r-tree index faster than using the 2 b-tree indexes on the lat and long values? I guess thisis a question for the PostGIS guys and a quick test could tell me anyway! My memory is that the GIST r-tree index isslow for points at the moment, and that a good implementation of a kd-tree index over GIST is required for better speed. Regards, Jeremy Palmer Geodetic Surveyor National Geodetic Office Land Information New Zealand | Toitu te whenua 160 Lambton Quay | Private Box 5501 | Wellington 6145 DDI: 64 (0)4 498 3537 | Fax: 64 (0)4 498 3837 | www.linz.govt.nz -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Saturday, 10 July 2010 11:20 a.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Index usage with functions in where condition Jeremy Palmer <JPalmer@linz.govt.nz> writes: > This is the query that does not use the indexes: > SELECT > coo.nod_id, > 6400000*radians(sqrt((coo.value1 - -41.0618)^2+((coo.value2 - 175.58461)*cos(radians(-41.0618)))^2)) as distance > FROM > crs_coordinate coo > WHERE > coo.value1 between -41.0618-degrees(1200.0/6400000.0) and -41.0618+degrees(1200.0/6400000.0) and > coo.value2 between 175.58461-degrees(1200.0/6400000.0)/(cos(radians(-41.0618))) and 175.58461+degrees(1200.0/6400000.0)/(cos(radians(-41.0618))); Those expressions yield float8, not numeric, and numeric vs float8 isn't an indexable operator for reasons we needn't get into here. You should probably rethink whether numeric is really the best choice of datatype for your columns, if this is the sort of value you expect to work with --- you're paying a considerable price in speed and space for perhaps-illusory precision gains. But if you insist on using numeric then the solution is to cast the expression results to numeric explicitly. BTW I wonder whether you ought not be looking into postgis rather than rolling-your-own coordinate arithmetic ... regards, tom lane ______________________________________________________________________________________________________ This message contains information, which is confidential and may be subject to legal privilege. If you are not the intended recipient, you must not peruse, use, disseminate, distribute or copy this message. If you have received this message in error, please notify us immediately (Phone 0800 665 463 or info@linz.govt.nz) and destroythe original message. LINZ accepts no responsibility for changes to this email, or for any attachments, after its transmission from LINZ. Thank you. ______________________________________________________________________________________________________