Обсуждение: Strange discrepancy in query performance...
I'm hoping that someone on the list can help me understand an apparent discrepancy in the performance information that I'm collecting on a particularly troublesome query. The configuration: pg-8.2.4 on RHEL4. log_min_duration_statement = 1m. In my syslog output, I see entries indicating that the JDBC-driver-originated query on a table named 'city_summary' are taking upwards of 300 seconds: Oct 1 18:27:47 srv3 postgres-8.2[1625]: [12-1] LOG: duration: 307077.037 ms execute S_42: select * from city_summary where state = $1 and city_master_id = $2 and res_type = 'single_family' and date = $3 and range = 90 and zip = $4 and quartile = '__ALL' DETAIL: parameters: $1 = 'CA', $2 = '291', $3 = '2007-09-28', $4 = '__ALL' However, if I run the same query on the same host at the same time that the Java application is running, but from the psql command line, it takes only 0.37 seconds: > time /opt/postgres-8.2.4/bin/psql --port 54824 -U postgres -d altos_research -c 'select fact_id from city_summary where state = \'CA\' and city_master_id = 291 and zip = \'__ALL\' and quartile = \'__ALL\' and res_type = \'single_family\' and range = \'90\' and date = \'2007-09-28\';' fact_id ---------- 46624925 (1 row) 0.00user 0.00system 0:00.37elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+285minor)pagefaults 0swaps The output of 'explain' seems to indicate that the right index is being used: QUERY PLAN Index Scan using city_summary_pkey on city_summary (cost=0.00..12.27 rows=1 width=2721) Index Cond: ((date = '2007-09-28'::text) AND (state = 'CA'::text) AND (city_master_id = 334::bigint) AND (quartile = '__ALL'::text) AND (range = '90'::text)) Filter: ((zip = '__ALL'::text) AND ((res_type)::text = 'single_family'::text)) (3 rows) The index looks like this: # \d city_summary_pkey Index "public.city_summary_pkey" Column | Type --------------------+--------- date | text state | text city_master_id | bigint zip_master_id | integer res_type_master_id | bigint quartile | text range | text primary key, btree, for table "public.city_summary" Any ideas on why I am seeing such a big difference between the two measurements (JDBC/syslog vs. command line)? Thanks, Jason
"Jason L. Buberel" <jason@buberel.org> writes: > In my syslog output, I see entries indicating that the > JDBC-driver-originated query on a table named 'city_summary' are taking > upwards of 300 seconds: > Oct 1 18:27:47 srv3 postgres-8.2[1625]: [12-1] > LOG: duration: 307077.037 ms execute S_42: select * from city_summary > where state = $1 and city_master_id = $2 and res_type = > 'single_family' and date = $3 > and range = 90 and zip = $4 and quartile = '__ALL' > DETAIL: parameters: $1 = 'CA', $2 = '291', $3 = '2007-09-28', $4 = '__ALL' > However, if I run the same query on the same host at the same time that > the Java application is running, but from the psql command line, it > takes only 0.37 seconds: >>> time /opt/postgres-8.2.4/bin/psql --port 54824 -U postgres -d > altos_research -c 'select fact_id from city_summary where state = > \'CA\' and city_master_id = 291 and zip = \'__ALL\' and quartile = > \'__ALL\' and res_type = \'single_family\' and range = \'90\' and date = > \'2007-09-28\';' This is not, in fact, the same query --- the JDBC-originated one is parameterized, which means it very possibly has a different plan (since the planner doesn't know the particular values to plan for). Try using PREPARE and EXPLAIN EXECUTE to examine the plan that is being produced for the parameterized query. regards, tom lane
Tom-right-as-usual:
Yep - you were right about the query plan for the prepared statement (a sequential scan of the table) differed a bit from the directly-executed version :)
For reference, when using JasperReports .jrxml files as the basis for the query, I only had to do to the following to 'force' postgres to treat the jasper report parameter as a number and not text, thereby allowing the correct index to be used:
select * from city summary where city_master_id = $P{city_master_id}::bigint ...
Query times went from 300+ seconds back down to ~100ms.
-jason
Tom Lane wrote:
Yep - you were right about the query plan for the prepared statement (a sequential scan of the table) differed a bit from the directly-executed version :)
For reference, when using JasperReports .jrxml files as the basis for the query, I only had to do to the following to 'force' postgres to treat the jasper report parameter as a number and not text, thereby allowing the correct index to be used:
select * from city summary where city_master_id = $P{city_master_id}::bigint ...
Query times went from 300+ seconds back down to ~100ms.
-jason
Tom Lane wrote:
"Jason L. Buberel" <jason@buberel.org> writes:In my syslog output, I see entries indicating that the JDBC-driver-originated query on a table named 'city_summary' are taking upwards of 300 seconds:Oct 1 18:27:47 srv3 postgres-8.2[1625]: [12-1] LOG: duration: 307077.037 ms execute S_42: select * from city_summary where state = $1 and city_master_id = $2 and res_type = 'single_family' and date = $3 and range = 90 and zip = $4 and quartile = '__ALL' DETAIL: parameters: $1 = 'CA', $2 = '291', $3 = '2007-09-28', $4 = '__ALL'However, if I run the same query on the same host at the same time that the Java application is running, but from the psql command line, it takes only 0.37 seconds:time /opt/postgres-8.2.4/bin/psql --port 54824 -U postgres -daltos_research -c 'select fact_id from city_summary where state = \'CA\' and city_master_id = 291 and zip = \'__ALL\' and quartile = \'__ALL\' and res_type = \'single_family\' and range = \'90\' and date = \'2007-09-28\';'This is not, in fact, the same query --- the JDBC-originated one is parameterized, which means it very possibly has a different plan (since the planner doesn't know the particular values to plan for). Try using PREPARE and EXPLAIN EXECUTE to examine the plan that is being produced for the parameterized query. regards, tom lane
"Jason L. Buberel" <jason@buberel.org> writes: > For reference, when using JasperReports .jrxml files as the basis for > the query, I only had to do to the following to 'force' postgres to > treat the jasper report parameter as a number and not text, thereby > allowing the correct index to be used: > select * from city summary where city_master_id = > $P{city_master_id}::bigint ... > Query times went from 300+ seconds back down to ~100ms. Hmm ... if Postgres were just given the parameter symbol with no type information, I believe it would have assumed it was bigint (or in general, the same type as what it's being compared to). So your problem suggests that Jasper is deliberately telling the backend that that parameter is of type text. If that's coming from something you did in your code, you probably ought to change the code. If not, it seems like a bug/omission in Jasper. regards, tom lane
I agree that this is a bug in JasperReports. I've been stepping throgh their code to determine where the paramter type is set to 'java.lang.String', but have not yet figured out how their Java API will allow me to override that with 'java.lang.Integer' or something more appropriate.
If I figure something out, I'll post to the list.
Regards,
jason
Tom Lane wrote:
If I figure something out, I'll post to the list.
Regards,
jason
Tom Lane wrote:
Hmm ... if Postgres were just given the parameter symbol with no type information, I believe it would have assumed it was bigint (or in general, the same type as what it's being compared to). So your problem suggests that Jasper is deliberately telling the backend that that parameter is of type text. If that's coming from something you did in your code, you probably ought to change the code. If not, it seems like a bug/omission in Jasper. regards, tom lane