Обсуждение: Poor Performance on Postgres 8.0
Hi Folks , I am running this query on postgres 8 beta version and it is not using the right index, where as if i run the same query on postgres 7.4 version it uses the right index . Here are the explain analyze output for both the versions. can anyone explain this ? tks. tables: attribute table has 200k records, string table has 190 records \d common.attribute Table "common.attribute" Column | Type | Modifiers ----------------+-----------------------------+------------------------------------------------------- attributeid | integer | not null default nextval('COMMON.ATTRIBUTESEQ'::text) fknamestringid | integer | not null stringvalue | text | integervalue | integer | numericvalue | numeric(14,2) | datevalue | timestamp without time zone | booleanvalue | boolean | bigstringvalue | text | Indexes: "pk_attribute_attributeid" primary key, btree (attributeid) "uk_attribute_fkstringid_stringvalue_integervalue_numericvalue_d" unique, btree (fknamestringid, stringvalue, integervalue, numericvalue, datevalue) "idx_attribute_fknamestringid" btree (fknamestringid) Foreign-key constraints: "fk_attribute_string" FOREIGN KEY (fknamestringid) REFERENCES common.string(stringid) \d common.string Table "common.string" Column | Type | Modifiers ----------+---------+---------------------------------------------------- stringid | integer | not null default nextval('COMMON.STRINGSEQ'::text) value | text | Indexes: "pk_string_stringid" primary key, btree (stringid) Query select attribute0_.attributeid as attribut1_, attribute0_.stringvalue as stringva2_, attribute0_.bigStringvalue as bigStrin3_, attribute0_.integervalue as integerv4_, attribute0_.numericvalue as numericv5_, attribute0_.datevalue as datevalue, attribute0_.booleanvalue as booleanv7_, attribute0_.fknamestringid as fknamest8_ from common.attribute attribute0_, common.string text1_ where (text1_.value='squareFeet' and attribute0_.fknamestringid=text1_.stringid) and (numericValue='775.0') Explain Analyze from 7.4 QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..501.96 rows=1 width=100) (actual time=127.420..135.914 rows=1 loops=1) -> Seq Scan on string text1_ (cost=0.00..12.31 rows=2 width=4) (actual time=68.421..68.466 rows=1 loops=1) Filter: (value = 'squareFeet'::text) -> Index Scan using idx_attribute_fknamestringid on attribute attribute0_ (cost=0.00..244.81 rows=1 width=100) (actual time=58.963..67.406 rows=1 loops=1) Index Cond: (attribute0_.fknamestringid = "outer".stringid) Filter: (numericvalue = 775.0) Total runtime: 136.056 ms Explain Analyze from 8 beta QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..5440.85 rows=1 width=109) (actual time=27.313..440.469 rows=1 loops=1) -> Seq Scan on attribute attribute0_ (cost=0.00..5437.82 rows=1 width=109) (actual time=26.987..440.053 rows=2 loops=1) Filter: (numericvalue = 775.0) -> Index Scan using pk_string_stringid on string text1_ (cost=0.00..3.02 rows=1 width=4) (actual time=0.169..0.172 rows=0 loops=2) Index Cond: ("outer".fknamestringid = text1_.stringid) Filter: (value = 'squareFeet'::text) Total runtime: 440.648 ms
Pallav Kalva <pkalva@deg.cc> writes: > I am running this query on postgres 8 beta version and it is not > using the right index, where as if i run the same query on postgres 7.4 > version it uses the right index . 1. Beta which, exactly? 2. Have you ANALYZEd both tables lately? 3. If so, try this to see what it thinks the cost of the reverse plan is: begin; alter table common.string drop constraint pk_string_stringid; explain analyze ... same query ... rollback; regards, tom lane
Tom Lane wrote: >Pallav Kalva <pkalva@deg.cc> writes: > > >> I am running this query on postgres 8 beta version and it is not >>using the right index, where as if i run the same query on postgres 7.4 >>version it uses the right index . >> >> > >1. Beta which, exactly? > Beta 4 > >2. Have you ANALYZEd both tables lately? > Yes > >3. If so, try this to see what it thinks the cost of the reverse plan >is: > > begin; > alter table common.string drop constraint pk_string_stringid; > explain analyze ... same query ... > rollback; > what do u mean by rollback exactly ? i can drop the pk constraint and run explain analyze and see how it behaves. > > regards, tom lane > > >
Pallav Kalva <pkalva@deg.cc> writes: >> begin; >> alter table common.string drop constraint pk_string_stringid; >> explain analyze ... same query ... >> rollback; >> > what do u mean by rollback exactly ? i can drop the pk constraint > and run explain analyze and see how it behaves. The point of the rollback is that you don't really make the pk constraint go away. It is gone from the perspective of the EXPLAIN, but after you rollback it's back again. Easier than rebuilding it... regards, tom lane
Hi Tom, I dropped the primary key constraint and ran the explain analyze on the same query and here is what i get seq scans on both the tables , still doesnt make use of the index on common.attribute table . QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..5609.19 rows=1 width=104) (actual time=11.875..319.358 rows=1 loops=1) Join Filter: ("outer".fknamestringid = "inner".stringid) -> Seq Scan on attribute attribute0_ (cost=0.00..5604.76 rows=1 width=104) (actual time=11.541..318.649 rows=2 loops=1) Filter: (numericvalue = 775.0) -> Seq Scan on string text1_ (cost=0.00..4.41 rows=1 width=4) (actual time=0.277..0.319 rows=1 loops=2) Filter: (value = 'squareFeet'::text) Total runtime: 319.496 ms Tom Lane wrote: >Pallav Kalva <pkalva@deg.cc> writes: > > >>>begin; >>>alter table common.string drop constraint pk_string_stringid; >>>explain analyze ... same query ... >>>rollback; >>> >>> >>> >> what do u mean by rollback exactly ? i can drop the pk constraint >>and run explain analyze and see how it behaves. >> >> > >The point of the rollback is that you don't really make the pk >constraint go away. It is gone from the perspective of the EXPLAIN, >but after you rollback it's back again. Easier than rebuilding it... > > regards, tom lane > > >
Pallav Kalva <pkalva@deg.cc> writes: > still doesnt make use of the index on common.attribute table . What do you get from just plain explain analyze select * from common.string text1_ where text1_.value='squareFeet'; I get the impression that it must think this will yield a lot of rows. regards, tom lane
explain analyze select * from common.string text1_ where text1_.value='squareFeet'; QUERY PLAN -------------------------------------------------------------------------------------------------------- Seq Scan on string text1_ (cost=0.00..4.41 rows=1 width=21) (actual time=0.283..0.322 rows=1 loops=1) Filter: (value = 'squareFeet'::text) Total runtime: 0.492 ms I am not worried about this table as common.string has only 190 records, where as the other table common.attribute which is very big (200k records) i want it to use index scan on it . The matching column in common.attribute table has only 175 distinct records in common.attribute table , do you think that's the problem ? here is the full query again select attribute0_.attributeid as attribut1_, attribute0_.stringvalue as stringva2_, attribute0_.bigStringvalue as bigStrin3_, attribute0_.integervalue as integerv4_, attribute0_.numericvalue as numericv5_, attribute0_.datevalue as datevalue, attribute0_.booleanvalue as booleanv7_, attribute0_.fknamestringid as fknamest8_ from common.attribute attribute0_, common.string text1_ where (text1_.value='squareFeet' and attribute0_.fknamestringid=text1_.stringid) and (numericValue='775.0') Tom Lane wrote: >Pallav Kalva <pkalva@deg.cc> writes: > > >>still doesnt make use of the index on common.attribute table . >> >> > >What do you get from just plain > >explain analyze select * from common.string text1_ >where text1_.value='squareFeet'; > >I get the impression that it must think this will yield a lot of rows. > > regards, tom lane > > >
I was able to duplicate this behavior with dummy data that had only a few distinct values for fknamestringid --- the planner then thinks that the index probe into attribute will match a lot of rows and hence take a long time. Could we see your pg_stats row for fknamestringid, ie select * from pg_stats where tablename = 'attribute' and attname = 'fknamestringid'; It would be interesting to see the same for your 7.4 installation too. regards, tom lane
On 7.4 I get select * from pg_stats where tablename = 'attribute' and attname = 'fknamestringid'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ------------+-----------+----------------+-----------+-----------+------------+-----------------------------------------------------+-------------------------------------------------------------------------------------+----------------------------------------------------------+------------- common | attribute | fknamestringid | 0 | 4 | 124 | {2524,2434,2523,2599,2595,2592,2596,2528,2586,2446} | {0.132333,0.13,0.0766667,0.0373333,0.0366667,0.0333333,0.031,0.029,0.0263333,0.019} | {2433,2441,2455,2462,2473,2479,2484,2492,2505,2574,2598} | -0.22864 (1 row) On 8 select * from pg_stats where tablename = 'attribute' and attname = 'fknamestringid'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ------------+-----------+----------------+-----------+-----------+------------+-----------------------------------------------------+-----------------------------------------------------------------------------------+----------------------------------------------------------+------------- common | attribute | fknamestringid | 0 | 4 | 80 | {2524,2434,2530,2522,2525,2523,2527,2526,2574,2531} | {0.219333,0.199333,0.076,0.0643333,0.0616667,0.05,0.0453333,0.042,0.04,0.0286667} | {2437,2528,2529,2538,2539,2540,2554,2562,2575,2584,2637} | 0.0274016 Tom Lane wrote: >I was able to duplicate this behavior with dummy data that had only a >few distinct values for fknamestringid --- the planner then thinks that >the index probe into attribute will match a lot of rows and hence take a >long time. Could we see your pg_stats row for fknamestringid, ie > >select * from pg_stats >where tablename = 'attribute' and attname = 'fknamestringid'; > >It would be interesting to see the same for your 7.4 installation too. > > regards, tom lane > > >
Pallav Kalva <pkalva@deg.cc> writes: > On 8 > common | attribute | fknamestringid | 0 | 4 > | 80 | {2524,2434,2530,2522,2525,2523,2527,2526,2574,2531} | > {0.219333,0.199333,0.076,0.0643333,0.0616667,0.05,0.0453333,0.042,0.04,0.0286667} > | {2437,2528,2529,2538,2539,2540,2554,2562,2575,2584,2637} | 0.0274016 Given those stats, the planner is going to estimate that about 1/80th of the attribute table matches any particular fknamestringid, and that's what's driving it away from using the indexscan. I cannot tell whether there are indeed a couple of thousand rows joining to the 'squareFeet' string row (in which case the condition numericValue='775.0' must be really selective) or whether this is an outlier case that joins to just a few attribute rows. The slightly different stats values for 7.4 would have given it a slightly lower value for the cost of an indexscan by idx_attribute_fknamestringid, but certainly not as low as your original message shows. Perhaps you have some difference in parameter settings in your 7.4 installation --- most likely a lower random_page_cost. regards, tom lane
The random_page_cost value is same on both the versions, the only thing difference between 7.4 version and 8 version is that 7.4 ver has 100k less records. For, now i created index on numericvalue column on attribute table and it used that index and it is much faster that way. it came down to 24msec. Also, i tried to see the matching id for squarefeet in attribute table there are 800 some records in attribute table for 8 version and 700 something in 7.4 version. Tom Lane wrote: >Pallav Kalva <pkalva@deg.cc> writes: > > >>On 8 >> common | attribute | fknamestringid | 0 | 4 >>| 80 | {2524,2434,2530,2522,2525,2523,2527,2526,2574,2531} | >>{0.219333,0.199333,0.076,0.0643333,0.0616667,0.05,0.0453333,0.042,0.04,0.0286667} >>| {2437,2528,2529,2538,2539,2540,2554,2562,2575,2584,2637} | 0.0274016 >> >> > >Given those stats, the planner is going to estimate that about 1/80th of >the attribute table matches any particular fknamestringid, and that's >what's driving it away from using the indexscan. I cannot tell whether >there are indeed a couple of thousand rows joining to the 'squareFeet' >string row (in which case the condition numericValue='775.0' must be >really selective) or whether this is an outlier case that joins to just >a few attribute rows. > >The slightly different stats values for 7.4 would have given it a >slightly lower value for the cost of an indexscan by >idx_attribute_fknamestringid, but certainly not as low as your original >message shows. Perhaps you have some difference in parameter settings >in your 7.4 installation --- most likely a lower random_page_cost. > > regards, tom lane > >---------------------------(end of broadcast)--------------------------- >TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > >