Обсуждение: optimizer
Hi, I have a table named Domain that has 14M rows, here is the defination: xdap_regr=# \d domain Table "domain" Attribute | Type | Modifier -----------------+-------+---------- domainhandle | text | not null domainname | text | not null parentdomain | text | not null holdername | text | street | text | city | text | state | text | postalcode | text | country | text | Indices: idx_domain_domainname, idx_domain_holdername, idx_domain_parentdoamin, pk_domain An index on holdname has been created and the table analized. xdap_regr=# explain select domainname from domain where holdername = 'NAMEZERO.COM'; NOTICE: QUERY PLAN: Seq Scan on domain (cost=0.00..483192.89 rows=1532369 width=12) EXPLAIN Question: Why optimizer doesn't use index scan? below is some data of holdername looks like: xdap_regr=# select holdername form domain limit 10; holdername --------------------------------- SE Electronic Control Sys Inc STONE J. MELET (ASKFORFREE-DOM) DAVID GUERRA A Taste of Lake County DONAHUE SECURITIES tsui, godwin OECONSULT JERRY GRIGORY PHOTOGRAPHY NAMEZERO.COM Carla Cross Seminars, Inc. (10 rows) Can we force optimizer to index scan? Thanks a lot! Anna Zhang
On Wed, 27 Feb 2002, Zhang, Anna wrote: > Hi, > I have a table named Domain that has 14M rows, here is the defination: > > xdap_regr=# \d domain > Table "domain" > Attribute | Type | Modifier > -----------------+-------+---------- > domainhandle | text | not null > domainname | text | not null > parentdomain | text | not null > holdername | text | > street | text | > city | text | > state | text | > postalcode | text | > country | text | > Indices: idx_domain_domainname, > idx_domain_holdername, > idx_domain_parentdoamin, > pk_domain > > An index on holdname has been created and the table analized. > > xdap_regr=# explain select domainname from domain where holdername = > 'NAMEZERO.COM'; > NOTICE: QUERY PLAN: > > Seq Scan on domain (cost=0.00..483192.89 rows=1532369 width=12) Is the estimate above (1.5M rows) reasonable? If so, it's probably doing the right thing. If not, what version are you using and are there any very common values that may throw off the estimates; what does select * from pg_statistic where starelid=(select oid from pg_class where relname='domain'); give? > Can we force optimizer to index scan? set enable_seqscan=off;
>Is the estimate above (1.5M rows) reasonable? If so, it's probably >doing the right thing. If not, what version are you using and are >there any very common values that may throw off the estimates; what >does select * from pg_statistic where starelid=(select oid from >pg_class where relname='domain'); give? select * from pg_statistic where starelid=(select oid from pg_class where relname='domain'); starelid | staattnum | staop | stanullfrac | stacommonfrac | stacommonval | staloval | stahival -----------+-----------+-------+-------------+---------------+-------------- --------------+------------------------------+------------------------------ ----------------------------------- 749413081 | 1 | 664 | 0 | 7.02145e-08 | UPSPWR-DOM | 000000000000000000000-N3-DOM | ZZZZZZZZZZZZZZZZZZZZZ9-DOM 749413081 | 2 | 664 | 0 | 7.02145e-08 | UPSPWR | 00 | ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ 749413081 | 3 | 664 | 0 | 0.728349 | COM | ARPA | WS 749413081 | 4 | 664 | 0 | 0.107595 | NAMEZERO.COM | ` | ambassador blinds 749413081 | 5 | 664 | 0 | 0.107594 | 51 University Ave, Suite K | - | zzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzz, zzzzzzzzzzz 325698 749413081 | 6 | 664 | 0.312378 | 0.107716 | LOS GATOS | - | Murray 749413081 | 7 | 664 | 0.312378 | 0.227842 | CA | AA | WY 749413081 | 8 | 664 | 0.312119 | 0.107774 | 95030 | 00005 | 99995 749413081 | 9 | 664 | 0.00980173 | 0.687646 | US | AC | ZW (9 rows) The column holdername is not common, same holdername may have a few records. I am running postgres 7.2, for pg_statistic table I really no idea what each column means, if you get some clue from above query result please give me an advise. Thanks for reply. Anna Zhang
On Wed, 27 Feb 2002, Zhang, Anna wrote: > > >Is the estimate above (1.5M rows) reasonable? If so, it's probably > >doing the right thing. If not, what version are you using and are > >there any very common values that may throw off the estimates; what > >does select * from pg_statistic where starelid=(select oid from > >pg_class where relname='domain'); give? Okay, does running it after set enable_seqscan=false; actually run faster than the sequence scan version? > select * from pg_statistic where starelid=(select oid from pg_class where > relname='domain'); > > starelid | staattnum | staop | stanullfrac | stacommonfrac | > stacommonval | staloval | > > stahival > -----------+-----------+-------+-------------+---------------+-------------- > --------------+------------------------------+------------------------------ > ----------------------------------- > 749413081 | 1 | 664 | 0 | 7.02145e-08 | UPSPWR-DOM > > | 000000000000000000000-N3-DOM | ZZZZZZZZZZZZZZZZZZZZZ9-DOM > 749413081 | 2 | 664 | 0 | 7.02145e-08 | UPSPWR > > | 00 | > ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ > 749413081 | 3 | 664 | 0 | 0.728349 | COM > > | ARPA | WS > 749413081 | 4 | 664 | 0 | 0.107595 | NAMEZERO.COM > > | ` | ambassador blinds > 749413081 | 5 | 664 | 0 | 0.107594 | 51 University > Ave, Suite K | - | zzzzzzzzzzzzzzzzzz > zzzzzzzzzzzzzzzz, zzzzzzzzzzz 325698 > 749413081 | 6 | 664 | 0.312378 | 0.107716 | LOS GATOS > > | - | Murray > 749413081 | 7 | 664 | 0.312378 | 0.227842 | CA > > | AA | WY > 749413081 | 8 | 664 | 0.312119 | 0.107774 | 95030 > > | 00005 | 99995 > 749413081 | 9 | 664 | 0.00980173 | 0.687646 | US > > | AC | ZW > (9 rows) > > The column holdername is not common, same holdername may have a few records. > I am running postgres 7.2, for pg_statistic table I really no idea what each Are you *sure* this is a 7.2 server? The above looks like the form from 7.1 and earlier. In any case, it looks like NAMEZERO.COM is the most common value with about 10% of the table, so I don't think sequence scan is a bad plan necessarily.