Обсуждение: optimizer

Поиск
Список
Период
Сортировка

optimizer

От
"Zhang, Anna"
Дата:
Hi,
I have a table named Domain that has 14M rows, here is the defination:

xdap_regr=# \d domain
         Table "domain"
  Attribute   | Type | Modifier
-----------------+-------+----------
domainhandle     | text |     not null
domainname       | text |     not null
parentdomain     | text |     not null
holdername       | text |
street           | text |
city            | text |
state            | text |
postalcode       | text |
country          | text |
Indices: idx_domain_domainname,
            idx_domain_holdername,
            idx_domain_parentdoamin,
            pk_domain

An index on holdname has been created and the table analized.

xdap_regr=# explain select domainname from domain where holdername =
'NAMEZERO.COM';
NOTICE:  QUERY PLAN:

Seq Scan on domain  (cost=0.00..483192.89 rows=1532369 width=12)

EXPLAIN

Question: Why optimizer doesn't use index scan?  below is some data of
holdername looks like:

xdap_regr=# select holdername form domain limit 10;

           holdername
---------------------------------
SE Electronic Control Sys Inc
STONE J. MELET (ASKFORFREE-DOM)
DAVID GUERRA
A Taste of Lake County
DONAHUE SECURITIES
tsui, godwin
OECONSULT
JERRY GRIGORY PHOTOGRAPHY
NAMEZERO.COM
Carla Cross Seminars, Inc.
(10 rows)

Can we force optimizer to index scan?

Thanks a lot!


Anna Zhang

Re: optimizer

От
Stephan Szabo
Дата:
On Wed, 27 Feb 2002, Zhang, Anna wrote:

> Hi,
> I have a table named Domain that has 14M rows, here is the defination:
>
> xdap_regr=# \d domain
>          Table "domain"
>   Attribute   | Type | Modifier
> -----------------+-------+----------
> domainhandle     | text |     not null
> domainname       | text |     not null
> parentdomain     | text |     not null
> holdername       | text |
> street           | text |
> city            | text |
> state            | text |
> postalcode       | text |
> country          | text |
> Indices: idx_domain_domainname,
>             idx_domain_holdername,
>             idx_domain_parentdoamin,
>             pk_domain
>
> An index on holdname has been created and the table analized.
>
> xdap_regr=# explain select domainname from domain where holdername =
> 'NAMEZERO.COM';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on domain  (cost=0.00..483192.89 rows=1532369 width=12)

Is the estimate above (1.5M rows) reasonable?  If so, it's probably
doing the right thing.  If not, what version are you using and are
there any very common values that may throw off the estimates; what
does select * from pg_statistic where starelid=(select oid from
pg_class where relname='domain'); give?

> Can we force optimizer to index scan?
set enable_seqscan=off;



Re: optimizer

От
"Zhang, Anna"
Дата:
>Is the estimate above (1.5M rows) reasonable?  If so, it's probably
>doing the right thing.  If not, what version are you using and are
>there any very common values that may throw off the estimates; what
>does select * from pg_statistic where starelid=(select oid from
>pg_class where relname='domain'); give?


select * from pg_statistic where starelid=(select oid from pg_class where
relname='domain');

starelid  | staattnum | staop | stanullfrac | stacommonfrac |
stacommonval        |           staloval           |

    stahival
-----------+-----------+-------+-------------+---------------+--------------
--------------+------------------------------+------------------------------
-----------------------------------
749413081 |         1 |   664 |           0 |   7.02145e-08 | UPSPWR-DOM

             | 000000000000000000000-N3-DOM | ZZZZZZZZZZZZZZZZZZZZZ9-DOM
749413081 |         2 |   664 |           0 |   7.02145e-08 | UPSPWR

             | 00                           |
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
749413081 |         3 |   664 |           0 |      0.728349 | COM

             | ARPA                         | WS
749413081 |         4 |   664 |           0 |      0.107595 | NAMEZERO.COM

             | `                            |    ambassador blinds
749413081 |         5 |   664 |           0 |      0.107594 | 51 University
Ave, Suite K | -                            | zzzzzzzzzzzzzzzzzz
zzzzzzzzzzzzzzzz, zzzzzzzzzzz 325698
749413081 |         6 |   664 |    0.312378 |      0.107716 | LOS GATOS

             | -                            |  Murray
749413081 |         7 |   664 |    0.312378 |      0.227842 | CA

             | AA                           | WY
749413081 |         8 |   664 |    0.312119 |      0.107774 | 95030

             | 00005                        | 99995
749413081 |         9 |   664 |  0.00980173 |      0.687646 | US

             | AC                           | ZW
(9 rows)

The column holdername is not common, same holdername may have a few records.
I am running postgres 7.2, for pg_statistic table I really no idea what each
column means, if you get some clue from above query result please give me an
advise. Thanks for reply.

Anna Zhang

Re: optimizer

От
Stephan Szabo
Дата:
On Wed, 27 Feb 2002, Zhang, Anna wrote:

>
> >Is the estimate above (1.5M rows) reasonable?  If so, it's probably
> >doing the right thing.  If not, what version are you using and are
> >there any very common values that may throw off the estimates; what
> >does select * from pg_statistic where starelid=(select oid from
> >pg_class where relname='domain'); give?

Okay, does running it after set enable_seqscan=false; actually run faster
than the sequence scan version?

> select * from pg_statistic where starelid=(select oid from pg_class where
> relname='domain');
>
> starelid  | staattnum | staop | stanullfrac | stacommonfrac |
> stacommonval        |           staloval           |
>
>     stahival
> -----------+-----------+-------+-------------+---------------+--------------
> --------------+------------------------------+------------------------------
> -----------------------------------
> 749413081 |         1 |   664 |           0 |   7.02145e-08 | UPSPWR-DOM
>
>              | 000000000000000000000-N3-DOM | ZZZZZZZZZZZZZZZZZZZZZ9-DOM
> 749413081 |         2 |   664 |           0 |   7.02145e-08 | UPSPWR
>
>              | 00                           |
> ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
> 749413081 |         3 |   664 |           0 |      0.728349 | COM
>
>              | ARPA                         | WS
> 749413081 |         4 |   664 |           0 |      0.107595 | NAMEZERO.COM
>
>              | `                            |    ambassador blinds
> 749413081 |         5 |   664 |           0 |      0.107594 | 51 University
> Ave, Suite K | -                            | zzzzzzzzzzzzzzzzzz
> zzzzzzzzzzzzzzzz, zzzzzzzzzzz 325698
> 749413081 |         6 |   664 |    0.312378 |      0.107716 | LOS GATOS
>
>              | -                            |  Murray
> 749413081 |         7 |   664 |    0.312378 |      0.227842 | CA
>
>              | AA                           | WY
> 749413081 |         8 |   664 |    0.312119 |      0.107774 | 95030
>
>              | 00005                        | 99995
> 749413081 |         9 |   664 |  0.00980173 |      0.687646 | US
>
>              | AC                           | ZW
> (9 rows)
>
> The column holdername is not common, same holdername may have a few records.
> I am running postgres 7.2, for pg_statistic table I really no idea what each

Are you *sure* this is a 7.2 server?  The above looks like the form from
7.1 and earlier.  In any case, it looks like NAMEZERO.COM is the most
common value with about 10% of the table, so I don't think sequence scan
is a bad plan necessarily.