Обсуждение: Very slow select

Поиск
Список
Период
Сортировка

Very slow select

От
Stefano Bargioni
Дата:
I defined a table with two indexed text columns, a and b, of variable length.
This table contains 287195 rows.
SELECT FROM t WHERE a='my string'; works very fast. The same select made on b is very slow (2 or 3 seconds), as if no
indexexists for column b. 
I'd like to know why. Note that b contains very long strings, more than 500 chars. Is there a limit in indexing?
Thanks. Stefano
--
Dott. Stefano Bargioni
Biblioteca della Pontificia Universita' della Santa Croce - Roma
<mailto:bargioni@usc.urbe.it>                  <http://www.usc.urbe.it>
Personal web page:    <http://www.usc.urbe.it/html/php.script?bargioni>
--- "Si apud bibliothecam hortulum habes, nihil deerit"  (Cicerone) ---

Re: Very slow select

От
Tom Lane
Дата:
Stefano Bargioni <bargioni@usc.urbe.it> writes:
> SELECT FROM t WHERE a='my string'; works very fast. The same select
> made on b is very slow (2 or 3 seconds), as if no index exists for
> column b.

What does EXPLAIN say about the two queries?

            regards, tom lane

Re: Very slow select

От
Stefano Bargioni
Дата:
> Stefano Bargioni  writes:
> > SELECT FROM t WHERE a='my string'; works very fast. The same select
> > made on b is very slow (2 or 3 seconds), as if no index exists for
> > column b.
>
> What does EXPLAIN say about the two queries?
>
>                         regards, tom lane
>

Dear Tom, thanks for your help. Here is the explain results.
As you note, the second select is not index driven. This is why it is slow. Are indexes built correctly / used / when
fieldscontain very large text strings? Can an hash index be useful? 
Bye (Merry 2K Xmas!). Stefano

ucc2=> explain select count(*) from document where xtitle='a';
NOTICE:  QUERY PLAN:

Aggregate  (cost=70.61 rows=759 width=4)
  ->  Index Scan using xtitle_index on document  (cost=70.61 rows=759 width=4)

EXPLAIN
ucc2=> explain select count(*) from document where xothertitles='a';
NOTICE:  QUERY PLAN:

Aggregate  (cost=26005.44 rows=287195 width=4)
  ->  Seq Scan on document  (cost=26005.44 rows=287195 width=4)

EXPLAIN
ucc2=> \d document
Table    = document
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                | Length|
+----------------------------------+----------------------------------+-------+
| id                               | int4 not null default nextval('" |     4 |
| title                            | text                             |   var |
| xtitle                           | text                             |   var |
| othertitles                      | text                             |   var |
| xothertitles                     | text                             |   var |
| author                           | text                             |   var |
| xauthor                          | text                             |   var |
| otherauthors                     | text                             |   var |
| xotherauthors                    | text                             |   var |
| subject                          | text                             |   var |
| xsubject                         | text                             |   var |
| publisher                        | text                             |   var |
| xpublisher                       | text                             |   var |
| collection                       | text                             |   var |
| xcollection                      | text                             |   var |
| yearpub                          | int4                             |     4 |
| xyearpub                         | int4                             |     4 |
| locationpub                      | text                             |   var |
| xlocationpub                     | text                             |   var |
| languagecode                     | text                             |   var |
| classification                   | text                             |   var |
| digest                           | text                             |   var |
+----------------------------------+----------------------------------+-------+
Indices:  digest_document_index
          document_id_key
          xauthor_index
          xcollection_index
          xotherauthors_index
          xothertitles_index
          xpublisher_index
          xsubject_index
          xtitle_index

indexes where created with
create index xothertitles_index on document (xothertitles);
create index xtitle_index on document (xtitle);
--
Dott. Stefano Bargioni
Biblioteca della Pontificia Universita' della Santa Croce - Roma
<mailto:bargioni@usc.urbe.it>                  <http://www.usc.urbe.it>
Personal web page:    <http://www.usc.urbe.it/html/php.script?bargioni>
--- "Si apud bibliothecam hortulum habes, nihil deerit"  (Cicerone) ---



Re: Very slow select

От
Tom Lane
Дата:
Stefano Bargioni <bargioni@usc.urbe.it> writes:
>> What does EXPLAIN say about the two queries?

> ucc2=> explain select count(*) from document where xtitle='a';
> NOTICE:  QUERY PLAN:

> Aggregate  (cost=70.61 rows=759 width=4)
>  -> Index Scan using xtitle_index on document  (cost=70.61 rows=759 width=4)

> EXPLAIN
> ucc2=> explain select count(*) from document where xothertitles='a';
> NOTICE:  QUERY PLAN:

> Aggregate  (cost=26005.44 rows=287195 width=4)
>   -> Seq Scan on document  (cost=26005.44 rows=287195 width=4)

The reason for the difference in plan is the difference in estimated
number of rows retrieved.  You probably have some extremely common
value in xothertitles, which is skewing the statistics that the planner
uses for this estimate.

Sometimes, what's skewing the stats is a value that's actually just a
placeholder for "unknown", such as an empty string or 'UNKNOWN' or
something like that.  If possible, I recommend coding your application
to use NULL for this purpose, rather than any ordinary data value.
The stats account for NULLs separately, so the system won't be fooled
by having a lot of them.

            regards, tom lane