Обсуждение: function based index problem

Поиск
Список
Период
Сортировка

function based index problem

От
Viktor Bojović
Дата:
Hi,
on table entry (17M records) there is one index:

CREATE INDEX ndxlen
  ON uniprot_frekvencije.entry
  USING btree
  (length(sequence::text));

When using ">=" in search which returns only two records, query runs much (hundred times) slower. i don't know why it doesn't use index scan. I just wanted to ask how can i modify the query to use that index? Explain plans are pasted below.

bioinf=> explain select * from entry where length(sequence)=36805;
                                 QUERY PLAN                                
----------------------------------------------------------------------------
 Bitmap Heap Scan on entry  (cost=1523.54..294886.26 rows=81226 width=1382)
   Recheck Cond: (length((sequence)::text) = 36805)
   ->  Bitmap Index Scan on ndxlen  (cost=0.00..1503.23 rows=81226 width=0)
         Index Cond: (length((sequence)::text) = 36805)
(4 rows)

bioinf=> explain select * from entry where length(sequence)>=36805;
                             QUERY PLAN                            
--------------------------------------------------------------------
 Seq Scan on entry  (cost=0.00..5400995.21 rows=5415049 width=1382)
   Filter: (length((sequence)::text) >= 36805)
(2 rows)

Thanx in advance
--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me

Re: function based index problem

От
"David Johnston"
Дата:

 

From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Viktor Bojovic
Sent: Wednesday, August 31, 2011 5:27 PM
To: pgsql-sql@postgresql.org; pgsql-admin@postgresql.org
Subject: [SQL] function based index problem

 

Hi,
on table entry (17M records) there is one index:

CREATE INDEX ndxlen
  ON uniprot_frekvencije.entry
  USING btree
  (length(sequence::text));

When using ">=" in search which returns only two records, query runs much (hundred times) slower. i don't know why it doesn't use index scan. I just wanted to ask how can i modify the query to use that index? Explain plans are pasted below.

bioinf=> explain select * from entry where length(sequence)=36805;
                                 QUERY PLAN                                
----------------------------------------------------------------------------
 Bitmap Heap Scan on entry  (cost=1523.54..294886.26 rows=81226 width=1382)
   Recheck Cond: (length((sequence)::text) = 36805)
   ->  Bitmap Index Scan on ndxlen  (cost=0.00..1503.23 rows=81226 width=0)
         Index Cond: (length((sequence)::text) = 36805)
(4 rows)

bioinf=> explain select * from entry where length(sequence)>=36805;
                             QUERY PLAN                            
--------------------------------------------------------------------
 Seq Scan on entry  (cost=0.00..5400995.21 rows=5415049 width=1382)
   Filter: (length((sequence)::text) >= 36805)
(2 rows)

Thanx in advance
--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me

 

Some observations/suggestions:

 

Please do not Cross-Post

You have not provided your PostgreSQL version

 

You state the “>=” query only returns 2 rows but the plan expects to return 5.4 MILLION – with that many results Sequential Scan is going to be faster than an Index

Either you have not run “ANALYZE” or you have more data than you think matching your criteria.  Try “EXPLAIN ANALYZE” to actually run the query and see what you get.

 

It is likely that a simple ANALYZE on the table will solve your problem (ALWAYS RUN ANALYZE BEFORE POSTING QUESTIONS LIKE THIS); in the unlikely event it does not please post the “EXPLAIN ANALYZE” results so we can see exactly how many records each query returned.

 

David J.

 

Re: function based index problem

От
Viktor Bojović
Дата:


On Thu, Sep 1, 2011 at 12:09 AM, David Johnston <polobo@yahoo.com> wrote:

 

From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Viktor Bojovic
Sent: Wednesday, August 31, 2011 5:27 PM
To: pgsql-sql@postgresql.org; pgsql-admin@postgresql.org
Subject: [SQL] function based index problem

 

Hi,
on table entry (17M records) there is one index:

CREATE INDEX ndxlen
  ON uniprot_frekvencije.entry
  USING btree
  (length(sequence::text));

When using ">=" in search which returns only two records, query runs much (hundred times) slower. i don't know why it doesn't use index scan. I just wanted to ask how can i modify the query to use that index? Explain plans are pasted below.

bioinf=> explain select * from entry where length(sequence)=36805;
                                 QUERY PLAN                                
----------------------------------------------------------------------------
 Bitmap Heap Scan on entry  (cost=1523.54..294886.26 rows=81226 width=1382)
   Recheck Cond: (length((sequence)::text) = 36805)
   ->  Bitmap Index Scan on ndxlen  (cost=0.00..1503.23 rows=81226 width=0)
         Index Cond: (length((sequence)::text) = 36805)
(4 rows)

bioinf=> explain select * from entry where length(sequence)>=36805;
                             QUERY PLAN                            
--------------------------------------------------------------------
 Seq Scan on entry  (cost=0.00..5400995.21 rows=5415049 width=1382)
   Filter: (length((sequence)::text) >= 36805)
(2 rows)

Thanx in advance
--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me

 

Some observations/suggestions:

 

Please do not Cross-Post

You have not provided your PostgreSQL version

 

You state the “>=” query only returns 2 rows but the plan expects to return 5.4 MILLION – with that many results Sequential Scan is going to be faster than an Index

Either you have not run “ANALYZE” or you have more data than you think matching your criteria.  Try “EXPLAIN ANALYZE” to actually run the query and see what you get.

 

It is likely that a simple ANALYZE on the table will solve your problem (ALWAYS RUN ANALYZE BEFORE POSTING QUESTIONS LIKE THIS); in the unlikely event it does not please post the “EXPLAIN ANALYZE” results so we can see exactly how many records each query returned.

 

David J.

 


It works now after "analyze entry" was executed. thanx a lot.


--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me

Re: [ADMIN] function based index problem

От
"Kevin Grittner"
Дата:
Viktor Bojovi*<viktor.bojovic@gmail.com> wrote:

> i don't know why it doesn't use index scan.

Because it thinks your query will return 81226 rows.  We need more
information to make many suggestions beyond "make sure the table has
been analyzed".

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Also, it's not considered good form to post an issue to more than
one PostgreSQL list.  This is probably most appropriate on the
pgsql-performance list, so if you want to follow up with more detail
to get more detailed suggestions, it would be best to abandon this
thread and start a new one on that list.

-Kevin