Обсуждение: select count(*) from anIntColumn where int_value = 0; is very slow

Поиск
Список
Период
Сортировка

select count(*) from anIntColumn where int_value = 0; is very slow

От
David Teran
Дата:
Hi

we have a table with about 4 million rows. One column has an int value,
there is a btree index on it. We tried to execute the following
statement and it is very slow on a dual G5 2GHZ with 4 GB of RAM.

explain analyze select count(*) from job_property where int_value = 0;

Aggregate  (cost=144348.80..144348.80 rows=1 width=0) (actual
time=13536.852..13536.852 rows=1 loops=1)
   ->  Seq Scan on job_property  (cost=0.00..144255.15 rows=37459
width=0) (actual time=19.422..13511.653 rows=42115 loops=1)
         Filter: (int_value = 0)
Total runtime: 13560.862 ms



Is this more or less normal or can we optimize this a little bit?
FrontBase (which we compare currently) takes 2 seconds first time and
about 0.2 seconds on second+ queries.

regards David


Re: select count(*) from anIntColumn where int_value = 0;

От
Pavel Stehule
Дата:
Hello,

If you has  index on id, then you can use
SELECT id FROM tabulka ORDER BY id DESC LIMIT 1;

See 4.8. FAQ

Regards
Pavel Stehule

On Wed, 11 Feb 2004, David Teran wrote:

> Hi
>
> we have a table with about 4 million rows. One column has an int value,
> there is a btree index on it. We tried to execute the following
> statement and it is very slow on a dual G5 2GHZ with 4 GB of RAM.
>
> explain analyze select count(*) from job_property where int_value = 0;
>
> Aggregate  (cost=144348.80..144348.80 rows=1 width=0) (actual
> time=13536.852..13536.852 rows=1 loops=1)
>    ->  Seq Scan on job_property  (cost=0.00..144255.15 rows=37459
> width=0) (actual time=19.422..13511.653 rows=42115 loops=1)
>          Filter: (int_value = 0)
> Total runtime: 13560.862 ms
>
>
>
> Is this more or less normal or can we optimize this a little bit?
> FrontBase (which we compare currently) takes 2 seconds first time and
> about 0.2 seconds on second+ queries.
>
> regards David
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: select count(*) from anIntColumn where int_value = 0; is very slow

От
"Rigmor Ukuhe"
Дата:
>
> Hi
>
> we have a table with about 4 million rows. One column has an int value,
> there is a btree index on it. We tried to execute the following
> statement and it is very slow on a dual G5 2GHZ with 4 GB of RAM.
>
> explain analyze select count(*) from job_property where int_value = 0;
>
> Aggregate  (cost=144348.80..144348.80 rows=1 width=0) (actual
> time=13536.852..13536.852 rows=1 loops=1)
>    ->  Seq Scan on job_property  (cost=0.00..144255.15 rows=37459
> width=0) (actual time=19.422..13511.653 rows=42115 loops=1)
>          Filter: (int_value = 0)
> Total runtime: 13560.862 ms


Is your int_value data type int4? If not then use "... from job_property
where int_value = '0'"
Indexes are used only if datatypes matches.

Rigmor Ukuhe


>
>
>
> Is this more or less normal or can we optimize this a little bit?
> FrontBase (which we compare currently) takes 2 seconds first time and
> about 0.2 seconds on second+ queries.
>
> regards David
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.564 / Virus Database: 356 - Release Date: 19.01.2004


Re: select count(*) from anIntColumn where int_value = 0; is very slow

От
David Teran
Дата:
Hi,

> Is your int_value data type int4? If not then use "... from
> job_property
> where int_value = '0'"
> Indexes are used only if datatypes matches.
>
tried those variations already. Strange enough, after dropping and
recreating the index everything worked fine.

regards David


Re: select count(*) from anIntColumn where int_value = 0; is very slow

От
PC Drew
Дата:
Had you done a VACUUM ANALYZE at all?  There has been much discussion
lately about the planner needing to be updated to know that the index
is a better choice.

On Feb 11, 2004, at 6:32 AM, David Teran wrote:

> Hi,
>
>> Is your int_value data type int4? If not then use "... from
>> job_property
>> where int_value = '0'"
>> Indexes are used only if datatypes matches.
>>
> tried those variations already. Strange enough, after dropping and
> recreating the index everything worked fine.
>
> regards David
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>

--
PC Drew
Manager, Dominet

IBSN
1600 Broadway, Suite 400
Denver, CO 80202

Phone: 303-984-4727 x107
Cell: 720-841-4543
Fax: 303-984-4730
Email: drewpc@ibsncentral.com


Re: select count(*) from anIntColumn where int_value = 0;

От
Christopher Browne
Дата:
Oops! stehule@kix.fsv.cvut.cz (Pavel Stehule) was seen spray-painting on a wall:
>
> Regards
> Pavel Stehule
>
> On Wed, 11 Feb 2004, David Teran wrote:
>
>> Hi
>>
>> we have a table with about 4 million rows. One column has an int value,
>> there is a btree index on it. We tried to execute the following
>> statement and it is very slow on a dual G5 2GHZ with 4 GB of RAM.
>>
>> explain analyze select count(*) from job_property where int_value = 0;
>>
>> Aggregate  (cost=144348.80..144348.80 rows=1 width=0) (actual
>> time=13536.852..13536.852 rows=1 loops=1)
>>    ->  Seq Scan on job_property  (cost=0.00..144255.15 rows=37459
>> width=0) (actual time=19.422..13511.653 rows=42115 loops=1)
>>          Filter: (int_value = 0)
>> Total runtime: 13560.862 ms
>>
> If you has  index on id, then you can use
> SELECT id FROM tabulka ORDER BY id DESC LIMIT 1;
>
> See 4.8. FAQ

I'm afraid that's not the answer.  That would be the faster
alternative to "select max(id) from tabulka;"

I guess the question is, is there a faster way of coping with the
"int_value = 0" part?

It seems a little odd that the index was not selected; it appears that
the count was 42115, right?

The estimated number of rows was 37459, and if the table size is ~4M,
then I would have expected the query optimizer to use the index.

Could you try doing "ANALYZE JOB_PROPERTY;" and then try again?

One thought that comes to mind is that perhaps the statistics are
outdated.

Another thought is that perhaps there are several really common
values, and the statistics are crummy.  You might relieve that by:

  alter table job_property alter column int_value set statistics 20;
  analyze job_property;

(Or perhaps some higher value...)

If there are a few very common discrete values in a particular field,
then the default statistics may get skewed because the histogram
hasn't enough bins...
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/wp.html
Rules of  the Evil  Overlord #102.  "I will not  waste time  making my
enemy's death look  like an accident -- I'm  not accountable to anyone
and my other enemies wouldn't believe it.

Re: select count(*) from anIntColumn where int_value = 0;

От
"scott.marlowe"
Дата:
On Wed, 11 Feb 2004, David Teran wrote:

> Hi,
>
> > Is your int_value data type int4? If not then use "... from
> > job_property
> > where int_value = '0'"
> > Indexes are used only if datatypes matches.
> >
> tried those variations already. Strange enough, after dropping and
> recreating the index everything worked fine.

Has that table been updated a lot in its life?  If so, it may have had a
problem with index bloat...


Re: select count(*) from anIntColumn where int_value =

От
Christopher Kings-Lynne
Дата:
>>>Is your int_value data type int4? If not then use "... from
>>>job_property
>>>where int_value = '0'"
>>>Indexes are used only if datatypes matches.
>>>
>>
>>tried those variations already. Strange enough, after dropping and
>>recreating the index everything worked fine.
>
>
> Has that table been updated a lot in its life?  If so, it may have had a
> problem with index bloat...

Try creating a partial index: create index blah on tablw where int_value=0;

Chris