Thread: Long count(*) time
Hello -
I'm running PG 8.2 on an 8-processor 16G Unix machine. The machine is dedicated to the db, and only 5 threads/processors are busy. The following query takes 70 seconds to execute.
select count(*) from url_list_url;
There are 64,219,173 rows in the table. The table consists of an integer field and a text field. The average length of the text field is 50 characters. There are btree indexes on both fields. The integer field is the key.
70 seconds seems to be a long time for this kind of query. Is this normal?
Thanks.
David
David Monarchi wrote:
> 70 seconds seems to be a long time for this kind of query. Is this normal?

Unless I misplaced a decimal point, that works out to a tad over 52 MBytes/sec, which would not be unreasonable for a lower-end disk subsystem (the details of which you didn't mention at all).

Now, if you're simply wondering "Why does count(*) take so long?" you may want to search the list archives. It's a question that's been asked and answered many times, and a search on something like "slow count" will provide hours of reading material.

The very short answer: an unrestricted count(*) must, by the nature of the current MVCC implementation used by PostgreSQL, read the entire table. An index cannot be used (well, it can, but using it would slow things down even further).

-Jon
--
Senior Systems Developer
Media Matters for America
http://mediamatters.org/
--- David Monarchi <david.e.monarchi@gmail.com> wrote:
> 70 seconds seems to be a long time for this kind of query. Is this normal?

Due to the nature of PostgreSQL's MVCC system, any count(*) with no WHERE clause will trigger a full table scan.

You could possibly shave off some time by performing a VACUUM FULL on this table. VACUUM FULL will recover all the space from dead tuples; with fewer pages left, the scan will take less time.

However, if you have a good auto-vacuum policy implemented, you can get a very close estimate of the number of records in your table in much less time. This link has a good example of how this is done:

http://www.commandprompt.com/blogs/joshua_drake/2007/08/how_many_rows_do_i_have_anyway/

Regards,
Richard Broersma Jr.
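The estimate approach from the linked article boils down to reading the planner's statistics rather than scanning the table. A minimal sketch, assuming the table name from this thread (reltuples is refreshed by VACUUM/ANALYZE, so its accuracy depends on the vacuum policy):

```sql
-- Planner's row estimate for the table; instantaneous, but only as
-- fresh as the last VACUUM or ANALYZE.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'url_list_url';
```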
I'm a newbie in databases, but maybe this is useful:
"Triggers can be defined to execute either before or after any INSERT, UPDATE, or DELETE operation, either once per modified row, or once per SQL statement. If a trigger event occurs, the trigger's function is called at the appropriate time to handle the event."
(from: http://www.postgresql.org/docs/8.1/interactive/triggers.html)
If you run this query a lot of times, it might be useful to write a trigger that increments a counter on every insert and decrements it on every delete, storing the number in a separate table. Of course, whether this is worthwhile depends on the number of insert/delete queries performed: the count query will drop to milliseconds, but every insert/delete will take (no idea how much) longer.
Michael
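The trigger-maintained counter described above could be sketched as follows. This is a minimal illustration, not a tested implementation; the counter table `row_counts` and the function/trigger names are made up for the example, and the syntax targets PL/pgSQL as available in the 8.x series:

```sql
-- Hypothetical side table holding one counter row per tracked table.
CREATE TABLE row_counts (
    table_name text PRIMARY KEY,
    n          bigint NOT NULL
);

-- Seed the counter once with a real (slow) count.
INSERT INTO row_counts
    SELECT 'url_list_url', count(*) FROM url_list_url;

-- Row-level trigger function: bump the counter on insert, drop it on delete.
CREATE OR REPLACE FUNCTION url_list_url_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE row_counts SET n = n + 1 WHERE table_name = 'url_list_url';
        RETURN NEW;
    ELSE  -- DELETE
        UPDATE row_counts SET n = n - 1 WHERE table_name = 'url_list_url';
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER url_list_url_count_trg
    AFTER INSERT OR DELETE ON url_list_url
    FOR EACH ROW EXECUTE PROCEDURE url_list_url_count();

-- The count is now a single-row lookup:
SELECT n FROM row_counts WHERE table_name = 'url_list_url';
```

One design caveat beyond the per-row overhead Michael mentions: every writing transaction must update the same counter row, so concurrent inserts/deletes will serialize on that row's lock.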
If you vacuum regularly, you can parse the output of:

explain select count(*) from url_list_url;

The rowcount will be pretty close. If you simply need "close", this is an instantaneous query.

Sean
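To illustrate what Sean means: EXPLAIN prints the planner's estimates without running the query, and the `rows=` figure on the Seq Scan line is the estimated row count. A sketch with illustrative output (costs elided; the exact numbers depend on the table's statistics):

```sql
EXPLAIN SELECT count(*) FROM url_list_url;
-- Illustrative plan output:
--  Aggregate  (cost=... rows=1 width=8)
--    ->  Seq Scan on url_list_url  (cost=... rows=64219173 width=0)
```

The `rows=` value on the scan node is the same planner estimate stored in pg_class.reltuples, so it is only as current as the last VACUUM or ANALYZE.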