Thread: Long count(*) time
Hello -
I'm running PG 8.2 on an 8-processor 16G Unix machine. The machine is dedicated to the db, and only 5 threads/processors are busy. The following query takes 70 seconds to execute.
select count(*) from url_list_url;
There are 64,219,173 rows in the table. The table consists of an integer field and a text field. The average length of the text field is 50 characters. There are btree indexes on both fields. The integer field is the key.
70 seconds seems to be a long time for this kind of query. Is this normal?
Thanks.
David
David Monarchi wrote:
> 70 seconds seems to be a long time for this kind of query. Is this normal?

Unless I misplaced a decimal point, that works out to a tad over 52 MBytes/sec, which would not be unreasonable for a lower-end disk subsystem (the details of which you didn't mention at all).

Now, if you're simply wondering "Why does count(*) take so long?" you may want to search the list archives. It's a question that's been asked and answered many times, and a search on something like "slow count" will provide hours of reading material.

The very short answer: an unrestricted count(*) must, by the nature of the current MVCC implementation used by PostgreSQL, read the entire table. An index cannot be used (well, it can, but using it would slow things down even further).

-Jon
--
Senior Systems Developer
Media Matters for America
http://mediamatters.org/
--- David Monarchi <david.e.monarchi@gmail.com> wrote:
> 70 seconds seems to be a long time for this kind of query. Is this normal?

Due to the nature of PostgreSQL's MVCC system, any count(*) with no WHERE clause will trigger a full table scan.

You could possibly shave off some time by performing a VACUUM FULL on this table. VACUUM FULL will recover all the space from dead tuples; with fewer pages left, the scan will take less time.

However, if you have a good auto-vacuum policy implemented, you can get a very close estimate of the number of records in your table in much less time. This link has a good example of how this is done:

http://www.commandprompt.com/blogs/joshua_drake/2007/08/how_many_rows_do_i_have_anyway/

Regards,
Richard Broersma Jr.
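The estimate approach from the linked article boils down to reading the planner's statistics rather than scanning the table. A minimal sketch, assuming the table name from this thread (reltuples is refreshed by VACUUM/ANALYZE, so its accuracy depends on the vacuum policy):

```sql
-- Planner's row estimate for the table; instantaneous, but only as
-- fresh as the last VACUUM or ANALYZE.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'url_list_url';
```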
I'm a newbie in databases, but maybe this is useful:
"Triggers can be defined to execute either before or after any INSERT, UPDATE, or DELETE operation, either once per modified row, or once per SQL statement. If a trigger event occurs, the trigger's function is called at the appropriate time to handle the event."
(from: http://www.postgresql.org/docs/8.1/interactive/triggers.html)
If you run this query a lot of times, it might be useful to write a trigger that increments a counter on every insert and decrements it on every delete, storing the number in a separate table. Of course, whether this is worthwhile depends on the number of insert/delete queries performed: the count query will drop to milliseconds, but every insert/delete will take (no idea how much) longer.
Michael
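The trigger-maintained counter described above could be sketched as follows. This is a minimal illustration, not a tested implementation; the counter table `row_counts` and the function/trigger names are made up for the example, and the syntax targets PL/pgSQL as available in the 8.x series:

```sql
-- Hypothetical side table holding one counter row per tracked table.
CREATE TABLE row_counts (
    table_name text PRIMARY KEY,
    n          bigint NOT NULL
);

-- Seed the counter once with a real (slow) count.
INSERT INTO row_counts
    SELECT 'url_list_url', count(*) FROM url_list_url;

-- Row-level trigger function: bump the counter on insert, drop it on delete.
CREATE OR REPLACE FUNCTION url_list_url_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE row_counts SET n = n + 1 WHERE table_name = 'url_list_url';
        RETURN NEW;
    ELSE  -- DELETE
        UPDATE row_counts SET n = n - 1 WHERE table_name = 'url_list_url';
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER url_list_url_count_trg
    AFTER INSERT OR DELETE ON url_list_url
    FOR EACH ROW EXECUTE PROCEDURE url_list_url_count();

-- The count is now a single-row lookup:
SELECT n FROM row_counts WHERE table_name = 'url_list_url';
```

One design caveat beyond the per-row overhead Michael mentions: every writing transaction must update the same counter row, so concurrent inserts/deletes will serialize on that row's lock.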
If you vacuum regularly, you can parse the output of:

explain select count(*) from url_list_url;

The rowcount will be pretty close. If you simply need "close", this is an instantaneous query.

Sean
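To illustrate what Sean means: EXPLAIN prints the planner's estimates without running the query, and the `rows=` figure on the Seq Scan line is the estimated row count. A sketch with illustrative output (costs elided; the exact numbers depend on the table's statistics):

```sql
EXPLAIN SELECT count(*) FROM url_list_url;
-- Illustrative plan output:
--  Aggregate  (cost=... rows=1 width=8)
--    ->  Seq Scan on url_list_url  (cost=... rows=64219173 width=0)
```

The `rows=` value on the scan node is the same planner estimate stored in pg_class.reltuples, so it is only as current as the last VACUUM or ANALYZE.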