Обсуждение: constant time count(*) ?
We're looking into moving some data from mysql to postgresql, and
notice that count(*) does not seem to be a constant-time function
as it seems to be in mysql.
planb=# explain select count(*) from assets;
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=22.50..22.50 rows=1 width=0)
-> Seq Scan on assets (cost=0.00..20.00 rows=1000 width=0)
(2 rows)
Is there a way to optimize count(*) such that it does not have
to do a sequential scan? We use this on some big tables and it
is slowing down processing quite a lot.
Thanks!
Mark
--
Mark Harrison
Pixar Animation Studios
Mark Harrison writes: > Is there a way to optimize count(*) such that it does not have > to do a sequential scan? No. If you need to count a lot, you need to store the information separately. -- Peter Eisentraut peter_e@gmx.net
This should definitely be a FAQ. The semantics of MVCC (multi-version concurrency control) means that you can't just store a number somewhere in the header of the table like some other database systems do. Try a count(*) on Oracle and you will see similar behavior. They use MVCC also. > -----Original Message----- > From: Mark Harrison [mailto:mh@pixar.com] > Sent: Wednesday, October 15, 2003 11:00 AM > To: pgsql-general@postgresql.org > Subject: [GENERAL] constant time count(*) ? > > > We're looking into moving some data from mysql to postgresql, > and notice that count(*) does not seem to be a constant-time > function as it seems to be in mysql. > > planb=# explain select count(*) from assets; > QUERY PLAN > ---------------------------------------------------------------- > Aggregate (cost=22.50..22.50 rows=1 width=0) > -> Seq Scan on assets (cost=0.00..20.00 rows=1000 > width=0) (2 rows) > > Is there a way to optimize count(*) such that it does not > have to do a sequential scan? We use this on some big tables > and it is slowing down processing quite a lot. > > Thanks! > Mark > > -- > Mark Harrison > Pixar Animation Studios > > > ---------------------------(end of > broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to > majordomo@postgresql.org) >
On Wed, Oct 15, 2003 at 11:00:10AM -0700, Mark Harrison wrote:
>
> Is there a way to optimize count(*) such that it does not have
> to do a sequential scan? We use this on some big tables and it
> is slowing down processing quite a lot.
No. There's a busload of discussion on this topic in the archives.
If you need an approximate value, you can get it from the system
tables.
A
--
----
Andrew Sullivan 204-4141 Yonge Street
Afilias Canada Toronto, Ontario Canada
<andrew@libertyrms.info> M2P 2A8
+1 416 646 3304 x110
Hi Mark, Mark Harrison wrote: > We're looking into moving some data from mysql to postgresql, and > notice that count(*) does not seem to be a constant-time function > as it seems to be in mysql. > > planb=# explain select count(*) from assets; > QUERY PLAN > ---------------------------------------------------------------- > Aggregate (cost=22.50..22.50 rows=1 width=0) > -> Seq Scan on assets (cost=0.00..20.00 rows=1000 width=0) > (2 rows) > > Is there a way to optimize count(*) such that it does not have > to do a sequential scan? We use this on some big tables and it > is slowing down processing quite a lot. How do you need an unqualified select count(*) on a table so often it is making a problem? Regards Tino