Re: no index-usage on aggregate-functions?
От | Scott Marlowe |
---|---|
Тема | Re: no index-usage on aggregate-functions? |
Дата | |
Msg-id | 1088493903.12350.5.camel@localhost.localdomain обсуждение исходный текст |
Ответ на | no index-usage on aggregate-functions? ("Harald Lau (Sector-X)" <harald@sector-x.de>) |
Список | pgsql-performance |
On Tue, 2004-06-29 at 00:42, Harald Lau (Sector-X) wrote: > Hi, > > I've experienced that PG up to current release does not make use of an index when aggregating. Which of course may resultin unacceptable answering times > > This behaviour is reproducable on any table with any aggregat function in all of my databases on every machine (PostgreSQL7.4.2 on i386-redhat-linux-gnu and PostgreSQL 7.2.1 on i686-pc-linux-gnu) > > f.e. querying against a 2.8-mio-records (2.800.000) table the_table > SELECT count(*) FROM the_table > => Seq scan -> takes about 12 sec > > SELECT Avg(num_found) AS NumFound FROM the_table --(index on num_found) > => Seq scan -> takes about 10 sec > > SELECT Sum(num_found) AS TotalFound FROM the_table --(index on num_found) > => Seq scan -> takes about 11 sec > > SELECT Max(date_) AS LatestDate FROM the_table --(index on date_) > => Seq scan -> takes about 14 sec > > But > SELECT date_ AS LatestDate FROM the_table ORDER BY date_ DESC LIMIT 1; > => Index scan -> takes 0.18 msec > > MS SQLServer 2000: Use of an appropriate index _whenever_ aggregating. > > Am I doing something wrong? Yes, you're expecting an MVCC database to behave like a row locking database. Due to the way PostgreSQL is put together, it can't count on an index giving it values, only pointers to values, so to speak. This means it can use an index, but it will still go to the table to get the right value. On the other hand, the trade off is that MVCC can handle much higher parallel loads, usually. Note that if you're aggregate is on sub subset of a table, then an index scan can often be a big win, such as: create table z(a int, b int); insert into z values (1,1); (repeat a couple thousand times) select avg(b) from z where a=3; <-- this can use the index But note that in the above, the table's rows will still have to be accessed to get the right values.
В списке pgsql-performance по дате отправления: