Обсуждение: Yet Another (Simple) Case of Index not used
Hi there, I'm running into a quite puzzling simple example where the index I've created on a fairly big table (465K entries) is not used, against all common sense expectations: The query I am trying to do (fast) is: select count(*) from addresses; This takes more than a second to complete, because, as the 'explain' command shows me, the index created on 'addresses' is not used, and a seq scan is being used. One would assume that the creation of an index would allow the counting of the number of entries in a table to be instantanous? Here are the details: * Using the latest postgresql 7.3.2 release, built and installed from sources on a Linux box, under Red Hat 8.0 * I have an 'addresses' table defined as: Columm | Type ------------------------------- address | text city | char var (20) zip | char var (5) state | char var (2) Unique keys: addresses_idx * I have created a unique index 'addresses_idx' on (address, city, zip, state): \d addresses_idx; Index "addresses_idx" Columm | Type ------------------------------- address | text city | char var (20) zip | char var (5) state | char var (2) unique btree * I did (re)create the index several times * I did run the vacuum analyse command several times * I forced enable_indexscan to true * I forced enable_seqscan to false Despite of all of this, each time I try: ===> explain select count(*) from addresses; I get the following: ===> NOTICE: QUERY PLAN: ===> ===> Aggregate (cost=100012799.89..100012799.89 rows=1 width=0) ===> -> Seq Scan on addresses (cost=100000000.00..100011635.11 rows=465911 width=0) Quite puzzling, isn't it? I've searched a bunch of mailing lists and websites, and found many reports of special cases where it could be argued that the planner may have had a case for choosing seq scanning over idx scanning, but unless I am missing some fundamental concept, there's something wrong here. Any suggestion anyone? Thanks, Denis denis@next2me.com
as I remember, mysql keeps the record count in a variable and is instantaneaous with that kind of query. Recent posts suggest the Postgres does not keep that variable and has to do the seq scan. Denis wrote: > Hi there, > I'm running into a quite puzzling simple example where the index I've > created on a fairly big table (465K entries) is not used, against all common > sense expectations: > The query I am trying to do (fast) is: > > select count(*) from addresses; > > This takes more than a second to complete, because, as the 'explain' command > shows me, > the index created on 'addresses' is not used, and a seq scan is being used. > One would assume that the creation of an index would allow the counting of > the number of entries in a table to be instantanous? > > Here are the details: > > * Using the latest postgresql 7.3.2 release, built and installed from > sources on a Linux box, under Red Hat 8.0 > > * I have an 'addresses' table defined as: > Columm | Type > ------------------------------- > address | text > city | char var (20) > zip | char var (5) > state | char var (2) > Unique keys: addresses_idx > > * I have created a unique index 'addresses_idx' on (address, city, zip, > state): > \d addresses_idx; > Index "addresses_idx" > Columm | Type > ------------------------------- > address | text > city | char var (20) > zip | char var (5) > state | char var (2) > unique btree > > * I did (re)create the index several times > * I did run the vacuum analyse command several times > * I forced enable_indexscan to true > * I forced enable_seqscan to false > > Despite of all of this, each time I try: > ===> explain select count(*) from addresses; > I get the following: > ===> NOTICE: QUERY PLAN: > ===> > ===> Aggregate (cost=100012799.89..100012799.89 rows=1 width=0) > ===> -> Seq Scan on addresses (cost=100000000.00..100011635.11 rows=465911 > width=0) > > Quite puzzling, isn't it? > I've searched a bunch of mailing lists and websites, and found many reports > of special cases where it could be argued that the planner may have had a > case for choosing seq scanning over idx scanning, but unless I am missing > some fundamental concept, there's something wrong here. > Any suggestion anyone? > Thanks, > > Denis > denis@next2me.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
> -----Original Message----- > From: Denis [mailto:denis@next2me.com] > Sent: Tuesday, April 08, 2003 12:57 PM > To: pgsql-performance@postgresql.org; > pgsql-general@postgresql.org; pgsql-sql@postgresql.org > Subject: [GENERAL] Yet Another (Simple) Case of Index not used > > > Hi there, > I'm running into a quite puzzling simple example where the > index I've created on a fairly big table (465K entries) is > not used, against all common sense expectations: The query I > am trying to do (fast) is: > > select count(*) from addresses; > > This takes more than a second to complete, because, as the > 'explain' command shows me, the index created on 'addresses' > is not used, and a seq scan is being used. As well it should be. > One would assume > that the creation of an index would allow the counting of the > number of entries in a table to be instantanous? Traversing the index to perform the count will definitely make the query many times slower. A general rule of thumb (not sure if it is true with PostgreSQL) is that if you have to traverse more than 10% of the data with an index then a full table scan will be faster. This is especially true when there is highly redundant data in the index fields. If there were an index on bit data type, and you have half and half 1 and 0, an index scan of the table will be disastrous. To simply scan the table, we will just sequentially read pages until the data is exhausted. If we follow the index, we will randomly jump from page to page, defeating the read buffering. [snip]
from mysql manual: ------------------------------------------------------------- "COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. For example: mysql> select COUNT(*) from student;" ------------------------------------------------------------- A nice little optimization, maybe not possible in a MVCC system. Dann Corbit wrote: >>-----Original Message----- >>From: Denis [mailto:denis@next2me.com] >>Sent: Tuesday, April 08, 2003 12:57 PM >>To: pgsql-performance@postgresql.org; >>pgsql-general@postgresql.org; pgsql-sql@postgresql.org >>Subject: [GENERAL] Yet Another (Simple) Case of Index not used >> >> >>Hi there, >>I'm running into a quite puzzling simple example where the >>index I've created on a fairly big table (465K entries) is >>not used, against all common sense expectations: The query I >>am trying to do (fast) is: >> >>select count(*) from addresses; >> >>This takes more than a second to complete, because, as the >>'explain' command shows me, the index created on 'addresses' >>is not used, and a seq scan is being used. > > > As well it should be. > > >>One would assume >>that the creation of an index would allow the counting of the >>number of entries in a table to be instantanous? > > > Traversing the index to perform the count will definitely make the query > many times slower. > > A general rule of thumb (not sure if it is true with PostgreSQL) is that > if you have to traverse more than 10% of the data with an index then a > full table scan will be faster. This is especially true when there is > highly redundant data in the index fields. If there were an index on > bit data type, and you have half and half 1 and 0, an index scan of the > table will be disastrous. > > To simply scan the table, we will just sequentially read pages until the > data is exhausted. If we follow the index, we will randomly jump from > page to page, defeating the read buffering. > [snip] > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
On Tue, Apr 08, 2003 at 12:57:16PM -0700, Denis wrote: > The query I am trying to do (fast) is: > > select count(*) from addresses; > > This takes more than a second to complete, because, as the 'explain' command > shows me, > the index created on 'addresses' is not used, and a seq scan is being used. > One would assume that the creation of an index would allow the counting of > the number of entries in a table to be instantanous? Incorrect assumption. select count(*) can produce different results in different backends depending on the current state of the active transactions. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > "the West won the world not by the superiority of its ideas or values or > religion but rather by its superiority in applying organized violence. > Westerners often forget this fact, non-Westerners never do." > - Samuel P. Huntington
Вложения
On Wed, 9 Apr 2003, Martijn van Oosterhout wrote: > On Tue, Apr 08, 2003 at 12:57:16PM -0700, Denis wrote: > > The query I am trying to do (fast) is: > > > > select count(*) from addresses; > > > > This takes more than a second to complete, because, as the 'explain' command > > shows me, > > the index created on 'addresses' is not used, and a seq scan is being used. > > One would assume that the creation of an index would allow the counting of > > the number of entries in a table to be instantanous? > > Incorrect assumption. select count(*) can produce different results in > different backends depending on the current state of the active > transactions. Some thoughts: Select count(*) is often applied to views, and may take some time depending on the underlying query. However, for a single table, I would have thought that if there are no write locks or open transactions for the table, the index would return a faster result than a scan? Is there room for some optimisation here? Does count(<primary_key>) work faster, poss using the unique index on the key (for non-composite keys)? Cheers Brent Wood
Dennis Gearon wrote: > from mysql manual: > ------------------------------------------------------------- > "COUNT(*) is optimized to return very quickly if the SELECT retrieves from one > table, no other columns are retrieved, and there is no WHERE clause. For example: > > mysql> select COUNT(*) from student;" > ------------------------------------------------------------- > > A nice little optimization, maybe not possible in a MVCC system. I think the only thing you can do with MVCC is to cache the value and tranaction id for "SELECT AGG(*) FROM tab" and make the cached value visible to transaction id's greater than the one that executed the query, and invalidate the cache every time the table is modified. In fact, don't clear the cache, just record the transaction id of the table modification command so we can use standard visibility routines to make the cache usable as long as possiible. The cleanest way would probably be to create an aggregate cache system table, and to insert into it when someone does an unqualified aggregate, and to delete from it when someone modifies the table --- the MVCC tuple visibility rules are handled automatically. Queries can look in there to see if a visible cached value already exists. Of course, the big question is whether this would be a big win, and whether the cost of upkeep would justify it. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tuesday 15 Apr 2003 3:23 pm, Bruce Momjian wrote: > Dennis Gearon wrote: > > from mysql manual: > > ------------------------------------------------------------- > > "COUNT(*) is optimized to return very quickly if the SELECT retrieves > > from one table, no other columns are retrieved, and there is no WHERE > > clause. For example: > > > > mysql> select COUNT(*) from student;" > > ------------------------------------------------------------- > The cleanest way would probably be to create an aggregate cache system > table, and to insert into it when someone does an unqualified aggregate, > and to delete from it when someone modifies the table --- the MVCC tuple > visibility rules are handled automatically. Queries can look in there > to see if a visible cached value already exists. Of course, the big > question is whether this would be a big win, and whether the cost of > upkeep would justify it. If the rule system could handle something like: CREATE RULE quick_foo_count AS ON SELECT count(*) FROM foo DO INSTEAD SELECT quick_count FROM agg_cache WHERE tbl_name='foo'; The whole thing could be handled by user-space triggers/rules and still invisible to the end-user. -- Richard Huxton
Added to TODO: * Consider using MVCC to cache count(*) queries with no WHERE clause --------------------------------------------------------------------------- Bruce Momjian wrote: > Dennis Gearon wrote: > > from mysql manual: > > ------------------------------------------------------------- > > "COUNT(*) is optimized to return very quickly if the SELECT retrieves from one > > table, no other columns are retrieved, and there is no WHERE clause. For example: > > > > mysql> select COUNT(*) from student;" > > ------------------------------------------------------------- > > > > A nice little optimization, maybe not possible in a MVCC system. > > I think the only thing you can do with MVCC is to cache the value and > tranaction id for "SELECT AGG(*) FROM tab" and make the cached value > visible to transaction id's greater than the one that executed the > query, and invalidate the cache every time the table is modified. > > In fact, don't clear the cache, just record the transaction id of the > table modification command so we can use standard visibility routines to > make the cache usable as long as possiible. > > The cleanest way would probably be to create an aggregate cache system > table, and to insert into it when someone does an unqualified aggregate, > and to delete from it when someone modifies the table --- the MVCC tuple > visibility rules are handled automatically. Queries can look in there > to see if a visible cached value already exists. Of course, the big > question is whether this would be a big win, and whether the cost of > upkeep would justify it. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073