Re: Why does a simple query not use an obvious index?

Поиск
Список
Период
Сортировка
От Scott Marlowe
Тема Re: Why does a simple query not use an obvious index?
Дата
Msg-id 1093815480.5493.29.camel@localhost.localdomain
обсуждение исходный текст
Ответ на Re: Why does a simple query not use an obvious index?  (Greg Stark <gsstark@mit.edu>)
Ответы Re: Why does a simple query not use an obvious index?  ("Scott Marlowe" <smarlowe@qwest.net>)
Re: Why does a simple query not use an obvious index?  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
Re: Why does a simple query not use an obvious index?  (Greg Stark <gsstark@mit.edu>)
Список pgsql-performance
On Sun, 2004-08-29 at 15:12, Greg Stark wrote:
> "Scott Marlowe" <smarlowe@qwest.net> writes:
>
> > Also, count(*) is likely to always generate a seq scan due to the way
> > aggregates are implemented currently in pgsql.  you might want to try:
>
> Huh? I'm curious to know what you're talking about here.

This has been discussed ad infinitum on the lists in the past.  And
explained by better minds than mine, but I'll give it a go.

PostgreSQL has a "generic" aggregate method.  Imagine instead doing a
select count(id1+id2-id3) from table where ...  In that instance, it's
not a simple shortcut to just grab the number of rows anymore.  Since
PostgreSQL uses a generic aggregate method that can be expanded by the
user with custom aggregates et. al., it has no optimizations to make
simple count(*) fast, like many other databases.

Add to that the fact that even when postgresql uses an index it still
has to hit the data store to get the actual value of the tuple, and
you've got very few instances in which an index scan of more than some
small percentage of the table is worth while.  I.e. a sequential scan
tends to "win" over an index scan quicker in postgresql than in other
databases like Oracle, where the data store is serialized and the
indexes have the correct information along with the application of the
transaction / roll back segments.

> > select somefield from sometable where timestampfield > now()-'60
> > seconds'::interval
> >
> > and count the number of returned rows.  If there's a lot, it won't be
> > any faster, if there's a few, it should be a win.
>
> Why would this ever be faster? And how could postgres ever calculate that
> without doing a sequential scan when count(*) would force it to do a
> sequential scan?

Because, count(*) CANNOT use an index.  So, if you're hitting, say,
0.01% of the table (let's say 20 out of 20,000,000 rows or something
like that) then the second should be MUCH faster.


В списке pgsql-performance по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Re: Why does a simple query not use an obvious index?
Следующее
От: "Scott Marlowe"
Дата:
Сообщение: Re: Why does a simple query not use an obvious index?