Re: Performance of count(*)

Поиск
Список
Период
Сортировка
От Andreas Kostyrka
Тема Re: Performance of count(*)
Дата
Msg-id 20070322122944.GD4439@andi-lap.la.revver.com
обсуждение исходный текст
Ответ на Re: Performance of count(*)  (Andreas Tille <tillea@rki.de>)
Ответы Re: Performance of count(*)  (Michael Fuhr <mike@fuhr.org>)
Re: Performance of count(*)  (Carlos Moreno <moreno_pg@mochima.com>)
Список pgsql-performance
* Andreas Tille <tillea@rki.de> [070322 13:24]:
> On Thu, 22 Mar 2007, Andreas Kostyrka wrote:
>
> >Which version of PG?
>
> Ahh, sorry, forgot that.  The issue occurs in Debian (Etch) packaged
> version 7.4.16.  I plan to switch soon to 8.1.8.
I'd recommend 8.2 if at all possible :)
>
> >That's the reason why PG (check the newest releases, I seem to
> >remember that there has been some aggregate optimizations there),
>
> I'll verify this once I moved to the new version.
8.1 won't help you I'd guess. ;)

>
> >does
> >a SeqScan for select count(*) from table. btw, depending upon your
> >data, doing a select count(*) from table where user=X will use an
> >Index, but will still need to fetch the rows proper to validate them.
>
> I have an index on three (out of 7 columns) of this table.  Is there
> any chance to optimize indexing regarding this.
Well, that depends upon you query pattern. It's an art and a science
at the same time ;)
>
> Well, to be honest I'm not really interested in the performance of
> count(*).  I was just discussing general performance issues on the
> phone line and when my colleague asked me about the size of the
> database he just wonderd why this takes so long for a job his
> MS-SQL server is much faster.  So in principle I was just asking
> a first question that is easy to ask.  Perhaps I come up with
> more difficult optimisation questions.

Simple. MSSQL is optimized for this case, and uses "older"
datastructures. PG uses a MVCC storage, which is not optimized for
this usecase. It's quite fast for different kinds of queries.

The basic trouble here is that mvcc makes it a little harder to decide
what is valid for your transaction, plus the indexes seems to be
designed for lookup, not for data fetching. (Basically, PG can use
indexes only to locate potential data, but cannot return data directly
out of an index)

Andreas

В списке pgsql-performance по дате отправления:

Предыдущее
От: David Brain
Дата:
Сообщение: Potential memory usage issue
Следующее
От: Bill Moran
Дата:
Сообщение: Re: Performance of count(*)