Re: count * performance issue

Поиск
Список
Период
Сортировка
От Joe Mirabal
Тема Re: count * performance issue
Дата
Msg-id 12ba3bf00803101354w15b023e1o237f564a1b01105f@mail.gmail.com
обсуждение исходный текст
Ответ на Re: count * performance issue  (Gregory Stark <stark@enterprisedb.com>)
Ответы Re: count * performance issue  (Bill Moran <wmoran@collaborativefusion.com>)
Re: count * performance issue  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Re: count * performance issue  (Greg Smith <gsmith@gregsmith.com>)
Список pgsql-performance
Gregory,

I just joined this listserv and was happy to see this posting.  I have a 400GB table that I have indexed (building the index took 27 hours) , Loading the table with 10 threads took 9 hours.  I run queries on the data nad get immediate max and min as well as other aggrgate functions very quickly, however a select count(*) of the table takes forever usually nearly an hour or more. 

Do you have any tuning recommendations.  We in our warehouse use the count(*) as our verification of counts by day/month's etc and in Netezza its immediate.  I tried by adding oids. BUT the situation I learned was that adding the oids in the table adds a significasnt amount of space to the data AND the index.

As you may gather from this we are relatively new on Postgres.

Any suggestions you can give me would be most helpful.

Cheers,
Joe

On Mon, Mar 10, 2008 at 11:16 AM, Gregory Stark <stark@enterprisedb.com> wrote:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Well, scanning an index to get a count might be significantly faster
> than scanning the main table, but it's hardly "instantaneous".  It's
> still going to take time proportional to the table size.

Hm, Mark's comment about bitmap indexes makes that not entirely true. A bitmap
index can do RLE compression which makes the relationship between the size of
the table and the time taken to scan the index more complex. In the degenerate
case where there are no concurrent updates (assuming you can determine that
quickly) it might actually be constant time.

> Unless they keep a central counter of the number of index entries;
> which would have all the same serialization penalties we've talked
> about before...

Bitmap indexes do in fact have concurrency issues -- arguably they're just a
baroque version of this central counter in this case.

--
 Gregory Stark
 EnterpriseDB          http://www.enterprisedb.com
 Ask me about EnterpriseDB's Slony Replication support!

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



--
Mirabili et Veritas
Joe Mirabal

В списке pgsql-performance по дате отправления:

Предыдущее
От: Rainer Pruy
Дата:
Сообщение: Re: Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Следующее
От: Bill Moran
Дата:
Сообщение: Re: count * performance issue