Re: Much Ado About COUNT(*)

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Much Ado About COUNT(*)
Дата
Msg-id 1106170367.2886.623.camel@jeff
обсуждение исходный текст
Ответ на Re: Much Ado About COUNT(*)  (Alvaro Herrera <alvherre@dcc.uchile.cl>)
Ответы Re: Much Ado About COUNT(*)  ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>)
Список pgsql-hackers
To fill in some details I think what he's saying is this:

=> create table foo(...);
=> create table foo_count(num int);
=> insert into foo_count values(0);
=> create table foo_change(num int);

then create a trigger "after delete on foo" that does "insert into
foo_change values(-1)" and a trigger "after insert on foo" that inserts
a +1 into foo_change.

Periodically, do:
=> begin;
=> set transaction isolation level serializable;
=> update foo_count set num=num+(select sum(num) from foo_change);
=> delete from foo_change;
=> commit;
=> VACUUM;

And then any time you need the correct count(*) value, do instead:
=> select sum(num) from (select num from foo_count union select num from
foo_change);

And that should work. I haven't tested this exact example, so I may have
overlooked something.

Hope that helps. That way, you don't have huge waste from the second
table, and also triggers maintain it for you and you don't need to think
about it.

Regards,Jeff Davis

On Wed, 2005-01-19 at 17:40 -0300, Alvaro Herrera wrote:
> On Wed, Jan 19, 2005 at 10:16:38AM -0600, Bruno Wolff III wrote:
> > On Wed, Jan 19, 2005 at 14:59:17 -0000,
> >   Mark Cave-Ayland <m.cave-ayland@webbased.co.uk> wrote:
> >
> > > So then I would use SELECT COUNT(*) FROM person_count whenever I wanted to
> > > know the current number of person records. How much quicker would a COUNT(*)
> > > be if visibility were included in the indices as opposed to a "hacked"
> > > approach like this?
> > 
> > You are only going to get a constant factor speed up unless the space savings
> > allows much better use of cache. You probably want to look at using
> > triggers to maintain counts in another table.
> 
> I'd try using a "start value" and a differences list.  So the
> differences list would be initially empty and the start value would be
> 0.  On insert or delete, you create a new difference (with +1 or
> whatever).  Periodically, the differences would be added to the start
> value and the records deleted.  Thus the time to calculate the total
> count is much smaller, and it follows MVCC rules.  Of course there are
> lots of minor details not mentioned here.
> 
> Not sure if I'd model this with a single table or two.
> 



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Marc G. Fournier"
Дата:
Сообщение: Re: Two-phase commit for 8.1
Следующее
От: Neil Conway
Дата:
Сообщение: Re: Caching of frequently used objects