Re: Make COUNT(*) Faster?

Поиск
Список
Период
Сортировка
От Steve Wampler
Тема Re: Make COUNT(*) Faster?
Дата
Msg-id 42CE96AF.3020903@noao.edu
обсуждение исходный текст
Ответ на Re: Make COUNT(*) Faster?  (Bruno Wolff III <bruno@wolff.to>)
Ответы Re: Make COUNT(*) Faster?  (Rod Taylor <pg@rbt.ca>)
Re: Make COUNT(*) Faster?  (Andrew Sullivan <ajs@crankycanuck.ca>)
Re: Make COUNT(*) Faster?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-sql
Bruno Wolff III wrote:
> No, it is an exact count.

Yes, for the transaction, but it's an approximation of the number of
tuples in the table - which is probably what the people who worry about
its cost are more interested in (an approximate count for the table).
I'm also claiming that a true count for any active table is meaningless and
am *not* suggesting that effort be spent on trying to produce such
a true count.

>>I assume this has been beaten well past death, but I don't see why it
>>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date
>>instead of updating it only on vacuums.
> 
> 
> Because it costs resources to keep track of that and people don't usually need
> exact tuple counts for whole tables. 

Yes, we agree completely! (Which is why I said 'a bit more' instead of
'exactly' above.)  My uses for COUNT(*) are to get 'reasonable' approximate
counts of the table sizes - not true counts, but approximate values.  Unfortunately,
pg_class.reltuples gets too far off too fast for me to use it as a consistent guide
to current table size.  If you Folks Who Know believe that simply keeping
pg_class.reltuples 'closer' to the actual table size is too expensive, I'll
accept that [after all, I have to right now anyway], but I'm surprised that
it is, given all the other work that must go on at the start/close of a transaction.

I also understand that 'reasonable' and 'closer' are vague terms.
In the example scenerio where there were around 80 rows in an indeterminate
state, my claim is that, in a table of around a million rows, it doesn't
matter whether some portion of those indeterminate rows are included in
an approximation of the table size or not (though it might in a table of
100 'true' rows - but the decision to ask for a true 'transaction' count (slow)
or an approximate table size (fast) should be left to the user in either case).

So, leave COUNT(*) alone.  But it would be very handy to have a
way to get an approximate table size that is more accurate than is
provided by a pg_class.reltuples that is only updated on vacuums.

-- 
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.


В списке pgsql-sql по дате отправления:

Предыдущее
От: PFC
Дата:
Сообщение: Re: two sums in one query
Следующее
От: Rod Taylor
Дата:
Сообщение: Re: Make COUNT(*) Faster?