Re: Seqscan in MAX(index_column)

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Seqscan in MAX(index_column)
Дата
Msg-id 200309050202.h8522jr06156@candle.pha.pa.us
обсуждение исходный текст
Ответ на Re: Seqscan in MAX(index_column)  (Christopher Browne <cbbrowne@acm.org>)
Ответы Re: Seqscan in MAX(index_column)  (Neil Conway <neilc@samurai.com>)
Список pgsql-hackers
Christopher Browne wrote:
> > IMHO portability is an important point. People are used to MAX() and
> > COUNT(*), and will be surprised that they need some special
> > treatment. While the reasons for this are perfectly explainable,
> > speeding up these aggregates with some extra effort would make porting
> > a bit easier.
> 
> The availability of cleverness with MAX()/MIN() is no grand surprise;
> it would be very nice to get some expansion of that to "SELECT VALUE
> FROM TABLE WHERE (CRITERIA) ORDER BY VALUE DESCENDING LIMIT 1;"
> 
> But I'm _very_ curious as to what the anticipated treatment to collect
> COUNT() more efficiently would be.  I would expect that it would only
> be able to get tuned much more if there's NO "where" clause, so that
> it could use some ("magically-kept-up-to-date") stats on table size.
> 
> I don't see any way to optimize COUNT when numbers of rows can
> continually vary.  Storing stats somewhere will just make updates more
> expensive.  And if those stats are for the table, that doesn't help me
> if I want "COUNT(*) FROM TABLE WHERE UPDATED_ON BETWEEN NOW() - '1
> day' and NOW()".

Yes, count would only use the cached stats for non-WHERE clause
COUNT(*).

My idea is that if a transaction doing a COUNT(*) would first look to
see if there already was a visible cached value, and if not, it would do
the COUNT(*) and insert into the cache table.  Any INSERT/DELETE would
remove the value from the cache.  As I see it, the commit of the
INSERT/DELETE transaction would then auto-invalidate the cache at the
exact time the transaction commits.  This would allow MVCC visibility of
the counts.

A trickier idea would be for INSERT/DELETE to UPDATE the cached value. 
It might be possible to always have a valid cache value for COUNT(*).
(COPY would also need to update the cache.)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Christopher Browne
Дата:
Сообщение: Re: Seqscan in MAX(index_column)
Следующее
От: Neil Conway
Дата:
Сообщение: Re: Seqscan in MAX(index_column)