Re: estimating # of distinct values

Поиск
Список
Период
Сортировка
От tv@fuzzy.cz
Тема Re: estimating # of distinct values
Дата
Msg-id 1d0cb03ef3ab1c04fe87d6910b51d243.squirrel@sq.gransy.com
обсуждение исходный текст
Ответ на Re: estimating # of distinct values  (Csaba Nagy <ncslists@googlemail.com>)
Ответы Re: estimating # of distinct values  (Jim Nasby <jim@nasby.net>)
Re: estimating # of distinct values  (Csaba Nagy <ncslists@googlemail.com>)
Список pgsql-hackers
> On Thu, 2010-12-30 at 21:02 -0500, Tom Lane wrote:
>> How is an incremental ANALYZE going to work at all?
>
> How about a kind of continuous analyze ?
>
> Instead of analyzing just once and then drop the intermediate results,
> keep them on disk for all tables and then piggyback the background
> writer (or have a dedicated process if that's not algorithmically
> feasible) and before writing out stuff update the statistics based on
> the values found in modified buffers. Probably it could take a random
> sample of buffers to minimize overhead, but if it is done by a
> background thread the overhead could be minimal anyway on multi-core
> systems.

Hi,

the problem is you will eventually need to drop the results and rebuild
it, as the algorithms do not handle deletes (ok, Florian mentioned an
algorithm L_0 described in one of the papers, but I'm not sure we can use
it).

I'm not sure a constantly running background process is a good idea. I'd
prefer storing an info about the modified tuples somewhere, and starting
analyze only when a given threshold is reached. I'm not sure how to do
that, though.

Another thing I'm not sure about is where to store those intermediate
stats (used to get the current estimate, updated incrementally). I was
thinking about pg_stats but I'm not sure it's the right place - depending
on the algorithm, this may be a fet kilobytes up to several megabytes. And
it's not needed except when updating it. Any ideas?

regards
Tomas



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: system views for walsender activity
Следующее
От: Joachim Wieland
Дата:
Сообщение: Re: Snapshot synchronization, again...