AW: analyze.c

From Zeugswetter Andreas SB
Subject AW: analyze.c
Date
Msg-id 11C1E6749A55D411A9670001FA6879633680A9@sdexcsrv1.f000.d0188.sd.spardat.at
List pgsql-hackers
> > I've been reading something about implementation of histograms, and,
> > AFAIK, in practice histograms is just a cool name for no more than:
> >    1. top ten with frequency for each
> >    2. the same for top ten worse
> >    3. average for the rest

Consider that we only need this info to choose an index. If even an average value were too
frequent for the index to be efficient, you could safely drop the index, since it would be useless.
Thus it seems to me that keeping stats on the most infrequent values (point 2) is pointless.
I would also expect these to be the most volatile values, so the stats would only be
accurate for a short period of time.

I think what we need is the following:
1. our current histograms
2. a list of exceptions for values that are very frequent
A value is exceptional if it would skew the distribution too much.
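
To make the exception list concrete, here is a minimal sketch of how an equality estimate could consult it. Everything in it is an assumption for illustration: the names FrequentValue, ColumnStats and eq_selectivity are hypothetical and are not taken from analyze.c, and it only covers "col = value" (range estimates would still come from the histogram).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one very frequent ("exceptional") value and its share of rows. */
typedef struct
{
    int     value;
    double  frequency;          /* fraction of all rows, 0.0 .. 1.0 */
} FrequentValue;

/* Hypothetical per-column stats: the exception list plus a summary of the rest. */
typedef struct
{
    const FrequentValue *exceptions;
    size_t  n_exceptions;
    double  n_distinct_rest;    /* distinct values NOT in the exception list */
} ColumnStats;

/* Estimate the fraction of rows matching "col = value". */
double
eq_selectivity(const ColumnStats *stats, int value)
{
    double  covered = 0.0;      /* share of rows held by listed values */
    size_t  i;

    assert(stats != NULL);

    for (i = 0; i < stats->n_exceptions; i++)
    {
        /* An exceptional value uses its own recorded frequency. */
        if (stats->exceptions[i].value == value)
            return stats->exceptions[i].frequency;
        covered += stats->exceptions[i].frequency;
    }

    if (stats->n_distinct_rest <= 0.0)
        return 0.0;

    /* Any other value gets the average share of the remaining rows. */
    return (1.0 - covered) / stats->n_distinct_rest;
}
```

With two listed values holding 50% and 25% of the rows and 100 other distinct values, a listed value returns its own frequency, while any other value is estimated at 0.25 / 100 of the rows.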

Very infrequent values should not be used as min|max values of histogram buckets,
but that is imho all that needs to be done for infrequent values.
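
One possible reading of this, sketched under assumptions: when picking the outermost histogram bounds, skip values that occur fewer than some threshold number of times. The names ValueCount and histogram_bounds and the min_count threshold are hypothetical, the input is assumed to be sorted by value, and only the outermost bounds are handled here, not the interior bucket boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: a distinct value and how many sampled rows carry it. */
typedef struct
{
    int     value;
    long    count;
} ValueCount;

/*
 * Pick the outermost histogram bounds from distinct values sorted by value,
 * ignoring values seen fewer than min_count times.
 * Returns 1 and sets *lo and *hi on success, 0 if no value qualifies.
 */
int
histogram_bounds(const ValueCount *vals, size_t n, long min_count,
                 int *lo, int *hi)
{
    size_t  first = 0;
    size_t  last = n;

    assert(vals != NULL || n == 0);
    assert(lo != NULL && hi != NULL);

    /* Skip very infrequent values at the low end. */
    while (first < n && vals[first].count < min_count)
        first++;
    if (first == n)
        return 0;

    /* Skip them at the high end; vals[first] qualifies, so this stops. */
    while (vals[last - 1].count < min_count)
        last--;

    *lo = vals[first].value;
    *hi = vals[last - 1].value;
    return 1;
}
```

For values {-999, 1, 2, 3, 100000} with counts {1, 40, 55, 38, 2} and a threshold of 10, the bounds come out as 1 and 3, so the two rare outliers no longer stretch the histogram range.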

Andreas

