Re: [HACKERS] Re: Top N queries and disbursion

From Roberto Cornacchia
Subject Re: [HACKERS] Re: Top N queries and disbursion
Date
Msg-id 37FDEF32.DDE759AC@tin.it
In response to Re: [HACKERS] Re: Top N queries and disbursion  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] Re: Top N queries and disbursion
Re: [HACKERS] Re: Top N queries and disbursion
List pgsql-hackers
Bruce Momjian wrote:
> 
> > No, it's certainly not the right thing.  To my understanding, disbursion
> > is a measure of the frequency of the most common value of an attribute;
> > but that tells you very little about how many other values there are.
> > 1/disbursion is a lower bound on the number of values, but it wouldn't
> > be a good estimate unless you had reason to think that the values were
> > pretty evenly distributed.  There could be a *lot* of very-infrequent
> > values.
> >
> > > with 100 distinct values of an attribute uniformly distributed in a
> > > relation of 10000 tuples, disbursion was estimated as 0.002275, giving
> > > us 440 distinct values.
> >
> > This is an illustration of the fact that Postgres' disbursion-estimator
> > is pretty bad :-(.  It usually underestimates the frequency of the most
> > common value, unless the most common value is really frequent
> > (probability > 0.2 or so).  I've been trying to think of a more accurate
> > way of figuring the statistic that wouldn't be unreasonably slow.
> > Or, perhaps, we should forget all about disbursion and adopt some other
> > statistic(s).
> 
> Yes, you have the crux of the issue.  I wrote it because it was the best
> thing I could think of, but it is non-optimal.  Because all the
> optimal solutions seemed too slow to me, I couldn't think of a better
> one.

Thank you, Tom and Bruce.
This is not good news for us :-(. In any case, is 1/disbursion the
best estimate we can have for now, even if not optimal?
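
To make the lower-bound argument above concrete, here is a minimal sketch
in Python. It assumes, as Tom describes, that disbursion is the frequency
of the most common value; it computes that frequency exactly from the data
and is not the estimator Postgres actually uses. The function names and
the two sample columns are hypothetical, chosen only for illustration.

```python
from collections import Counter


def most_common_frequency(values):
    """Fraction of rows that hold the most common value.

    This is the reading of 'disbursion' used in this thread, computed
    exactly here rather than estimated.
    """
    counts = Counter(values)
    return max(counts.values()) / len(values)


def distinct_lower_bound(disbursion):
    """1/disbursion: a lower bound on the number of distinct values."""
    return 1.0 / disbursion


# 100 distinct values spread evenly over 10000 tuples.
uniform = [i % 100 for i in range(10000)]

# One value fills half the relation; every other value occurs once.
skewed = [0] * 5000 + list(range(1, 5001))

f_uniform = most_common_frequency(uniform)
f_skewed = most_common_frequency(skewed)

print("uniform:", distinct_lower_bound(f_uniform), "vs actual", len(set(uniform)))
print("skewed: ", distinct_lower_bound(f_skewed), "vs actual", len(set(skewed)))

# The estimated disbursion quoted in this thread for the uniform case.
print("from estimate 0.002275:", round(distinct_lower_bound(0.002275)))
```

With an exact most-common-value frequency, the bound matches the true
count only in the evenly distributed case. In the skewed case it falls far
below the real number of distinct values, and an underestimated disbursion
such as 0.002275 pushes the figure above the true count of 100.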

Roberto Cornacchia
Andrea Ghidini


