Re: Statistics and selectivity estimation for ranges

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Statistics and selectivity estimation for ranges
Дата
Msg-id 50338C2E.5060908@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Statistics and selectivity estimation for ranges  (Alexander Korotkov <aekorotkov@gmail.com>)
Список pgsql-hackers
On 20.08.2012 00:31, Alexander Korotkov wrote:
> On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas<
> heikki.linnakangas@enterprisedb.com>  wrote:
>
>> On 15.08.2012 11:34, Alexander Korotkov wrote:
>>
>>> Ok, we've to decide if we need "standard" histogram. In some cases it can
>>> be used for more accurate estimation of<   and>   operators.
>>> But I think it is not so important. So, we can replace "standard"
>>> histogram
>>> with histograms of lower and upper bounds?
>>
>> Yeah, I think that makes more sense. The lower bound histogram is still
>> useful for<  and>  operators, just not as accurate if there are lots of
>> values with the same lower bound but different upper bound.
>
> New version of patch.
> * Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and
> upper bounds histograms combined into single ranges array, instead
> of STATISTIC_KIND_HISTOGRAM.

Ah, that's an interesting approach. So essentially, the histogram looks 
just like a normal STATISTIC_KIND_HISTOGRAM histogram, but the values 
stored in it are not picked the usual way. The usual way would be to 
pick N evenly-spaced values from the column, and store those. Instead, 
you pick N evenly-spaced lower bounds, and N evenly-spaced upper bounds, 
and construct N range values from those. Looking at a single value in 
the histogram, its lower bound comes from a different row than its upper 
bound.

That's pretty clever - the histogram has a shape and order that's 
compatible with a histogram you'd get with the standard scalar 
typanalyze function. In fact, I think you could just let the standard 
scalar estimators for < and > to use that histogram as is. Perhaps we 
should use STATISTIC_KIND_HISTOGRAM for this after all...

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: multi-master pgbench?
Следующее
От: Sachin Srivastava
Дата:
Сообщение: PostgreSQL 9.2beta4 (& git HEAD) server crash on creating extension plpython3u