Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Дата	10 июня 2008 г. 23:03:58
Msg-id	26169.1213139023@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics ("Nathan Boley" <npboley@gmail.com>)
Ответы	Re: Runtime checking of MCV (Was: ... histogram bucket numdistinct statistics) Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Список	pgsql-hackers

Дерево обсуждения

"Nathan Boley" <npboley@gmail.com> writes:
>> (1) On what grounds do you assert the above?

> For a table with 1000000 non-mcv rows, the planner estimates a result
> set of cardinality 1000000/20 = 50000, not 1.

The real problem in that situation is that you need another twenty slots
in the MCV list.  The MCV list should *always* exhaust the set of values
for which it'd be bad to do an indexscan.  Assuming that the threshold
for switching to an indexscan is somewhere around selectivity 0.005
(I am not certain offhand, but it's in that general area), this cannot
possibly require more than 200 MCV slots, and for most data
distributions it'd be a whole lot less.

Given such an MCV list, the planner will always make the right choice
of whether to do index or seqscan ... as long as it knows the value
being searched for, that is.  Parameterized plans have a hard time here,
but that's not really the fault of the statistics.

> I see where the confusion is coming from. When I said
> What if ( somehow ) we knew that there was a 90%
> chance that query would return an empty result set
> I meant that the planner doesn't know that information. And how could it?

Hmm.  IIRC the estimates are set up on the assumption that you are
searching for a value that occurs in the table.  I suppose there are
applications where that's often false, but as you say, it's hard to know
that in advance.
        regards, tom lane

В списке pgsql-hackers по дате отправления:

Предыдущее

От: "Nathan Boley"
Дата: 10 июня 2008 г., 21:31:20
Сообщение: Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

Следующее

От: Gregory Stark
Дата: 10 июня 2008 г., 23:19:48
Сообщение: Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics

Предыдущее

Следующее