Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?

From: Tom Lane
Subject: Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?
Msg-id: 149287.1604106275@sss.pgh.pa.us
In reply to: Re: Should the function get_variable_numdistinct consider the case when stanullfrac is 1.0?  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

I wrote:
> * It's not apparent why, if ANALYZE's sample is all nulls, we wouldn't
> conclude stadistinct = 0 and thus arrive at the desired answer that
> way.  (Since we have a complaint, I'm guessing that ANALYZE might
> disbelieve its own result and stick in some larger stadistinct.  But
> then maybe that's where to fix this, not here.)

Oh, on second thought (and with some testing): ANALYZE *does* report
stadistinct = 0.  The real issue is that get_variable_numdistinct is
assuming it can use that value as meaning "stadistinct is unknown".
So maybe we should just fix that, probably by adding an explicit
bool flag for that condition.
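
A minimal sketch of how an explicit flag could separate "no statistics" from "stadistinct = 0". This is a simplified, standalone model, not the actual planner code: the struct, the have_stats field, the function name, and the fallback constant are all hypothetical, and returning 0 for the all-NULL case is an assumption (a real caller might clamp it).

```c
#include <stdbool.h>

/* Hypothetical fallback used only when nothing is known about the column. */
#define SKETCH_DEFAULT_NUM_DISTINCT 200.0

/* Hypothetical, simplified stand-in for a column's statistics. */
typedef struct SketchColumnStats
{
    bool    have_stats;     /* explicit flag: are statistics available? */
    double  stadistinct;    /* > 0: absolute number of distinct values
                             * < 0: negated fraction of the row count
                             * = 0: no non-null values were sampled */
} SketchColumnStats;

/*
 * Estimate the number of distinct non-null values.  "Unknown" is decided
 * by the flag alone, so stadistinct = 0 keeps its literal meaning.
 */
static double
sketch_numdistinct(const SketchColumnStats *stats, double ntuples,
                   bool *isdefault)
{
    *isdefault = false;

    if (!stats->have_stats)
    {
        /* Truly unknown: fall back to the default guess. */
        *isdefault = true;
        return SKETCH_DEFAULT_NUM_DISTINCT;
    }

    if (stats->stadistinct > 0.0)
        return stats->stadistinct;              /* absolute count */

    if (stats->stadistinct < 0.0)
        return -stats->stadistinct * ntuples;   /* fraction of rows */

    /* Known result: every sampled value was NULL. */
    return 0.0;
}
```

With this shape, a column whose sample was entirely NULL yields 0 with isdefault left false, while a column that was never analyzed still gets the default guess.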

BTW ... I've not looked at the callers, but now I'm wondering whether
get_variable_numdistinct ought to count NULL as one of the "distinct"
values.  In applications such as estimating the number of GROUP BY
groups, it seems like that would be correct.  There might be some
callers that don't want it though.
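
To illustrate the GROUP BY case, here is a rough sketch in which the caller chooses whether NULL counts as one more group. The function name and the count_null parameter are hypothetical; nothing here reflects an existing planner interface.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: given an estimate of distinct non-null values and
 * the column's null fraction, optionally count NULL as one extra group.
 */
static double
sketch_groups_with_null(double ndistinct_nonnull, double stanullfrac,
                        bool count_null)
{
    double  ngroups = ndistinct_nonnull;

    /* NULLs form a single group of their own, if there are any. */
    if (count_null && stanullfrac > 0.0)
        ngroups += 1.0;

    return ngroups;
}
```

For an all-NULL column (zero non-null distinct values, stanullfrac = 1.0) this gives one group when count_null is true, and zero when a caller opts out.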

            regards, tom lane


