Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq

From	Tom Lane
Subject	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
Date	March 4, 23:50:36
Msg-id	1657589.1772657436@sss.pgh.pa.us
In reply to	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq ("Joel Jacobson" <joel@compiler.org>)
Responses	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
List	pgsql-hackers

"Joel Jacobson" <joel@compiler.org> writes:
> On Tue, Mar 3, 2026, at 16:31, Tom Lane wrote:
>> This reminds me of the unfinished business at [1].  We really ought
>> to make it true that nulls never get into the hash table before
>> we assume that's so in costing.

> Hmm, OK, so there are cases when we don't discard NULLs when we should
> be able to? I was reading these lines in nodeHash.c and thought we would
> always be discarding them when possible:

>         if (!isnull)
>         {
> ...
>         }
>         else if (node->keep_null_tuples)
>         {
>             /* null join key, but we must save tuple to be emitted later */
> ...
>         }
>         /* else we can discard the tuple immediately */

I'm confused ... that keep_null_tuples bit appears nowhere in HEAD,
but it does appear in the patch at [1].

Anyway, the short answer is that we discard NULLs if possible, but
it's not possible when doing an outer join that requires returning
null-extended rows from the hashed side.
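
As a rough illustration of that rule (a standalone sketch, not code from
nodeHash.c): the enum and both function names below are hypothetical, and
the sketch assumes a strict join operator, so a null key can never match
anything. Under those assumptions, a null-keyed tuple on the hashed side
is droppable exactly when the join never has to return unmatched
hashed-side rows.

```c
#include <stdbool.h>

/*
 * Hypothetical join kinds, described from the point of view of the
 * hashed side.  These are illustration-only names, not PostgreSQL's
 * JoinType values.
 */
typedef enum SketchJoinKind
{
    SKETCH_JOIN_INNER,          /* only matched pairs are returned */
    SKETCH_JOIN_OUTER_PROBE,    /* unmatched probe-side rows are null-extended */
    SKETCH_JOIN_OUTER_HASHED,   /* unmatched hashed-side rows are null-extended */
    SKETCH_JOIN_FULL            /* unmatched rows from both sides are returned */
} SketchJoinKind;

/* Does this join have to return hashed-side rows that found no match? */
static bool
must_emit_unmatched_hashed_rows(SketchJoinKind kind)
{
    return kind == SKETCH_JOIN_OUTER_HASHED || kind == SKETCH_JOIN_FULL;
}

/*
 * Can a hashed-side tuple be thrown away before it is stored?
 * Assumes a strict join operator: a null key never matches, so such a
 * tuple matters only if it must later be emitted null-extended.
 */
static bool
can_discard_hashed_tuple(SketchJoinKind kind, bool key_is_null)
{
    if (!key_is_null)
        return false;           /* non-null keys can always match; keep them */
    return !must_emit_unmatched_hashed_rows(kind);
}
```

In this sketch, can_discard_hashed_tuple(SKETCH_JOIN_INNER, true) is true,
while the same call with SKETCH_JOIN_FULL is false, which is the case
where nulls still have to be kept around.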

I've now pushed the patch we were discussing before, and all that's
left to worry about (AFAIK) in estimate_hash_bucket_stats is its
handling of null join keys.  I'd prefer to get the other patch
in before worrying more about that.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/3061845.1746486714%40sss.pgh.pa.us
