Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
| От | Tom Lane |
|---|---|
| Тема | Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq |
| Дата | |
| Msg-id | 1657589.1772657436@sss.pgh.pa.us обсуждение исходный текст |
| Ответ на | Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq ("Joel Jacobson" <joel@compiler.org>) |
| Ответы |
Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
|
| Список | pgsql-hackers |
"Joel Jacobson" <joel@compiler.org> writes:
> On Tue, Mar 3, 2026, at 16:31, Tom Lane wrote:
>> This reminds me of the unfinished business at [1]. We really ought
>> to make it true that nulls never get into the hash table before
>> we assume that's so in costing.
> Hmm, OK, so there are cases when we don't discard NULLs when we should
> be able to? I was reading these lines in nodeHash.c and thought we would
> always be discarding them when possible:
> if (!isnull)
> {
> ...
> }
> else if (node->keep_null_tuples)
> {
> /* null join key, but we must save tuple to be emitted later */
> ...
> }
> /* else we can discard the tuple immediately */
I'm confused ... that keep_null_tuples bit appears nowhere in HEAD,
but it does appear in the patch at [1].
Anyway, the short answer is that we discard NULLs if possible, but
it's not possible when doing an outer join that requires returning
null-extended rows from the hashed side.
I've now pushed the patch we were discussing before, and all that's
left to worry about (AFAIK) in estimate_hash_bucket_stats is its
handling of null join keys. I'd prefer to get the other patch
in before worrying more about that.
regards, tom lane
[1] https://www.postgresql.org/message-id/flat/3061845.1746486714%40sss.pgh.pa.us
В списке pgsql-hackers по дате отправления: