Re: Really bad blowups with hash outer join and nulls

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Really bad blowups with hash outer join and nulls
Дата
Msg-id 17667.1424014405@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Really bad blowups with hash outer join and nulls  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Ответы Re: Really bad blowups with hash outer join and nulls
Список pgsql-hackers
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> A quick test suggests that initializing the hash value to ~0 rather than
> 0 has a curious effect: the number of batches still explodes, but the
> performance does not suffer the same way. (I think because almost all
> the batches end up empty.) I think this is worth doing even in the
> absence of a more general solution; nulls are common enough and
> important enough that they shouldn't be the worst-case value if it can
> be avoided.

I think that's unlikely to be a path to a good solution.

At least part of the problem here is that estimate_hash_bucketsize()
supposes that nulls can be ignored --- which is normally true, and
invalidates your claim that they're common.  But in a RIGHT JOIN
situation, they need to be considered as if they were regular keys.
That would probably be sufficient to dissuade the planner from choosing
a hash join in this example.

There may also be something we can do in the executor, but it would
take closer analysis to figure out what's going wrong.  I don't think
kluging the behavior for NULL in particular is the answer.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: Logical Replication Helpers WIP for discussion
Следующее
От: Florian Pflug
Дата:
Сообщение: Re: restrict global access to be readonly