Re: Query plan prefers hash join when nested loop is much faster

From: iulian dragos
Subject: Re: Query plan prefers hash join when nested loop is much faster
Date:
Msg-id: CAMNsu3k_MtN3L+JAsn_gpS_b3gpCrhJ=x1JvEKL0LK-07H14SQ@mail.gmail.com
In reply to: Re: Query plan prefers hash join when nested loop is much faster (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-general


On Tue, Aug 25, 2020 at 12:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 25 Aug 2020 at 22:10, iulian dragos
<iulian.dragos@databricks.com> wrote:
> Thanks for the tip! Indeed, `n_distinct` isn't right. I found it in pg_stats set at 131736.0, but the actual number is much higher: 210104361. I tried to set it manually, but the plan is still the same (both the actual number and a percentage, -0.4, as you suggested):

You'll need to run ANALYZE on the table after doing the ALTER TABLE to
change the n_distinct.  The ANALYZE writes the value to pg_statistic.
ALTER TABLE only takes it as far as pg_attribute's attoptions.
ANALYZE reads that column to see if the n_distinct estimate should be
overwritten before writing out pg_statistic.

Ah, rookie mistake. Thanks for clarifying this. Indeed, after I ran ANALYZE the faster plan was selected! Yay!
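
For anyone following along, the full sequence looks roughly like this (a minimal sketch; my_table and my_col are placeholder names, not the ones from my actual query):

    -- at this point the override only lands in pg_attribute.attoptions
    ALTER TABLE my_table ALTER COLUMN my_col SET (n_distinct = 210104361);

    -- ANALYZE reads the override and writes it into pg_statistic
    ANALYZE my_table;

    -- check what the planner will now see
    SELECT n_distinct FROM pg_stats
    WHERE tablename = 'my_table' AND attname = 'my_col';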
 
Just remember if you're hardcoding a positive value that it'll stay
fixed until you change it. If the table is likely to grow, then you
might want to reconsider using a positive value and consider using a
negative value as mentioned in the doc link.

Good point, I went for -0.4 and that seems to be doing the trick!
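
With the negative form the override is stored as a fraction of the row count, so the estimate should keep up as the table grows (again a sketch with the same placeholder names):

    ALTER TABLE my_table ALTER COLUMN my_col SET (n_distinct = -0.4);
    ANALYZE my_table;

    -- pg_stats now reports n_distinct = -0.4, i.e. roughly 40% of rows distinct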

Thanks a lot for helping out!
 

David
