Re: Useless memoize path generated for unique join on primary keys

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Useless memoize path generated for unique join on primary keys
Дата	3 мая 2022 г. 14:43:09
Msg-id	CAApHDvpPvJ7HwKRPTXhrGbV1bERVQmnf819a6sPLirOBK=4Q-g@mail.gmail.com обсуждение исходный текст
Ответ на	Useless memoize path generated for unique join on primary keys (Benjamin Coutu <ben.coutu@zeyos.com>)
Ответы	Re: Useless memoize path generated for unique join on primary keys (Benjamin Coutu <ben.coutu@zeyos.com>)
Список	pgsql-performance

Дерево обсуждения

On Tue, 3 May 2022 at 23:05, Benjamin Coutu <ben.coutu@zeyos.com> wrote:
>   ->  Memoize  (cost=0.58..0.67 rows=1 width=16) (actual time=0.002..0.002 rows=0 loops=4067215)
>         Cache Key: e2.field, e2.index
>         Cache Mode: logical
>         Hits: 0  Misses: 4067215  Evictions: 3228355  Overflows: 0  Memory Usage: 65537kB
>         Buffers: shared hit=16268863
>         ->  Index Only Scan using pk_extdataregular on extdataregular e1  (cost=0.57..0.66 rows=1 width=16) (actual
time=0.001..0.001rows=0 loops=4067215)

>               Index Cond: ((field = e2.field) AND (index = e2.index))

> Why would it ever make sense to generate a memoize plan for a unique join?

It wouldn't ever make sense.

The problem is that estimate_num_groups() is used to estimate the
number of distinct values and that function does not know about
primary keys. There's no way the costing of Memoize would allow a
Memoize plan to be used if it thought all values were unique, so the
only possibility here is that ndistinct is being underestimated by
some amount that makes Memoize look like the most favourable plan.

You could see what the planner thinks about the ndistinct estimate on
field, index by doing:

EXPLAIN SELECT field,index FROM extdataregular GROUP BY 1,2;

Whatever you see in the final row estimate for that plan is what's
being fed into the Memoize costing code.

> I think this issue might tie in with the current discussion over on the hackers mailing list [1]

I'd say it's a pretty different problem. The cache hit ratio
discussion on that thread talks about underestimating the hit ratio.
That particular problem could only lead to Memoize plans *not* being
chosen when they maybe should be. Not the other way around, which is
your case.

create statistics extdataregular_field_index_stats (ndistinct) on
field, index from extdataregular;
analyze extdataregular;

would likely put that right.

David

В списке pgsql-performance по дате отправления:

Предыдущее

От: Benjamin Coutu
Дата: 03 мая 2022 г., 14:05:30
Сообщение: Useless memoize path generated for unique join on primary keys

Следующее

От: Benjamin Coutu
Дата: 03 мая 2022 г., 15:21:46
Сообщение: Re: Useless memoize path generated for unique join on primary keys

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Useless memoize path generated for unique join on primary keys

Предыдущее

Следующее