Re: Remove useless GROUP BY columns considering unique index

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Remove useless GROUP BY columns considering unique index
Дата	12 сентября 2024 г. 04:43:42
Msg-id	CAApHDvqHW0wo8rFzrtMgYa=bkxjfKOxSjiz_QqO5najWHjoRRg@mail.gmail.com обсуждение исходный текст
Ответ на	Remove useless GROUP BY columns considering unique index (Zhang Mingli <zmlpostgres@gmail.com>)
Ответы	Re: Remove useless GROUP BY columns considering unique index
Список	pgsql-hackers

Дерево обсуждения

On Sat, 30 Dec 2023 at 04:05, Zhang Mingli <zmlpostgres@gmail.com> wrote:
> So my patch make it easy: check unique index’s columns, it’s a valid candidate if all of that have NOT NULL
constraint.
> And we choose a best one who has the least column numbers in get_min_unique_not_null_attnos(), as the reason: less
columnsmean that more group by columns could be removed.

This patch no longer applies. We no longer catalogue NOT NULL
constraints, which this patch is coded to rely upon. (Likely it could
just look at pg_attribute.attnotnull instead)

Aside from that, I'm not sure about the logic to prefer removing
functionally dependent columns using the constraint with the least
columns. If you had the PRIMARY KEY on two int columns and a unique
index on a 1MB varlena column, I think using the primary key would be
a better choice to remove functionally dependent columns of. Maybe
it's ok to remove any functionally dependant columns on the primary
key first and try removing functionally dependent columns on unique
constraints and a 2nd step (or maybe only if none can be removed using
the PK?)

Also, why constraints rather than unique indexes? You'd need to check
for expression indexes and likely ignore those due to no ability to
detect NOT NULL for expressions.

Also, looking at the patch to how you've coded
get_min_unique_not_null_attnos(), it looks like you're very optimistic
about that being a constraint that has columns mentioned in the GROUP
BY clause. It looks like it won't work if the UNIQUE constraint with
the least columns gets no mention in the GROUP BY clause. That'll
cause performance regressions from what we have today when the primary
key is mentioned and columns can be removed using that but a UNIQUE
constraint exists which has no corresponding GROUP BY columns and
fewer unique columns than the PK. That's not good and it'll need to be
fixed.

Set to waiting on author.

David

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Remove useless GROUP BY columns considering unique index