Re: Remove useless GROUP BY columns considering unique index
От | David Rowley |
---|---|
Тема | Re: Remove useless GROUP BY columns considering unique index |
Дата | |
Msg-id | CAApHDvqHW0wo8rFzrtMgYa=bkxjfKOxSjiz_QqO5najWHjoRRg@mail.gmail.com обсуждение исходный текст |
Ответ на | Remove useless GROUP BY columns considering unique index (Zhang Mingli <zmlpostgres@gmail.com>) |
Ответы |
Re: Remove useless GROUP BY columns considering unique index
|
Список | pgsql-hackers |
On Sat, 30 Dec 2023 at 04:05, Zhang Mingli <zmlpostgres@gmail.com> wrote: > So my patch make it easy: check unique index’s columns, it’s a valid candidate if all of that have NOT NULL constraint. > And we choose a best one who has the least column numbers in get_min_unique_not_null_attnos(), as the reason: less columnsmean that more group by columns could be removed. This patch no longer applies. We no longer catalogue NOT NULL constraints, which this patch is coded to rely upon. (Likely it could just look at pg_attribute.attnotnull instead) Aside from that, I'm not sure about the logic to prefer removing functionally dependent columns using the constraint with the least columns. If you had the PRIMARY KEY on two int columns and a unique index on a 1MB varlena column, I think using the primary key would be a better choice to remove functionally dependent columns of. Maybe it's ok to remove any functionally dependant columns on the primary key first and try removing functionally dependent columns on unique constraints and a 2nd step (or maybe only if none can be removed using the PK?) Also, why constraints rather than unique indexes? You'd need to check for expression indexes and likely ignore those due to no ability to detect NOT NULL for expressions. Also, looking at the patch to how you've coded get_min_unique_not_null_attnos(), it looks like you're very optimistic about that being a constraint that has columns mentioned in the GROUP BY clause. It looks like it won't work if the UNIQUE constraint with the least columns gets no mention in the GROUP BY clause. That'll cause performance regressions from what we have today when the primary key is mentioned and columns can be removed using that but a UNIQUE constraint exists which has no corresponding GROUP BY columns and fewer unique columns than the PK. That's not good and it'll need to be fixed. Set to waiting on author. David
В списке pgsql-hackers по дате отправления: