Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: Ashutosh Bapat
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date:
Msg-id: CAExHW5sG2Q7aPAh4vpk85QhnuFfDBJYc3yFNGb43x6vc498rsA@mail.gmail.com
In reply to: Re: [PATCH] Erase the distinctClause if the result is unique by definition  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers


On Mon, Feb 10, 2020 at 10:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
> >> On Sat, Feb 8, 2020 at 12:53 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> >> Do you mean adding some information into PlannerInfo, and when we create
> >> a node for Unique/HashAggregate/Group, we can just create a dummy node?
>
> > Not so much as PlannerInfo but something on lines of PathKey. See PathKey
> > structure and related code. What I envision is PathKey class is also
> > annotated with the information whether that PathKey implies uniqueness.
> > E.g. a PathKey derived from a Primary index would imply uniqueness also. A
> > PathKey derived from say Group operation also implies uniqueness. Then just
> > by looking at the underlying Path we would be able to say whether we need
> > Group/Unique node on top of it or not. I think that would make it much
> > wider usecase and a very useful optimization.
>
> FWIW, that doesn't seem like a very prudent approach to me, because it
> confuses sorted-ness with unique-ness.  PathKeys are about sorting,
> but it's possible to have uniqueness guarantees without having sorted
> anything, for instance via hashed grouping.
>
> I haven't looked at this patch, but I'd expect it to use infrastructure
> related to query_is_distinct_for(), and that doesn't deal in PathKeys.

Thanks for the pointer. I think there's another problem with my approach. PathKeys are specific to paths, since the order of the result depends on the Path. But uniqueness is a property of the result, i.e. the relation, and thus should be attached to RelOptInfo, as query_is_distinct_for() does. I think uniqueness should bubble up the RelOptInfo tree, annotating each RelOptInfo with the minimum set of TLEs that make the result from that relation unique. We could then eliminate the extra Group/Unique node if the underlying RelOptInfo's unique column set is a subset of the columns required to be unique.
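
To make the subset test concrete, here is a rough toy model in Python. It is only a sketch, not planner code: the Rel class, is_distinct_for() and join_unique_keys() are hypothetical names invented for this illustration, and the join rule is a deliberately conservative assumption that covers inner joins only.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List

ColumnSet = FrozenSet[str]


@dataclass
class Rel:
    """Toy stand-in for a RelOptInfo (hypothetical, not the real struct).

    unique_keys lists column sets that are each known to make the rows
    of this relation unique.
    """
    name: str
    unique_keys: List[ColumnSet] = field(default_factory=list)


def is_distinct_for(rel: Rel, distinct_cols: Iterable[str]) -> bool:
    """Return True if a Group/Unique step over distinct_cols is redundant.

    That is the case when some known unique key of rel is a subset of
    the columns that are required to be distinct.
    """
    required = frozenset(distinct_cols)
    return any(key <= required for key in rel.unique_keys)


def join_unique_keys(outer: Rel, inner: Rel) -> List[ColumnSet]:
    """Bubble unique keys up through an inner join (conservative rule).

    Each joined row pairs one outer row with one inner row, so the union
    of a key from each side still identifies a joined row. Smaller keys
    may exist (e.g. when joining on a unique column), but this sketch
    does not try to find them.
    """
    return [o | i for o in outer.unique_keys for i in inner.unique_keys]
```

With a relation whose only unique key is {orders.id}, a DISTINCT over (orders.id, orders.total) would pass the check, while a DISTINCT over (orders.total) alone would not. A real implementation would also have to worry about NULLs, outer joins and equality operator semantics, none of which this toy handles.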
--
Best Wishes,
Ashutosh Bapat
