Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From	Andy Fan
Subject	Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date	December 5, 2020, 15:10:28
Msg-id	CAKU4AWpYsa-L2--qOMcJFHxzw7T5px5ZmrVaTE3MO3mW4J6uEw@mail.gmail.com
In response to	Re: [PATCH] Keeps tracking the uniqueness with UniqueKey (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: [PATCH] Keeps tracking the uniqueness with UniqueKey Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
List	pgsql-hackers

Thank you Heikki for your attention.

On Mon, Nov 30, 2020 at 11:20 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> On 30/11/2020 16:30, Jesper Pedersen wrote:
>> On 11/30/20 5:04 AM, Heikki Linnakangas wrote:
>>> On 26/11/2020 16:58, Andy Fan wrote:
>>>> This patch has stopped moving for a while, any suggestion about
>>>> how to move on is appreciated.
>>>
>>> The question on whether UniqueKey.exprs should be a list of
>>> EquivalenceClasses or PathKeys is unresolved. I don't have an opinion
>>> on that, but I'd suggest that you pick one or the other and just go
>>> with it. If it turns out to be a bad choice, then we'll change it.
>>
>> In this case I think it is a matter of deciding if we are going to use
>> EquivalenceClasses or Exprs before going further; there has been work
>> ongoing in this area for a while, so having a clear direction from a
>> committer would be greatly appreciated.
>
> Plain Exprs are not good enough, because you need to know which operator
> the expression is unique on. Usually, it's the default = operator in the
> default btree opclass for the datatype, but it could be something else, too.

Actually, I don't understand this; could you explain more? As far as I
know, when we run "SELECT DISTINCT a FROM t", we never care about which
operator is used to determine uniqueness.
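To make the quoted point about operators concrete, here is a minimal sketch in Python (not PostgreSQL code). It only illustrates that "unique" is always relative to some equality operator; the helper names `is_unique`, `exact_eq`, and `case_insensitive_eq` are made up for this example and stand in for the equality operators of two different operator classes.

```python
from itertools import combinations
from typing import Callable, Sequence


def is_unique(values: Sequence[str], eq: Callable[[str, str], bool]) -> bool:
    """Return True if no two values compare equal under the given operator."""
    return not any(eq(a, b) for a, b in combinations(values, 2))


def exact_eq(a: str, b: str) -> bool:
    # Stand-in for a type's default "=" operator (assumption for illustration).
    return a == b


def case_insensitive_eq(a: str, b: str) -> bool:
    # Stand-in for the "=" of some other, non-default operator class.
    return a.casefold() == b.casefold()


names = ["abc", "ABC", "abd"]

# The same column is unique under one operator and not under the other.
unique_exact = is_unique(names, exact_eq)
unique_ci = is_unique(names, case_insensitive_eq)
print(unique_exact, unique_ci)
```

Under this reading, a uniqueness proof obtained with one operator would not automatically carry over to a DISTINCT that compares with another, which may be why a bare expression is not enough information.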

> There's some precedent for PathKeys, as we generate PathKeys to
> represent the DISTINCT column in PlannerInfo->distinct_pathkeys. On the
> other hand, I've always found it confusing that we use PathKeys to
> represent DISTINCT and GROUP BY, which are not actually sort orderings.

OK, I have the same confusion now:)

> Perhaps it would make sense to store EquivalenceClass+opfamily in
> UniqueKey, and also replace distinct_pathkeys and group_pathkeys with
> UniqueKeys.

I can understand why we need EquivalenceClass for UniqueKey, but I don't
understand why we need opfamily here.
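For readers following along, the "EquivalenceClass+opfamily" shape from the quoted suggestion could be pictured roughly as below. This is a toy Python model under stated assumptions, not PostgreSQL's actual structs or any version of the patch: the classes `EquivalenceClass` and `UniqueKey`, the function `proves_distinct`, and the opfamily strings are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class EquivalenceClass:
    # Expressions known to be equal to each other, e.g. after "t1.a = t2.b".
    members: FrozenSet[str]


@dataclass(frozen=True)
class UniqueKey:
    # The relation is unique on this combination of equivalence classes ...
    eclasses: Tuple[EquivalenceClass, ...]
    # ... when compared with the equality operator of this operator family.
    opfamily: str


def proves_distinct(key: UniqueKey, exprs: Sequence[str], opfamily: str) -> bool:
    """True if the key shows that rows are already distinct on exprs."""
    if key.opfamily != opfamily:
        # Uniqueness under one notion of equality says nothing about another.
        return False
    # Every class in the key must be covered by some requested expression;
    # extra requested expressions cannot break uniqueness.
    return all(any(e in ec.members for e in exprs) for ec in key.eclasses)


ec = EquivalenceClass(frozenset({"t1.a", "t2.b"}))
key = UniqueKey((ec,), "opfamily_x")  # placeholder opfamily name

same_family = proves_distinct(key, ["t2.b", "t2.c"], "opfamily_x")
other_family = proves_distinct(key, ["t2.b", "t2.c"], "opfamily_y")
not_covered = proves_distinct(key, ["t2.c"], "opfamily_x")
print(same_family, other_family, not_covered)
```

In this model the equivalence class lets uniqueness on `t1.a` answer a question asked about `t2.b`, while the opfamily field records which equality the uniqueness was established with.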

For anyone who is interested in this patchset, here is my current plan:

1) I will try EquivalenceClass rather than Expr in UniqueKey, and add
opfamily if needed.

2) I will start a new thread to continue this topic. The current thread is
so long that it may scare off people who might otherwise be interested.

3) I will drop patches 5 & 6 for now. One reason is that I am not happy
with the current implementation; the other is that I want to make the
patchset smaller so that it is easier to review. I am not giving up on
them for good: after the main part of this patchset is committed, I will
continue with them in a new thread.

Thanks everyone for your input.

Best Regards

Andy Fan
