Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From Andy Fan
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAKU4AWpYsa-L2--qOMcJFHxzw7T5px5ZmrVaTE3MO3mW4J6uEw@mail.gmail.com
In response to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Heikki Linnakangas <hlinnaka@iki.fi>)
Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Thank you Heikki for your attention. 

On Mon, Nov 30, 2020 at 11:20 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 30/11/2020 16:30, Jesper Pedersen wrote:
>> On 11/30/20 5:04 AM, Heikki Linnakangas wrote:
>>> On 26/11/2020 16:58, Andy Fan wrote:
>>>> This patch has stopped moving for a while,  any suggestion about
>>>> how to move on is appreciated.
>>>
>>> The question on whether UniqueKey.exprs should be a list of
>>> EquivalenceClasses or PathKeys is unresolved. I don't have an opinion
>>> on that, but I'd suggest that you pick one or the other and just go
>>> with it. If it turns out to be a bad choice, then we'll change it.
>>
>> In this case I think it is matter of deciding if we are going to use
>> EquivalenceClasses or Exprs before going further; there has been work
>> ongoing in this area for a while, so having a clear direction from a
>> committer would be greatly appreciated.
>
> Plain Exprs are not good enough, because you need to know which operator
> the expression is unique on. Usually, it's the default = operator in the
> default btree opclass for the datatype, but it could be something else, too.

I don't quite understand this; could you explain more?  As far as I know,
when we run "SELECT DISTINCT a FROM t", we never care about which operator
is used to decide uniqueness.
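
To make the question concrete, here is one possible reading of the operator
point, sketched in Python rather than PostgreSQL code. Everything in it is an
assumption made for illustration: is_unique is a hypothetical helper, and the
case-insensitive comparison merely stands in for "some equality other than the
default one". It only shows that the same rows can be unique under one notion
of equality and not under another.

```python
from typing import Callable, Hashable, Iterable


def is_unique(values: Iterable[str], key: Callable[[str], Hashable]) -> bool:
    """Return True if no two values are equal under the equality induced by `key`."""
    seen = set()
    for value in values:
        k = key(value)
        if k in seen:
            return False
        seen.add(k)
    return True


rows = ["abc", "ABC", "abd"]

# Equality 1: exact comparison (stands in for a default "=" operator).
exact_unique = is_unique(rows, key=lambda s: s)

# Equality 2: hypothetical case-insensitive comparison (a different operator).
ci_unique = is_unique(rows, key=str.lower)

print(exact_unique, ci_unique)
```

If that reading is right, "unique" would only be meaningful together with the
operator it was established for.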

  
> There's some precedence for PathKeys, as we generate PathKeys to
> represent the DISTINCT column in PlannerInfo->distinct_pathkeys. On the
> other hand, I've always found it confusing that we use PathKeys to
> represent DISTINCT and GROUP BY, which are not actually sort orderings.

OK, I have the same confusion now :)

> Perhaps it would make sense to store EquivalenceClass+opfamily in
> UniqueKey, and also replace distinct_pathkeys and group_pathkeys with
> UniqueKeys.
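
One way to picture this suggestion is the rough Python model below. It is not
the patch's data structure: the class names only mirror the terms used above,
and distinct_is_redundant, the column names and the opfamily strings are all
hypothetical. The sketch assumes a DISTINCT can be skipped when every item of
the unique key is matched by some DISTINCT expression under the same operator
family.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class EquivalenceClass:
    # Expressions known to be equal to each other, e.g. after "t1.a = t2.a".
    members: FrozenSet[str]


@dataclass(frozen=True)
class UniqueKeyItem:
    eclass: EquivalenceClass
    opfamily: str  # illustrative name of the family that defines "equal"


@dataclass(frozen=True)
class UniqueKey:
    items: Tuple[UniqueKeyItem, ...]


def distinct_is_redundant(key: UniqueKey, distinct_exprs: Sequence[str],
                          opfamily: str) -> bool:
    """True if each key item is matched by a DISTINCT expr under `opfamily`."""
    return all(
        item.opfamily == opfamily
        and any(expr in item.eclass.members for expr in distinct_exprs)
        for item in key.items
    )


# Hypothetical example: the relation is unique on the class {t1.a, t2.a}.
key = UniqueKey((
    UniqueKeyItem(EquivalenceClass(frozenset({"t1.a", "t2.a"})), "integer_ops"),
))

same_family = distinct_is_redundant(key, ["t2.a", "t1.b"], "integer_ops")
other_family = distinct_is_redundant(key, ["t2.a", "t1.b"], "some_other_ops")
missing_column = distinct_is_redundant(key, ["t1.b"], "integer_ops")
```

In this model the EquivalenceClass lets t2.a match a key that was established
on t1.a, and the opfamily field is the only thing that tells the two
"integer_ops" and "some_other_ops" cases apart.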


I can understand why we need EquivalenceClass for UniqueKey, but I can't
understand why we need opfamily here. 


For anyone who is interested in this patchset, here is my current plan:

1).  I will try EquivalenceClass rather than Expr in UniqueKey, and add
opfamily if needed.
2).  I will start a new thread to continue this topic. The current thread is
too long, which may scare off people who might be interested in it.
3).  I will drop patches 5 & 6 for now. One reason is that I am not happy with
the current implementation; the other is that I want to make the patchset
smaller, so that it is easier to review. I am not giving up on them for good:
after the main part of this patchset is committed, I will continue with them
in a new thread.
 
Thanks everyone for your input. 

--
Best Regards
Andy Fan
