Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From Andy Fan
Subject Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date
Msg-id CAKU4AWoJXCyh3LOShOb5bRUM2bguGDT=HQ6Wadnd2B0L-ohtsQ@mail.gmail.com
In reply to Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers


On Fri, Jun 5, 2020 at 10:57 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 5 Jun 2020 at 14:36, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> On Mon, May 25, 2020 at 2:34 AM David Rowley <dgrowleyml@gmail.com> wrote:
>>>
>>> On Sun, 24 May 2020 at 04:14, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>>> >
>>> > > On Fri, May 22, 2020 at 08:40:17AM +1200, David Rowley wrote:
>>> > > I imagine we'll set some required UniqueKeys during
>>> > > standard_qp_callback()
>>> >
>>> > In standard_qp_callback, because pathkeys are computed at this point I
>>> > guess?
>>>
>>> Yes. In particular, we set the pathkeys for DISTINCT clauses there.
>>>
>>
>> Actually I have some issues to understand from here,  then try to read index
>> skip scan patch to fully understand what is the requirement, but that doesn't
>> get it so far[1].  So what  is the "UniqueKeys" in "UniqueKeys during
>> standard_qp_callback()" and what is the "pathkeys" in "pathkeys are computed
>> at this point” means?  I tried to think it as root->distinct_pathkeys,  however I
>> didn't fully understand where root->distinct_pathkeys is used for as well.
>
> In standard_qp_callback(), what we'll do with uniquekeys is pretty
> much what we already do with pathkeys there. Basically pathkeys are
> set there to have the planner attempt to produce a plan that satisfies
> those pathkeys.  Notice at the end of standard_qp_callback() we set
> the pathkeys according to the first upper planner operation that'll
> need to make use of those pathkeys.  e.g, If there's a GROUP BY and a
> DISTINCT in the query, then use the pathkeys for GROUP BY, since that
> must occur before DISTINCT.
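
To make the precedence described above concrete, here is a minimal sketch in C. It is a toy model and not the PostgreSQL source: ToyPlannerInfo, QueryPathkeysChoice and choose_query_pathkeys() are hypothetical names, and each pathkey list is reduced to its length. The GROUP BY-before-DISTINCT ordering comes from the explanation above; the handling of window functions and the DISTINCT-versus-ORDER BY comparison follows my recollection of standard_qp_callback() and should be treated as an assumption.

```c
/* Toy stand-in for the pathkey lists kept in PlannerInfo (hypothetical).
 * Each field holds only a list length; 0 means the list is empty. */
typedef struct ToyPlannerInfo
{
    int group_pathkeys;
    int window_pathkeys;
    int distinct_pathkeys;
    int sort_pathkeys;
} ToyPlannerInfo;

typedef enum QueryPathkeysChoice
{
    USE_NONE,
    USE_GROUP_PATHKEYS,
    USE_WINDOW_PATHKEYS,
    USE_DISTINCT_PATHKEYS,
    USE_SORT_PATHKEYS
} QueryPathkeysChoice;

/* Pick the ordering the scan/join planner should aim for: the one needed
 * by the first upper-planner step that can make use of sorted input. */
static QueryPathkeysChoice
choose_query_pathkeys(const ToyPlannerInfo *root)
{
    if (root->group_pathkeys > 0)
        return USE_GROUP_PATHKEYS;      /* grouping runs before DISTINCT */
    if (root->window_pathkeys > 0)
        return USE_WINDOW_PATHKEYS;     /* assumption: windowing comes next */
    if (root->distinct_pathkeys > root->sort_pathkeys)
        return USE_DISTINCT_PATHKEYS;   /* assumption: prefer the longer list */
    if (root->sort_pathkeys > 0)
        return USE_SORT_PATHKEYS;
    return USE_NONE;
}
```

For a query with both GROUP BY and DISTINCT, group_pathkeys is non-empty, so the sketch returns USE_GROUP_PATHKEYS regardless of what the other lists contain.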

Thanks for your explanation.  I think I understand now, based on your comments.
Take root->group_pathkeys for example: similar information is also available in
root->parse->groupClauses, but we use root->group_pathkeys with the
pathkeys_count_contained_in function in many places, mainly because the
contents of the two sometimes differ, as in the case handled by
pathkey_is_redundant.
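
To illustrate why the two can differ, here is a small sketch in C of the kind of pruning I mean. It is a simplified model, not the real pathkey_is_redundant: ToyKey, toy_key_is_redundant() and toy_build_pathkeys() are hypothetical names, and I am assuming that a key is dropped when its equivalence class is known to be constant (e.g. WHERE a = 1) or when the same class already appears earlier in the list (e.g. WHERE a = b).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified sort key: it only records which equivalence
 * class the expression belongs to and whether that class is constant. */
typedef struct ToyKey
{
    int  eclass_id;
    bool eclass_is_const;
} ToyKey;

/* A key adds nothing to an ordering if its class is constant, or if an
 * earlier key in the list already belongs to the same class. */
static bool
toy_key_is_redundant(ToyKey candidate, const ToyKey *kept, size_t nkept)
{
    if (candidate.eclass_is_const)
        return true;
    for (size_t i = 0; i < nkept; i++)
    {
        if (kept[i].eclass_id == candidate.eclass_id)
            return true;
    }
    return false;
}

/* Turn a clause list (one key per GROUP BY item) into a pruned key list.
 * 'out' must have room for nclauses entries; returns the number kept. */
static size_t
toy_build_pathkeys(const ToyKey *clauses, size_t nclauses, ToyKey *out)
{
    size_t nkept = 0;

    for (size_t i = 0; i < nclauses; i++)
    {
        if (!toy_key_is_redundant(clauses[i], out, nkept))
            out[nkept++] = clauses[i];
    }
    return nkept;
}
```

With GROUP BY a, b and WHERE a = 1, the clause list has two entries but the pruned list keeps only the key for b, so code that looks at the clause list and code that looks at the pathkeys would see different things.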

> Likely uniquekeys will want to follow the
> same rules there for the operations that can make use of paths with
> uniquekeys, which in this case, I believe, will be the same as the
> example I just mentioned for pathkeys, except we'll only be able to
> support GROUP BY without any aggregate functions.


All the places where I want to use UniqueKey so far (DISTINCT, GROUP BY and
others) have an input_relation (RelOptInfo), and the UniqueKey information can
be obtained from there.  At the same time, all the pathkeys in PlannerInfo are
used by the upper planner, but UniqueKey may sometimes be used in the current
planner as well, as in reduce_semianti_joins/remove_useless_join, so I am not
sure whether we must maintain uniquekeys in PlannerInfo.

--
Best Regards
Andy Fan
