Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Дата
Msg-id CAKU4AWo06r1H1tpj_W4RCUFOPufgZtcqoLeFCdt74LWS1WyGyQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Ответы Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (David Rowley <dgrowleyml@gmail.com>)
Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
Список pgsql-hackers


On Wed, May 13, 2020 at 8:04 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Fri, May 8, 2020 at 7:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:

>> +            else if (inner_is_onerow)
>> +            {
>> +                /* Since rows in innerrel can't be duplicated AND if
>> innerrel is onerow,
>> +                 * the join result will be onerow also as well. Note:
>> onerow implies
>> +                 * multi_nullvals = false.
>>
>> I don't understand this comment. Why is there only one row in the other
>> relation which can join to this row?
>
>
> I guess you may miss the onerow special case if I understand correctly.
> inner_is_onerow means something like "SELECT xx FROM t1 where uk = 1".
> innerrel can't be duplicated means:  t1.y = t2.pk;  so the finally result is onerow
> as well.  One of the overall query is  SELECT .. FROM t1, t2 where t2.y = t2.pk;
>
>
> I explained more about onerow in the v7 README.unqiuekey document, just copy
> it here.
For some reason this mail remained in my drafts without being sent.
Sending it now. Sorry.

My impression about the one row stuff, is that there is too much
special casing around it. We should somehow structure the UniqueKey
data so that one row unique keys come naturally rather than special
cased. E.g every column in such a case is unique in the result so
create as many UniqueKeys are the number of columns

This is the beginning state of the UniqueKey,  later David suggested
this as an optimization[1], I buy-in the idea and later I found it mean
more than the original one [2], so I think onerow is needed actually.  
 
or create one
unique key with no column as you have done but handle it more
gracefully rather than spreading it all over the place.

I think this is what I do now, but it is possible that I spread it more than
necessary, if so, please let me know.  I maintained the README.uniquekey
carefully since v7 and improved a lot in v8, it may be a good place to check
it. 

 
Also, the amount of code that these patches changes seems to be much
larger than the feature's worth arguably. But it indicates that we are
modifying/adding more code than necessary. Some of that code can be
merged into existing code which does similar things as I have pointed
out in my previous comment.


I have reused the code select_mergejoin_clause rather than maintaining my
own copies in v8. Thanks a lot about that suggestion.  This happened mainly
because I didn't read enough code.  I will go through more to see if I have similar
issues. 
 
Thanks for working on the expanded scope of the initial feature you
proposed. But it makes the feature more useful, I think.

That's mainly because your suggestions are always insightful which makes me
willing to continue to work on it, so thank you all!

===
In fact, I was hesitated that how to reply an email when I send an new version
of the patch.  One idea is I should explain clear what is the difference between Vn
and Vn-1.  The other one is not many people read the Vn-1, so I would like to keep
the email simplified and keep the README clearly to save the reviewer's time. 
Actually there are more changes in v8 than I stated above. for example:  for the
UniqueKey on baserelation,  we will reduce the expr from the UniqueKey if the 
expr is a const.   Unique on (A, B).   query is SELECT b FROM t WHERE a = 1;   
in v7, the UniqueKey is (a, b).  In v8, the UniqueKey is (b) only.  But since most
people still not start to read it, so I add such  information to README rather than 
echo the same in email thread.  I will try more to understand how to communicate more
smooth.  But any suggestion on this part is appreciated. 

Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Следующее
От: Andy Fan
Дата:
Сообщение: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey