Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From: David Rowley
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date:
Msg-id: CAApHDvrksJ1K45A6nhE_8h-g=8nFK0PU84Q52KK3Cmm9aezWkw@mail.gmail.com
In reply to: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey  (Andy Fan <zhihui.fan1213@gmail.com>)
List: pgsql-hackers
On Thu, 14 May 2020 at 03:48, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> On Wed, May 13, 2020 at 8:04 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>> My impression about the one row stuff, is that there is too much
>> special casing around it. We should somehow structure the UniqueKey
>> data so that one row unique keys come naturally rather than special
>> cased. E.g. every column in such a case is unique in the result, so
>> create as many UniqueKeys as the number of columns
>
>
> This was the initial state of the UniqueKey. Later David suggested
> this as an optimization [1]. I bought into the idea, and later I found
> it means more than the original one [2], so I think onerow is actually needed.

Having the "onerow" flag was not how I intended it to work.

Here's an example of how I thought it should work:

Assume t1 has UniqueKeys on {a}

SELECT DISTINCT a,b FROM t1;

Here the DISTINCT can be a no-op due to "a" being unique within t1. Or
more basically, {a} is a subset of {a,b}.

The code which does this is relation_has_uniquekeys_for(), which
contains the code:

+ if (list_is_subset(ukey->exprs, exprs))
+     return true;

In this case, ukey->exprs is {a} and exprs is {a,b}. So, if the
UniqueKey's exprs are a subset of the DISTINCT exprs (as they are here),
then relation_has_uniquekeys_for() returns true. Basically, it asks
list_is_subset({a}, {a,b}), and the answer is "Yes".
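
To make the subset test concrete, here is a minimal standalone sketch.
It is not the patch's code: names_are_subset() is a hypothetical
stand-in for list_is_subset(), and plain column-name strings stand in
for the expression nodes the planner would really compare.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's list_is_subset(): returns true
 * when every name in sub[0..nsub) also appears in super[0..nsuper).
 * Column names are an assumption made for illustration only.
 */
bool
names_are_subset(const char *const *sub, size_t nsub,
                 const char *const *super, size_t nsuper)
{
    for (size_t i = 0; i < nsub; i++)
    {
        bool        found = false;

        for (size_t j = 0; j < nsuper; j++)
        {
            if (strcmp(sub[i], super[j]) == 0)
            {
                found = true;
                break;
            }
        }

        /* One member of "sub" is missing from "super": not a subset. */
        if (!found)
            return false;
    }

    /* Every member of "sub" was found in "super". */
    return true;
}
```

With sub = {"a"} and super = {"a", "b"} this returns true, which is the
DISTINCT a,b case above. Swapping the arguments returns false, since
"b" is not covered by a key on {a}.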

For the onerow stuff, if we can prove the relation returns only a
single row, e.g. an aggregate without a GROUP BY, or there are
EquivalenceClasses with ec_has_const == true for each key of a unique
index, then why can't we just set the UniqueKeys to {}?  That would
mean the code to determine if we can avoid performing an explicit
DISTINCT operation would be called with list_is_subset({}, {a,b}),
which is also true. In fact, the empty set is a subset of any set. Why
is there a need to special case that fact?
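
As a rough sketch of why no special case should be needed, assume each
unique key is modelled as a bitmask of column numbers. ColSet, COL(),
colset_is_subset() and has_uniquekey_for() are all hypothetical names
for this illustration and are not the patch's data structures. An
empty key is then just 0, and the ordinary subset test accepts it for
any set of target expressions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: a unique key is a bitmask of column numbers.
 * The empty key (0) stands for "relation returns at most one row".
 */
typedef uint32_t ColSet;

#define COL(n) ((ColSet) 1u << (n))

/* True when every column in "sub" is also in "super". */
bool
colset_is_subset(ColSet sub, ColSet super)
{
    return (sub & ~super) == 0;
}

/*
 * Simplified analogue of relation_has_uniquekeys_for(): true when any
 * known unique key is a subset of the target expressions.
 */
bool
has_uniquekey_for(const ColSet *ukeys, size_t nkeys, ColSet exprs)
{
    for (size_t i = 0; i < nkeys; i++)
    {
        if (colset_is_subset(ukeys[i], exprs))
            return true;
    }
    return false;
}
```

Here a relation whose only key is the empty set passes the check for
{a,b}, and for any other target set, through the same code path as a
relation with a key on {a}. A relation with no keys at all still
fails, so "no UniqueKeys" and "an empty UniqueKey" remain distinct.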

In light of those thoughts, can you explain why you think we need to
keep the onerow flag?

David