Re: Tracking notnull attributes inside Var

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Tracking notnull attributes inside Var
Дата
Msg-id CAKU4AWr=0rm=GoALn3dSwYT7zKFE0VHSt-sAnL1ECCPBV16MQQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Tracking notnull attributes inside Var  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Список pgsql-hackers
Hi Ashutosh:

   Nice to see you again!

On Tue, May 17, 2022 at 8:50 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:

>
> The var in RelOptInfo->reltarget should have nullable = 0 but the var in
> RelOptInfo->baserestrictinfo should have nullable = 1;  The beauty of this
> are: a). It can distinguish the two situations perfectly b). Whenever we want
> to know the nullable attribute of a Var for an expression, it is super easy to
> know. In summary, we need to maintain the nullable attribute at 2 different
> places. one is the before the filters are executed(baserestrictinfo, joininfo,
> ec_list at least).  one is after the filters are executed (RelOptInfo.reltarget
> only?)

Thanks for identifying this. What you have written makes sense and it
might open a few optimization opportunities. But let me put down some
other thoughts here. You might want to take those into consideration
when designing your solution.

Thanks. 
 

Do we want to just track nullable and non-nullable. May be we want
expand this class to nullable (var may be null), non-nullable (Var is
definitely non-NULL), null (Var will be always NULL).


Currently it doesn't support "Var will be always NULL" .  Do you have any
use cases for this? and I can't think of too many cases where we can get
such information except something like "SELECT a FROM t WHERE a 
IS NULL". 

But the other way to look at this is along the lines of equivalence
classes. Equivalence classes record the expressions which are equal in
the final result of the query. The equivalence class members are not
equal at all the stages of query execution.  But because they are
equal in the final result, we can impose that restriction on the lower
levels as well. Can we think of nullable in that fashion? If a Var is
non-nullable in the final result, we can impose that restriction on
the intermediate stages since rows with NULL values for that Var will
be filtered out somewhere. Similarly we could argue for null Var. But
knowledge that a Var is nullable in the final result does not impose a
NULL, non-NULL restriction on the intermediate stages. If we follow
this thought process, we don't need to differentiate Var at different
stages in query.

I agree this is an option.  If so we need to track it under the PlannerInfo 
struct but it would not be as fine-grained as my previous. Without
intermediate information,  We can't know if a UnqiueKey contains multiple
NULLs, this would not be an issue for the "MARK Distinct as no-op" case,
but I'm not sure it is OK for other UniqueKey user cases.  So my current idea
is I still prefer to maintain the intermediate information, unless we are sure it 
costs too much or it is too complex to implement which I don't think so for now
at least.  So if you have time to look at the attached patch, that would be super
great as well.

--
Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andy Fan
Дата:
Сообщение: Re: Tracking notnull attributes inside Var
Следующее
От: Tom Lane
Дата:
Сообщение: Re: 15beta1 test failure on mips in isolation/expected/stats