Thread: review: Non-recursive processing of AND/OR lists

review: Non-recursive processing of AND/OR lists

From
Pavel Stehule
Date:
Hello

related to

https://commitfest.postgresql.org/action/patch_view?id=1130
http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com


* motivation: remove recursive processing of AND/OR lists (hangs with
10062 or more subexpressions)

* the patch is short, clean, and respects PostgreSQL source code requirements
* the patch applied cleanly without warnings
* all regression tests passed
* I successfully evaluated an expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

replace

a = (A_Expr*) list_nth(pending, 0);

with

a = (A_Expr*) linitial(pending);

the comment is not accurate

it should read -- "If the right branch is also the SAME condition, append it to the"

+            /*
+             * If the right branch is also an AND condition, append it to the
+             * pending list, to be processed later. This allows us to walk even
+             * bushy trees, not just left-deep trees.
+             */
+            if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
+            {
+                pending = lappend(pending, a->rexpr);
+            }
+            else
+            {
+                expr = transformExprRecurse(pstate, a->rexpr);
+                expr = coerce_to_boolean(pstate, expr, root_kind == AEXPR_AND ? "AND" : "OR");
+                exprs = lcons(expr, exprs);
+            }

I don't see any other issues, so after fixing comments this patch is
ready for commit

Regards

Pavel Stehule
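
The idea under review, replacing recursion with an explicit list of pending same-kind nodes, can be illustrated outside the parser. The following is a minimal standalone sketch, not the patch itself: `Node`, `Kind`, `flatten_bool` and `MAX_PENDING` are hypothetical names, and a fixed-size array stands in for the patch's `pending` List.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for parser nodes; not PostgreSQL's structs. */
typedef enum { KIND_LEAF, KIND_AND, KIND_OR } Kind;

typedef struct Node {
    Kind kind;
    int id;                 /* leaf identifier; unused for AND/OR nodes */
    struct Node *lexpr;
    struct Node *rexpr;
} Node;

#define MAX_PENDING 64      /* fixed capacity keeps the sketch allocation-free */

/*
 * Collect the operands of a tree of same-kind boolean nodes into out[],
 * in left-to-right order, without recursion.  Nodes of a different kind
 * are treated as plain operands.  Returns the operand count, or -1 if
 * out[] or the pending array would overflow.
 */
int flatten_bool(const Node *root, const Node **out, int cap)
{
    const Node *pending[MAX_PENDING];
    int npending = 0;
    int n = 0;
    Kind root_kind = root->kind;

    pending[npending++] = root;
    while (npending > 0) {
        const Node *cur = pending[--npending];

        if (cur->kind != KIND_LEAF && cur->kind == root_kind) {
            if (npending + 2 > MAX_PENDING)
                return -1;
            pending[npending++] = cur->lexpr;
            pending[npending++] = cur->rexpr;   /* popped first */
        } else {
            if (n == cap)
                return -1;
            out[n++] = cur;                     /* arrives right-to-left */
        }
    }

    /* Operands were collected right-to-left; restore source order. */
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        const Node *tmp = out[i];
        out[i] = out[j];
        out[j] = tmp;
    }
    return n;
}
```

Visiting the right branch first keeps the pending array at a constant size for left-deep trees; it grows only for right-deep or bushy input.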



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
Thanks for the review Pavel.

On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hello
>
> related to
>
> https://commitfest.postgresql.org/action/patch_view?id=1130
> http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com
>
>
> * motivation: remove recursive procession of AND/OR list (hangs with
> 10062 and more subexpressions)
>
> * patch is short, clean and respect postgresql source code requirements
> * patch was applied cleanly without warnings
> * all regression tests was passed
> * I successfully evaluated expression with 100000 subexpressions
> * there is no significant slowdown
>
> possible improvements
>
> a = (A_Expr*) list_nth(pending, 0);
>
> a = (A_Expr*) linitial(pending);
>
> not well comment
>
> should be -- "If the right branch is also an SAME condition, append it to the"
>
> +            /*
> +             * If the right branch is also an AND condition, append it to the
> +             * pending list, to be processed later. This allows us to walk even
> +             * bushy trees, not just left-deep trees.
> +             */
> +            if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind)
> +            {
> +                pending = lappend(pending, a->rexpr);
> +            }
> +            else
> +            {
> +                expr = transformExprRecurse(pstate, a->rexpr);
> +                expr = coerce_to_boolean(pstate, expr, root_kind == AEXPR_AND ? "AND" : "OR");
> +                exprs = lcons(expr, exprs);
> +            }
>
> I don't see any other issues, so after fixing comments this patch is
> ready for commit
>
> Regards
>
> Pavel Stehule

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

related to

https://commitfest.postgresql.org/action/patch_view?id=1130
http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com


* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)

* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown

possible improvements

a = (A_Expr*) list_nth(pending, 0);

a = (A_Expr*) linitial(pending);

I made that change, hesitantly. The comments above the definition of the linitial() macro describe the confusion that API causes. I wanted to avoid that confusion in new code, so I used the newer API, which makes the intention quite clear. But looking at that code closely, list_nth() causes at least 2 function calls, and that's pretty heavy compared to the linitial() macro.
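
The cost difference between a macro that reads the head cell and a function that walks to the n-th cell can be sketched with a toy list. This is only an illustration under assumed names (`MiniList`, `MiniCell`, `mini_initial`, `mini_nth_cell`, `mini_nth`); it is not the code in pg_list.h, and an optimizing compiler may inline the calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a linked list; not PostgreSQL's pg_list.h. */
typedef struct MiniCell {
    void *value;
    struct MiniCell *next;
} MiniCell;

typedef struct MiniList {
    int length;
    MiniCell *head;
} MiniList;

/* Macro form: expands inline to two pointer dereferences, no call. */
#define mini_initial(l) ((l)->head->value)

/* Function form, step 1: bounds check, then walk n links. */
MiniCell *mini_nth_cell(const MiniList *l, int n)
{
    MiniCell *cell = l->head;

    assert(n >= 0 && n < l->length);
    while (n-- > 0)
        cell = cell->next;
    return cell;
}

/* Function form, step 2: a second call layered on top of the first. */
void *mini_nth(const MiniList *l, int n)
{
    return mini_nth_cell(l, n)->value;
}
```

Both forms return the same element for index 0; the macro simply skips the two calls and the bounds check.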
 

not well comment

should be -- "If the right branch is also an SAME condition, append it to the"

I moved that comment above the outer block, so that the intention of the whole do-while code block is described in one place.

I don't see any other issues, so after fixing comments this patch is
ready for commit

Thanks for the review Pavel.

Attached is the updated patch, v4. It has the above edits, and a few code improvements, like not repeating the (root_kind == AEXPR_AND ? .. : ..) ternary expression.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
Attachments

Re: review: Non-recursive processing of AND/OR lists

From
Pavel Stehule
Date:
Hello

just one small note

I dislike the name "root_bool_expr", because it is not an expression
but an expression type. Can you use "root_bool_expr_type" instead? It is
a little bit longer, but more correct. Similarly, "root_char" is not the
best name; maybe "root_bool_op_name"

or root_expr_type and root_op_name ???

I have no other comments

Regards

Pavel

2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
> On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>>
>>
>> related to
>>
>> https://commitfest.postgresql.org/action/patch_view?id=1130
>>
>> http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com
>>
>>
>> * motivation: remove recursive procession of AND/OR list (hangs with
>> 10062 and more subexpressions)
>>
>> * patch is short, clean and respect postgresql source code requirements
>> * patch was applied cleanly without warnings
>> * all regression tests was passed
>> * I successfully evaluated expression with 100000 subexpressions
>> * there is no significant slowdown
>>
>> possible improvements
>>
>> a = (A_Expr*) list_nth(pending, 0);
>>
>> a = (A_Expr*) linitial(pending);
>
>
> I made that change, hesitantly. The comments above definition of linitial()
> macro describe the confusion that API causes. I wanted to avoid that
> confusion for new code, so I used the newer API which makes the intention
> quite clear. But looking at that code closely, list_nth() causes at least 2
> function calls, and that's pretty heavy compared to the linitiali() macro.
>
>>
>>
>> not well comment
>>
>> should be -- "If the right branch is also an SAME condition, append it to
>> the"
>
>
> I moved that comment above the outer bock, so that the intention of the
> whole do-while code block is described in one place.
>
>> I don't see any other issues, so after fixing comments this patch is
>> ready for commit
>
>
> Thanks for the review Pavel.
>
> Attached is the updated patch, v4. It has the above edits, and a few code
> improvements, like not repeating the (root_kind == AEPR_AND ? .. :  ..)
> ternary expression.
>
> Best regards,
> --
> Gurjeet Singh
>
> http://gurjeet.singh.im/
>
> EnterpriseDB Inc.



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello

just one small notices

I dislike a name "root_bool_expr", because, there is not a expression,
but expression type. Can you use "root_bool_expr_type" instead? It is
little bit longer, but more correct. Same not best name is
"root_char", maybe "root_bool_op_name"

or root_expr_type and root_op_name ???

How about naming those 3 variables as follows:

root_expr_kind
root_expr_name
root_bool_expr_type


--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Pavel Stehule
Date:
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
> On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>>
>> Hello
>>
>> just one small notices
>>
>> I dislike a name "root_bool_expr", because, there is not a expression,
>> but expression type. Can you use "root_bool_expr_type" instead? It is
>> little bit longer, but more correct. Same not best name is
>> "root_char", maybe "root_bool_op_name"
>>
>> or root_expr_type and root_op_name ???
>
>
> How about naming those 3 variables as follows:
>
> root_expr_kind
> root_expr_name
> root_bool_expr_type

+1

Pavel

>
>
> --
> Gurjeet Singh
>
> http://gurjeet.singh.im/
>
> EnterpriseDB Inc.



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
> On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>
> How about naming those 3 variables as follows:
>
> root_expr_kind
> root_expr_name
> root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest entry with a link to this email.

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Fwd: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
> On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>
> How about naming those 3 variables as follows:
>
> root_expr_kind
> root_expr_name
> root_bool_expr_type

+1

Thanks. Attached is the patch with that change. I'll update the commitfest entry with a link to this email.

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.



Attachments

Re: review: Non-recursive processing of AND/OR lists

From
Pavel Stehule
Date:
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
> On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>>
>> 2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
>> > On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule
>> > <pavel.stehule@gmail.com>
>> > wrote:
>> >
>> > How about naming those 3 variables as follows:
>> >
>> > root_expr_kind
>> > root_expr_name
>> > root_bool_expr_type
>>
>> +1
>
>
> Thanks. Attached is the patch with that change. I'll update the commitfest
> entry with a link to this email.

ok

I checked it - patched without warnings, all tests passed

It is ready for commit

Regards

Pavel

>
> --
> Gurjeet Singh
>
> http://gurjeet.singh.im/
>
> EnterpriseDB Inc.



Re: review: Non-recursive processing of AND/OR lists

From
Robert Haas
Date:
On Sun, Jun 30, 2013 at 1:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
>> On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>>>
>>> 2013/6/30 Gurjeet Singh <gurjeet@singh.im>:
>>> > On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule
>>> > <pavel.stehule@gmail.com>
>>> > wrote:
>>> >
>>> > How about naming those 3 variables as follows:
>>> >
>>> > root_expr_kind
>>> > root_expr_name
>>> > root_bool_expr_type
>>>
>>> +1
>>
>>
>> Thanks. Attached is the patch with that change. I'll update the commitfest
>> entry with a link to this email.
>
> ok
>
> I chechecked it - patched without warnings, all tests passed
>
> It is ready for commit

I think it's a waste of code to try to handle bushy trees.  A list is
not a particularly efficient representation of the pending list; this
will probably be slower than recursing in the common case.  I'd suggest
keeping the logic to handle left-deep trees, which I find rather
elegant, but ditching the pending list.
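
The simplification suggested here, looping down the left spine and leaving right operands to the ordinary recursive path, might look roughly like the following standalone sketch. `Node`, `Kind` and `collect_left_spine` are hypothetical names used only for illustration, and the sketch stores each right operand as-is where a real parser would transform it recursively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for parser nodes; not PostgreSQL's structs. */
typedef enum { KIND_LEAF, KIND_AND, KIND_OR } Kind;

typedef struct Node {
    Kind kind;
    int id;                 /* leaf identifier; unused for AND/OR nodes */
    struct Node *lexpr;
    struct Node *rexpr;
} Node;

/*
 * Walk only the left spine of a same-kind boolean tree.  Each right
 * operand is stored untouched, so a parenthesized (b AND c) on the right
 * stays a single nested operand.  No pending list is needed.
 * Returns the operand count, or -1 if out[] is too small.
 */
int collect_left_spine(const Node *root, const Node **out, int cap)
{
    Kind root_kind = root->kind;
    const Node *cur = root;
    int n = 0;

    while (cur->kind != KIND_LEAF && cur->kind == root_kind) {
        if (n == cap)
            return -1;
        out[n++] = cur->rexpr;      /* collected right-to-left */
        cur = cur->lexpr;
    }
    if (n == cap)
        return -1;
    out[n++] = cur;                 /* leftmost operand */

    /* Restore source order. */
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        const Node *tmp = out[i];
        out[i] = out[j];
        out[j] = tmp;
    }
    return n;
}
```

For `a AND b AND c AND d` this yields four operands; for `a AND (b AND c) AND d` it yields three, with the middle one still an AND node.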

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: review: Non-recursive processing of AND/OR lists

From
Josh Berkus
Date:
> I think it's a waste of code to try to handle bushy trees.  A list is
> not a particularly efficient representation of the pending list; this
> will probably be slower than recusing in the common case.  I'd suggest
> keeping the logic to handle left-deep trees, which I find rather
> elegant, but ditching the pending list.

Is there going to be further discussion of this patch, or do I return it?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: review: Non-recursive processing of AND/OR lists

From
Robert Haas
Date:
On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I think it's a waste of code to try to handle bushy trees.  A list is
>> not a particularly efficient representation of the pending list; this
>> will probably be slower than recusing in the common case.  I'd suggest
>> keeping the logic to handle left-deep trees, which I find rather
>> elegant, but ditching the pending list.
>
> Is there going to be further discussion of this patch, or do I return it?

Considering it's not been updated, nor my comments responded to, in
almost two weeks, I think we return it at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I think it's a waste of code to try to handle bushy trees.  A list is
>> not a particularly efficient representation of the pending list; this
>> will probably be slower than recusing in the common case.  I'd suggest
>> keeping the logic to handle left-deep trees, which I find rather
>> elegant, but ditching the pending list.

Somehow I find it hard to believe that recursing would be more efficient than processing the items right there. The recursion is not direct either; transformExprRecurse() is going to call this function again, but after a few more switch-case comparisons.

Agreed that there's overhead in allocating list items, but is it more overhead than pushing functions on the call stack? Not sure, so I leave it to others who understand such things better than I do.

If by common-case you mean a list of just one logical AND/OR operator, then I agree that creating and destroying a list may incur overhead that is relatively very expensive. To that end, I have altered the patch, attached, to not build a pending list until we encounter a node with root_expr_kind in a right branch.

We're getting bushy-tree processing with very little extra code, but if you deem it not worthwhile or adding complexity, please feel free to rip it out.
 
>
> Is there going to be further discussion of this patch, or do I return it?

Considering it's not been updated, nor my comments responded to, in
almost two weeks, I think we return it at this point.

Sorry, I didn't notice that this patch was put back in  'Waiting on Author' state.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
Attachments

Re: review: Non-recursive processing of AND/OR lists

From
Pavel Stehule
Date:
Hello

2013/7/15 Gurjeet Singh <gurjeet@singh.im>:
> On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> >> I think it's a waste of code to try to handle bushy trees.  A list is
>> >> not a particularly efficient representation of the pending list; this
>> >> will probably be slower than recusing in the common case.  I'd suggest
>> >> keeping the logic to handle left-deep trees, which I find rather
>> >> elegant, but ditching the pending list.
>
>
> Somehow I find it hard to believe that recursing would be more efficient
> than processing the items right there. The recursion is not direct either;
> transformExprRecurse() is going to call this function again, but after a few
> more switch-case comparisons.
>
> Agreed that there's overhead in allocating list items, but is it more
> overhead than pushing functions on the call stack? Not sure, so I leave it
> to others who understand such things better than I do.
>
> If by common-case you mean a list of just one logical AND/OR operator, then
> I agree that creating and destroying a list may incur overhead that is
> relatively very expensive. To that end, I have altered the patch, attached,
> to not build a pending list until we encounter a node with root_expr_kind in
> a right branch.
>
> We're getting bushy-tree processing with very little extra code, but if you
> deem it not worthwhile or adding complexity, please feel free to rip it out.
>
>>
>> >
>> > Is there going to be further discussion of this patch, or do I return
>> > it?
>>
>> Considering it's not been updated, nor my comments responded to, in
>> almost two weeks, I think we return it at this point.
>
>
> Sorry, I didn't notice that this patch was put back in  'Waiting on Author'
> state.
>

I ran some performance tests of versions v5 and v6; v5 is a
little bit faster than v6, and v6 has a significantly higher stddev

but I am not sure whether my test is correct - I measured the speed of an
EXPLAIN statement, with the result redirected to /dev/null

The result of this test is probably related to the tested pattern of
expressions - in this case "expr or expr or expr or expr or expr ... "

10 000 exprs (ms)
 v |   avg   | stddev
---+---------+--------
 5 | 1839.14 |  13.68
 6 | 1871.77 |  48.02

==v5 profile==
209064   43.5354  postgres                 equal
207849   43.2824  postgres                 process_equivalence
37453     7.7992  postgres                 datumIsEqual
3178      0.6618  postgres                 SearchCatCache
2350      0.4894  postgres                 AllocSetAlloc

==v6 profile==
193251   45.3998  postgres                 process_equivalence
178183   41.8599  postgres                 equal
30430     7.1488  postgres                 datumIsEqual
2819      0.6623  postgres                 SearchCatCache
1951      0.4583  postgres                 AllocSetAlloc


I found that the 9.4 planner is about 1% slower than the 9.2 planner
(for the test sent by Gurjeet), but that is not related to this patch

v6 is clean and all regression tests passed

Regards

Pavel

> Best regards,
>
> --
> Gurjeet Singh
>
> http://gurjeet.singh.im/
>
> EnterpriseDB Inc.



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I did a some performance tests of v5 and v6 version and there v5 is
little bit faster than v6, and v6 has significantly higher stddev

Thanks Pavel.

The difference in average seems negligible, but stddev is interesting because v6 does less work than v5 in common cases and in the test that I had shared.

The current commitfest (2013-06) is marked as 'In Progress', so is it okay to just mark the patch as 'Ready for Committer', or should I move it to the next commitfest (2013-09)?

What's the procedure for moving a patch to the next commitfest? Do I make a fresh submission there with a link to the current submission, or is the move doable somehow in the application itself?

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Wed, Jul 17, 2013 at 8:21 AM, Gurjeet Singh <gurjeet@singh.im> wrote:

What's the procedure of moving a patch to the next commitfest?

Never mind, I see an email from Josh B. regarding this on my corporate account.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Josh Berkus
Date:
On 07/17/2013 05:21 AM, Gurjeet Singh wrote:
> On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote:
> 
>> I did a some performance tests of v5 and v6 version and there v5 is
>> little bit faster than v6, and v6 has significantly higher stddev
>>
> 
> Thanks Pavel.
> 
> The difference in average seems negligible, but stddev is interesting
> because v6 does less work than v5 in common cases and in the test that I
> had shared.
> 
> The current commitfest (2013-06) is marked as 'In Progress', so is it okay
> to just mark the patch as 'Ready for Committer' or should I move it to the
> next commitfest (2013-09).

If this is actually "ready for committer", I'll mark it as such.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: review: Non-recursive processing of AND/OR lists

From
Robert Haas
Date:
On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote:
> Agreed that there's overhead in allocating list items, but is it more
> overhead than pushing functions on the call stack? Not sure, so I leave it
> to others who understand such things better than I do.

If you think that a palloc can ever be cheaper than pushing a frame on
the callstack, you're wrong.  palloc is not some kind of an atomic
primitive.  It's implemented by the AllocSetAlloc function, and you're
going to have to push that function on the call stack, too, in order
to run it.

My main point here is that if the user writes a = 1 and b = 1 and c =
1 and d = 1, they're not going to end up with a bushy tree.  They're
going to end up with a tree that's only deep in one direction (left, I
guess) and that's the case we might want to consider optimizing.  To
obtain a bushy tree, they're going to have to write a  = 1 and (b = 1
and c = 1) and d = 1, or something like that, and I don't see why we
should stress out about that case.  It will be rare in practice.
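
To make the tree shapes concrete, the following standalone sketch builds the tree that a left-associative grammar would produce for an unparenthesized AND chain and measures how deep it is on each side. It assumes left associativity (as guessed above); `Node`, `fold_left`, `left_depth` and `right_depth` are hypothetical helpers, not parser functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for parser nodes; not PostgreSQL's structs. */
typedef enum { KIND_LEAF, KIND_AND } Kind;

typedef struct Node {
    Kind kind;
    int id;                 /* leaf identifier; unused for AND nodes */
    struct Node *lexpr;
    struct Node *rexpr;
} Node;

/*
 * Build the tree for "l0 AND l1 AND ... AND l(n-1)" the way a
 * left-associative grammar reduces it: ((l0 AND l1) AND l2) AND ...
 * pool must provide nleaves - 1 nodes; nleaves must be at least 1.
 */
Node *fold_left(Node *leaves, int nleaves, Node *pool)
{
    Node *acc = &leaves[0];

    for (int i = 1; i < nleaves; i++) {
        Node *n = &pool[i - 1];

        n->kind = KIND_AND;
        n->id = 0;
        n->lexpr = acc;             /* everything so far hangs on the left */
        n->rexpr = &leaves[i];      /* the new operand is a lone right leaf */
        acc = n;
    }
    return acc;
}

/* Number of AND nodes met while following left links from the root. */
int left_depth(const Node *n)
{
    int d = 0;

    while (n->kind != KIND_LEAF) {
        d++;
        n = n->lexpr;
    }
    return d;
}

/* Number of AND nodes met while following right links from the root. */
int right_depth(const Node *n)
{
    int d = 0;

    while (n->kind != KIND_LEAF) {
        d++;
        n = n->rexpr;
    }
    return d;
}
```

With four leaves the left depth is 3 and the right depth is 1; only explicit parentheses on the right-hand side would put an AND node in a right branch.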

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Wed, Jul 17, 2013 at 1:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote:
> Agreed that there's overhead in allocating list items, but is it more
> overhead than pushing functions on the call stack? Not sure, so I leave it
> to others who understand such things better than I do.

If you think that a palloc can ever be cheaper that pushing a frame on
the callstack, you're wrong.  palloc is not some kind of an atomic
primitive.  It's implemented by the AllocSetAlloc function, and you're
going to have to push that function on the call stack, too, in order
to run it.

Agreed. I take my objection back. Even if AllocSetAlloc() reuses memory that was pfree'd earlier, it'll still be at least as expensive as recursing.
 

My main point here is that if the user writes a = 1 and b = 1 and c =
1 and d = 1, they're not going to end up with a bushy tree.  They're
going to end up with a tree that's only deep in one direction (left, I
guess) and that's the case we might want to consider optimizing.  To
obtain a bushy tree, they're going to have to write a  = 1 and (b = 1
and c = 1) and d = 1, or something like that, and I don't see why we
should stress out about that case.  It will be rare in practice.

In v6 of the patch, I have deferred the 'pending' list initialization until we actually hit a candidate right-branch. So in the common case the pending list will never be populated, and if we find a bushy or right-deep tree (for some reason an ORM/tool may choose to build AND/OR lists that may end up being right-deep when in Postgres), then the pending list will be used to process them iteratively.

Does that alleviate your concern about 'pending' list management causing an overhead?

Agreed that bushy/right-deep trees are a remote corner case, but we are addressing a remote corner case in the first place (insanely long AND lists), so why not handle another remote corner case right now if it doesn't cause an overhead for the common case?

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Robert Haas
Date:
On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote:
> In v6 of the  patch, I have deferred the 'pending' list initialization to
> until we actually hit a candidate right-branch. So in the common case the
> pending list will never be populated, and if we find a bushy or right-deep
> tree (for some reason an ORM/tool may choose to build AND/OR lists that may
> end being right-deep when in Postgres), then the pending list will be used
> to process them iteratively.
>
> Does that alleviate your concern about 'pending' list management causing an
> overhead.
>
> Agreed that bushy/right-deep trees are a remote corner case, but we are
> addressing a remote corner case in the first place (insanely long AND lists)
> and why not handle another remote corner case right now if it doesn't cause
> an overhead for common case.

Because simpler code is less likely to have bugs and is easier to
maintain.   It's worth noting that the change you're proposing is in
fact user-visible, as demonstrated by the fact that you had to update
the regression test output:

-                                 |   WHERE (((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND
(rsl.sl_len_cm <= rsh.slmaxlen_cm));
+                                 |   WHERE ((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm
<= rsh.slmaxlen_cm));

Now, I think that change is actually an improvement, because here's
what that WHERE clause looked like when it was entered:
    WHERE rsl.sl_color = rsh.slcolor
      AND rsl.sl_len_cm >= rsh.slminlen_cm
      AND rsl.sl_len_cm <= rsh.slmaxlen_cm;

But flattening a = 1 AND (b = 1 AND c = 1 AND d = 1) AND e = 1 to a
flat list doesn't have any of the same advantages.

At the end of the day, this is a judgement call, and I'm giving you
mine.  If somebody else wants to weigh in, that's fine.  I can't think
of anything that would actually be outright broken under your proposed
approach, but my personal feeling is that it's better to only add the
amount of code that we know is needed to solve the problem actually
observed in practice, and no more.
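
The regression-output change quoted above comes down to how an AND tree is printed back: one pair of parentheses per binary node versus one pair around a flat list. A minimal standalone sketch of the two printing styles follows; `Expr`, `append`, `deparse_nested`, `flat_operands` and `deparse_flat` are hypothetical names, and this is not the logic in ruleutils.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical node: text is non-NULL for a leaf, NULL for an AND node. */
typedef struct Expr {
    const char *text;
    const struct Expr *l;
    const struct Expr *r;
} Expr;

/* Append s to the NUL-terminated buf without writing past cap bytes. */
void append(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < cap)
        strncat(buf, s, cap - used - 1);
}

/* Binary view: every AND node gets its own parentheses. */
void deparse_nested(const Expr *e, char *buf, size_t cap)
{
    if (e->text != NULL) {
        append(buf, cap, e->text);
        return;
    }
    append(buf, cap, "(");
    deparse_nested(e->l, buf, cap);
    append(buf, cap, " AND ");
    deparse_nested(e->r, buf, cap);
    append(buf, cap, ")");
}

/* Emit the operands of an AND tree with no inner parentheses. */
void flat_operands(const Expr *e, char *buf, size_t cap)
{
    if (e->text != NULL) {
        append(buf, cap, e->text);
        return;
    }
    flat_operands(e->l, buf, cap);
    append(buf, cap, " AND ");
    flat_operands(e->r, buf, cap);
}

/* N-ary view: one pair of parentheses around the whole AND list. */
void deparse_flat(const Expr *e, char *buf, size_t cap)
{
    if (e->text != NULL) {
        append(buf, cap, e->text);
        return;
    }
    append(buf, cap, "(");
    flat_operands(e, buf, cap);
    append(buf, cap, ")");
}
```

Note that `deparse_flat` in this sketch also erases parentheses the user wrote around a right-hand group, which is the case argued above to bring no comparable benefit.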

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: review: Non-recursive processing of AND/OR lists

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote:
>> Agreed that bushy/right-deep trees are a remote corner case, but we are
>> addressing a remote corner case in the first place (insanely long AND lists)
>> and why not handle another remote corner case right now if it doesn't cause
>> an overhead for common case.

> Because simpler code is less likely to have bugs and is easier to
> maintain.

I agree with that point, but one should also remember Polya's Inventor's
Paradox: the more general problem may be easier to solve.  That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation.  The
current patch fails to meet this expectation, but maybe you just haven't
thought about it the right way.

My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner.  It doesn't appear to me that you've done any research on that
point whatsoever --- you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.

> It's worth noting that the change you're proposing is in
> fact user-visible, as demonstrated by the fact that you had to update
> the regression test output:

The point to worry about here is whether rule dump and reload is still
safe.  In particular, the logic in ruleutils.c for deciding whether it's
safe to omit parentheses has only really been thought about/tested for
the binary AND/OR case.  Although that code can dump N-way AND/OR
because it's also used to print post-planner expression trees in EXPLAIN,
that case has never been held to the standard of "is the parser
guaranteed to interpret this expression the same as before?".  Perhaps
it's fine, but has anyone looked at that issue?
        regards, tom lane



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Because simpler code is less likely to have bugs and is easier to
> maintain.

I agree with that point, but one should also remember Polya's Inventor's
Paradox: the more general problem may be easier to solve.  That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation.  The
current patch fails to meet this expectation,

The current patch does completely flatten any type of tree (left/right-deep or bushy) without recursing, and right-deep and bushy tree processing is what Robert is recommending to defer to recursive processing. Maybe I haven't considered a case where it doesn't flatten the tree; do you have an example in mind?
 
but maybe you just haven't
thought about it the right way.

My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner.  It doesn't appear to me that you've done any research on that
point whatsoever

No, I haven't, and I might not be able to research it for a few more weeks.
 
you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.

I will fix that.

Best regards,
--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.

Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
On Thu, Jul 18, 2013 at 1:54 PM, Gurjeet Singh <gurjeet@singh.im> wrote:
> On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>>> Because simpler code is less likely to have bugs and is easier to
>>> maintain.
>>
>> I agree with that point, but one should also remember Polya's Inventor's
>> Paradox: the more general problem may be easier to solve.  That is, if
>> done right, code that fully flattens an AND tree might actually be
>> simpler than code that does just a subset of the transformation.  The
>> current patch fails to meet this expectation,
>
> The current patch does completely flatten any type of tree (left/right-deep or bushy) without recursing, and right-deep and bushy tree processing is what Robert is recommending to defer to recursive processing. Maybe I haven't considered a case where it doesn't flatten the tree; do you have an example in mind?
>
>> but maybe you just haven't
>> thought about it the right way.

I tried to eliminate the 'pending' list, but I don't see a way around it. We need temporary storage somewhere to hold the branches encountered on the right; in the recursive case, the call stack was serving that purpose.

>> My concerns about this patch have little to do with that, though, and
>> much more to do with the likelihood that it breaks some other piece of
>> code that is expecting AND/OR to be strictly binary operators, which
>> is what they've always been in parsetrees that haven't reached the
>> planner.  It doesn't appear to me that you've done any research on that
>> point whatsoever
>
> No, I haven't, and I might not be able to research it for a few more weeks.

There are about 30 files (including the optimizer and executor) that match a case-insensitive search for BoolExpr, and I scanned those for uses of the member 'args'. In every instance where BoolExpr.args is accessed, it is treated as a null-terminated list. The one exception I could find was in the context of a NOT expression: not_clause() in clauses.c.

>> you have not even updated the comment for BoolExpr
>> (in primnodes.h) that this patch falsifies.
>
> I will fix that.

I think this line in that comment already covers the fact that in some "special" cases a BoolExpr may have more than 2 arguments:

"There are also a few special cases where more arguments can appear before optimization."

I have updated the comment nevertheless, and removed another comment in parse_expr.c that claimed that spot was the only place where a BoolExpr with more than 2 args is generated.

I have isolated the code for right-deep and bushy tree processing via the macro PROCESS_BUSHY_TREES. Also, I have shortened some variable names while retaining their meaning.

Please find the updated patch attached (based on master).
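
As a rough illustration of why code that walks BoolExpr.args as a list is indifferent to the number of arguments, here is a small evaluator sketch. The types BNode and Cell and the function eval_bool are hypothetical stand-ins, not PostgreSQL's node or List types, and the logic is plain two-valued (no SQL NULL handling). Only the NOT case depends on the list having exactly one element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy types for illustration; NOT PostgreSQL's BoolExpr or List. */
typedef enum { B_CONST, B_AND, B_OR, B_NOT } BKind;

struct BNode;

typedef struct Cell
{
    const struct BNode *node;
    const struct Cell  *next;   /* NULL marks the end of the list */
} Cell;

typedef struct BNode
{
    BKind       kind;
    bool        value;          /* used by B_CONST only */
    const Cell *args;           /* used by B_AND / B_OR / B_NOT */
} BNode;

bool
eval_bool(const BNode *n)
{
    const Cell *c;

    switch (n->kind)
    {
        case B_CONST:
            return n->value;
        case B_NOT:
            /* the only case that assumes a fixed argument count */
            assert(n->args != NULL && n->args->next == NULL);
            return !eval_bool(n->args->node);
        case B_AND:
            /* works for 2, 3, or N arguments; stops at the first false */
            for (c = n->args; c != NULL; c = c->next)
                if (!eval_bool(c->node))
                    return false;
            return true;
        case B_OR:
            /* works for 2, 3, or N arguments; stops at the first true */
            for (c = n->args; c != NULL; c = c->next)
                if (eval_bool(c->node))
                    return true;
            return false;
    }
    return false;               /* not reached for valid kinds */
}
```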

Best regards,
Attachments

Re: review: Non-recursive processing of AND/OR lists

From
Tom Lane
Date:
Gurjeet Singh <gurjeet@singh.im> writes:
> I tried to eliminate the 'pending' list, but I don't see a way around it.
> We need temporary storage somewhere to store the branches encountered on
> the right; in recursion case the call stack was serving that purpose.

I still think we should fix this in the grammar, rather than introducing
complicated logic to try to get rid of the recursion later.  For example,
as attached.

The existing A_Expr representation of raw AND/OR nodes isn't conducive to
this, but it's not that hard to change it.  The attached patch chooses to
use BoolExpr as both the raw and transformed representation of AND/OR/NOT;
we could alternatively invent some new raw-parsetree node type, but I
don't see any advantage in that.
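
The core of the grammar-level approach can be shown in isolation. Because "a AND b AND c" is parsed left-associatively, each reduction sees the AND node built so far as its left operand, so appending the new right operand keeps the node flat and no deep tree is ever built. The sketch below is a simplified stand-in for the attached patch's makeAndExpr/makeOrExpr: it assumes a hypothetical Node type with a growable array instead of PostgreSQL's List, and make_leaf and make_bool are made-up names.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy node for illustration; NOT PostgreSQL's BoolExpr. */
typedef enum { N_LEAF, N_AND, N_OR } NodeKind;

typedef struct Node
{
    NodeKind      kind;
    int           id;       /* identifies a leaf */
    struct Node **args;     /* operands of an N_AND / N_OR node */
    int           nargs;
    int           cap;
} Node;

Node *
make_leaf(int id)
{
    Node *n = calloc(1, sizeof(Node));

    if (n == NULL)
        abort();
    n->kind = N_LEAF;
    n->id = id;
    return n;
}

/* Append one operand, growing the array geometrically as needed. */
static void
append_arg(Node *n, Node *arg)
{
    if (n->nargs == n->cap)
    {
        int    newcap = (n->cap > 0) ? 2 * n->cap : 4;
        Node **tmp = realloc(n->args, newcap * sizeof(Node *));

        if (tmp == NULL)
            abort();
        n->args = tmp;
        n->cap = newcap;
    }
    n->args[n->nargs++] = arg;
}

/*
 * What a grammar action for "expr AND expr" or "expr OR expr" would call.
 * If the left operand is already a node of the same kind, extend it in
 * place; otherwise build a fresh two-argument node.
 */
Node *
make_bool(NodeKind op, Node *lexpr, Node *rexpr)
{
    Node *n;

    assert(op == N_AND || op == N_OR);

    if (lexpr->kind == op)
    {
        append_arg(lexpr, rexpr);
        return lexpr;
    }

    n = calloc(1, sizeof(Node));
    if (n == NULL)
        abort();
    n->kind = op;
    append_arg(n, lexpr);
    append_arg(n, rexpr);
    return n;
}
```

Three successive calls for "1 AND 2 AND 3 AND 4" yield a single node with four arguments. A right-nested operand, as in "5 AND (6 AND 7)", stays nested in this sketch because only the left operand is inspected.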

I continue to think that more thought is needed about downstream
processing.  For instance, at least the comment at the head of prepqual.c
is wrong now, and it's worth wondering whether the planner still needs to
worry about AND/OR flattening at all.  (It probably does, to deal with
view-flattening cases for example; but it's worth considering whether
anything could be saved if we stopped doing that.)

            regards, tom lane

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 1e48a7f..95f5dd2 100644
*** a/src/backend/nodes/nodeFuncs.c
--- b/src/backend/nodes/nodeFuncs.c
*************** raw_expression_tree_walker(Node *node,
*** 3047,3052 ****
--- 3047,3060 ----
                  /* operator name is deemed uninteresting */
              }
              break;
+         case T_BoolExpr:
+             {
+                 BoolExpr   *expr = (BoolExpr *) node;
+
+                 if (walker(expr->args, context))
+                     return true;
+             }
+             break;
          case T_ColumnRef:
              /* we assume the fields contain nothing interesting */
              break;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 10e8139..cd4bce1 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outAExpr(StringInfo str, const A_Expr *
*** 2437,2451 ****
              appendStringInfoChar(str, ' ');
              WRITE_NODE_FIELD(name);
              break;
-         case AEXPR_AND:
-             appendStringInfoString(str, " AND");
-             break;
-         case AEXPR_OR:
-             appendStringInfoString(str, " OR");
-             break;
-         case AEXPR_NOT:
-             appendStringInfoString(str, " NOT");
-             break;
          case AEXPR_OP_ANY:
              appendStringInfoChar(str, ' ');
              WRITE_NODE_FIELD(name);
--- 2437,2442 ----
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7b9895d..dd04b1a 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static void insertSelectOptions(SelectSt
*** 151,156 ****
--- 151,159 ----
  static Node *makeSetOp(SetOperation op, bool all, Node *larg, Node *rarg);
  static Node *doNegate(Node *n, int location);
  static void doNegateFloat(Value *v);
+ static Node *makeAndExpr(Node *lexpr, Node *rexpr, int location);
+ static Node *makeOrExpr(Node *lexpr, Node *rexpr, int location);
+ static Node *makeNotExpr(Node *expr, int location);
  static Node *makeAArrayExpr(List *elements, int location);
  static Node *makeXmlExpr(XmlExprOp op, char *name, List *named_args,
                           List *args, int location);
*************** a_expr:        c_expr                                    { $$ = $1; }
*** 10849,10859 ****
                  { $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }

              | a_expr AND a_expr
!                 { $$ = (Node *) makeA_Expr(AEXPR_AND, NIL, $1, $3, @2); }
              | a_expr OR a_expr
!                 { $$ = (Node *) makeA_Expr(AEXPR_OR, NIL, $1, $3, @2); }
              | NOT a_expr
!                 { $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, $2, @1); }

              | a_expr LIKE a_expr
                  { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); }
--- 10852,10862 ----
                  { $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }

              | a_expr AND a_expr
!                 { $$ = makeAndExpr($1, $3, @2); }
              | a_expr OR a_expr
!                 { $$ = makeOrExpr($1, $3, @2); }
              | NOT a_expr
!                 { $$ = makeNotExpr($2, @1); }

              | a_expr LIKE a_expr
                  { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); }
*************** a_expr:        c_expr                                    { $$ = $1; }
*** 11022,11032 ****
                  }
              | a_expr IS NOT DISTINCT FROM a_expr        %prec IS
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
!                                     (Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
!                                                               "=", $1, $6, @2),
!                                              @2);
!
                  }
              | a_expr IS OF '(' type_list ')'            %prec IS
                  {
--- 11025,11033 ----
                  }
              | a_expr IS NOT DISTINCT FROM a_expr        %prec IS
                  {
!                     $$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
!                                                                "=", $1, $6, @2),
!                                      @2);
                  }
              | a_expr IS OF '(' type_list ')'            %prec IS
                  {
*************** a_expr:        c_expr                                    { $$ = $1; }
*** 11044,11086 ****
               */
              | a_expr BETWEEN opt_asymmetric b_expr AND b_expr        %prec BETWEEN
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_AND, NIL,
                          (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
                          (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
!                                              @2);
                  }
              | a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr    %prec BETWEEN
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_OR, NIL,
                          (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
                          (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
!                                              @2);
                  }
              | a_expr BETWEEN SYMMETRIC b_expr AND b_expr            %prec BETWEEN
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_OR, NIL,
!                         (Node *) makeA_Expr(AEXPR_AND, NIL,
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
!                                             @2),
!                         (Node *) makeA_Expr(AEXPR_AND, NIL,
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2),
!                                             @2),
!                                              @2);
                  }
              | a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr        %prec BETWEEN
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_AND, NIL,
!                         (Node *) makeA_Expr(AEXPR_OR, NIL,
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
!                                             @2),
!                         (Node *) makeA_Expr(AEXPR_OR, NIL,
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2),
!                                             @2),
!                                              @2);
                  }
              | a_expr IN_P in_expr
                  {
--- 11045,11087 ----
               */
              | a_expr BETWEEN opt_asymmetric b_expr AND b_expr        %prec BETWEEN
                  {
!                     $$ = makeAndExpr(
                          (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
                          (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
!                                      @2);
                  }
              | a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr    %prec BETWEEN
                  {
!                     $$ = makeOrExpr(
                          (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
                          (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
!                                     @2);
                  }
              | a_expr BETWEEN SYMMETRIC b_expr AND b_expr            %prec BETWEEN
                  {
!                     $$ = makeOrExpr(
!                           makeAndExpr(
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2),
!                                       @2),
!                           makeAndExpr(
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2),
!                                       @2),
!                                     @2);
                  }
              | a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr        %prec BETWEEN
                  {
!                     $$ = makeAndExpr(
!                            makeOrExpr(
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2),
!                                       @2),
!                            makeOrExpr(
                              (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2),
                              (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2),
!                                       @2),
!                                      @2);
                  }
              | a_expr IN_P in_expr
                  {
*************** a_expr:        c_expr                                    { $$ = $1; }
*** 11114,11120 ****
                          n->operName = list_make1(makeString("="));
                          n->location = @3;
                          /* Stick a NOT on top */
!                         $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, (Node *) n, @2);
                      }
                      else
                      {
--- 11115,11121 ----
                          n->operName = list_make1(makeString("="));
                          n->location = @3;
                          /* Stick a NOT on top */
!                         $$ = makeNotExpr((Node *) n, @2);
                      }
                      else
                      {
*************** a_expr:        c_expr                                    { $$ = $1; }
*** 11162,11171 ****
                  }
              | a_expr IS NOT DOCUMENT_P                %prec IS
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
!                                              makeXmlExpr(IS_DOCUMENT, NULL, NIL,
!                                                          list_make1($1), @2),
!                                              @2);
                  }
          ;

--- 11163,11171 ----
                  }
              | a_expr IS NOT DOCUMENT_P                %prec IS
                  {
!                     $$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL,
!                                                  list_make1($1), @2),
!                                      @2);
                  }
          ;

*************** b_expr:        c_expr
*** 11216,11223 ****
                  }
              | b_expr IS NOT DISTINCT FROM b_expr    %prec IS
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL,
!                         NULL, (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $6, @2), @2);
                  }
              | b_expr IS OF '(' type_list ')'        %prec IS
                  {
--- 11216,11224 ----
                  }
              | b_expr IS NOT DISTINCT FROM b_expr    %prec IS
                  {
!                     $$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT,
!                                                                "=", $1, $6, @2),
!                                      @2);
                  }
              | b_expr IS OF '(' type_list ')'        %prec IS
                  {
*************** b_expr:        c_expr
*** 11234,11243 ****
                  }
              | b_expr IS NOT DOCUMENT_P                %prec IS
                  {
!                     $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL,
!                                              makeXmlExpr(IS_DOCUMENT, NULL, NIL,
!                                                          list_make1($1), @2),
!                                              @2);
                  }
          ;

--- 11235,11243 ----
                  }
              | b_expr IS NOT DOCUMENT_P                %prec IS
                  {
!                     $$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL,
!                                                  list_make1($1), @2),
!                                      @2);
                  }
          ;

*************** doNegateFloat(Value *v)
*** 13693,13698 ****
--- 13693,13738 ----
  }

  static Node *
+ makeAndExpr(Node *lexpr, Node *rexpr, int location)
+ {
+     /* Flatten "a AND b AND c ..." to a single BoolExpr on sight */
+     if (IsA(lexpr, BoolExpr))
+     {
+         BoolExpr *blexpr = (BoolExpr *) lexpr;
+
+         if (blexpr->boolop == AND_EXPR)
+         {
+             blexpr->args = lappend(blexpr->args, rexpr);
+             return (Node *) blexpr;
+         }
+     }
+     return (Node *) makeBoolExpr(AND_EXPR, list_make2(lexpr, rexpr), location);
+ }
+
+ static Node *
+ makeOrExpr(Node *lexpr, Node *rexpr, int location)
+ {
+     /* Flatten "a OR b OR c ..." to a single BoolExpr on sight */
+     if (IsA(lexpr, BoolExpr))
+     {
+         BoolExpr *blexpr = (BoolExpr *) lexpr;
+
+         if (blexpr->boolop == OR_EXPR)
+         {
+             blexpr->args = lappend(blexpr->args, rexpr);
+             return (Node *) blexpr;
+         }
+     }
+     return (Node *) makeBoolExpr(OR_EXPR, list_make2(lexpr, rexpr), location);
+ }
+
+ static Node *
+ makeNotExpr(Node *expr, int location)
+ {
+     return (Node *) makeBoolExpr(NOT_EXPR, list_make1(expr), location);
+ }
+
+ static Node *
  makeAArrayExpr(List *elements, int location)
  {
      A_ArrayExpr *n = makeNode(A_ArrayExpr);
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index aa704bb..a71f611 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformJoinUsingClause(ParseState *pst
*** 332,338 ****
                           RangeTblEntry *leftRTE, RangeTblEntry *rightRTE,
                           List *leftVars, List *rightVars)
  {
!     Node       *result = NULL;
      ListCell   *lvars,
                 *rvars;

--- 332,339 ----
                           RangeTblEntry *leftRTE, RangeTblEntry *rightRTE,
                           List *leftVars, List *rightVars)
  {
!     Node       *result;
!     List       *andargs = NIL;
      ListCell   *lvars,
                 *rvars;

*************** transformJoinUsingClause(ParseState *pst
*** 358,375 ****
                               copyObject(lvar), copyObject(rvar),
                               -1);

!         /* And combine into an AND clause, if multiple join columns */
!         if (result == NULL)
!             result = (Node *) e;
!         else
!         {
!             A_Expr       *a;
!
!             a = makeA_Expr(AEXPR_AND, NIL, result, (Node *) e, -1);
!             result = (Node *) a;
!         }
      }

      /*
       * Since the references are already Vars, and are certainly from the input
       * relations, we don't have to go through the same pushups that
--- 359,374 ----
                               copyObject(lvar), copyObject(rvar),
                               -1);

!         /* Prepare to combine into an AND clause, if multiple join columns */
!         andargs = lappend(andargs, e);
      }

+     /* Only need an AND if there's more than one join column */
+     if (list_length(andargs) == 1)
+         result = (Node *) linitial(andargs);
+     else
+         result = (Node *) makeBoolExpr(AND_EXPR, andargs, -1);
+
      /*
       * Since the references are already Vars, and are certainly from the input
       * relations, we don't have to go through the same pushups that
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 81c9338..cb76133 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** bool        Transform_null_equals = false;
*** 41,55 ****
  static Node *transformExprRecurse(ParseState *pstate, Node *expr);
  static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
  static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
- static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
- static Node *transformAExprOr(ParseState *pstate, A_Expr *a);
- static Node *transformAExprNot(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
  static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a);
  static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprIn(ParseState *pstate, A_Expr *a);
  static Node *transformFuncCall(ParseState *pstate, FuncCall *fn);
  static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c);
  static Node *transformSubLink(ParseState *pstate, SubLink *sublink);
--- 41,53 ----
  static Node *transformExprRecurse(ParseState *pstate, Node *expr);
  static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
  static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a);
  static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a);
  static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprOf(ParseState *pstate, A_Expr *a);
  static Node *transformAExprIn(ParseState *pstate, A_Expr *a);
+ static Node *transformBoolExpr(ParseState *pstate, BoolExpr *a);
  static Node *transformFuncCall(ParseState *pstate, FuncCall *fn);
  static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c);
  static Node *transformSubLink(ParseState *pstate, SubLink *sublink);
*************** transformExprRecurse(ParseState *pstate,
*** 223,237 ****
                      case AEXPR_OP:
                          result = transformAExprOp(pstate, a);
                          break;
-                     case AEXPR_AND:
-                         result = transformAExprAnd(pstate, a);
-                         break;
-                     case AEXPR_OR:
-                         result = transformAExprOr(pstate, a);
-                         break;
-                     case AEXPR_NOT:
-                         result = transformAExprNot(pstate, a);
-                         break;
                      case AEXPR_OP_ANY:
                          result = transformAExprOpAny(pstate, a);
                          break;
--- 221,226 ----
*************** transformExprRecurse(ParseState *pstate,
*** 258,263 ****
--- 247,256 ----
                  break;
              }

+         case T_BoolExpr:
+             result = transformBoolExpr(pstate, (BoolExpr *) expr);
+             break;
+
          case T_FuncCall:
              result = transformFuncCall(pstate, (FuncCall *) expr);
              break;
*************** transformExprRecurse(ParseState *pstate,
*** 337,343 ****
          case T_DistinctExpr:
          case T_NullIfExpr:
          case T_ScalarArrayOpExpr:
-         case T_BoolExpr:
          case T_FieldSelect:
          case T_FieldStore:
          case T_RelabelType:
--- 330,335 ----
*************** transformAExprOp(ParseState *pstate, A_E
*** 919,964 ****
  }

  static Node *
- transformAExprAnd(ParseState *pstate, A_Expr *a)
- {
-     Node       *lexpr = transformExprRecurse(pstate, a->lexpr);
-     Node       *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-     lexpr = coerce_to_boolean(pstate, lexpr, "AND");
-     rexpr = coerce_to_boolean(pstate, rexpr, "AND");
-
-     return (Node *) makeBoolExpr(AND_EXPR,
-                                  list_make2(lexpr, rexpr),
-                                  a->location);
- }
-
- static Node *
- transformAExprOr(ParseState *pstate, A_Expr *a)
- {
-     Node       *lexpr = transformExprRecurse(pstate, a->lexpr);
-     Node       *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-     lexpr = coerce_to_boolean(pstate, lexpr, "OR");
-     rexpr = coerce_to_boolean(pstate, rexpr, "OR");
-
-     return (Node *) makeBoolExpr(OR_EXPR,
-                                  list_make2(lexpr, rexpr),
-                                  a->location);
- }
-
- static Node *
- transformAExprNot(ParseState *pstate, A_Expr *a)
- {
-     Node       *rexpr = transformExprRecurse(pstate, a->rexpr);
-
-     rexpr = coerce_to_boolean(pstate, rexpr, "NOT");
-
-     return (Node *) makeBoolExpr(NOT_EXPR,
-                                  list_make1(rexpr),
-                                  a->location);
- }
-
- static Node *
  transformAExprOpAny(ParseState *pstate, A_Expr *a)
  {
      Node       *lexpr = transformExprRecurse(pstate, a->lexpr);
--- 911,916 ----
*************** transformAExprIn(ParseState *pstate, A_E
*** 1238,1243 ****
--- 1190,1231 ----
  }

  static Node *
+ transformBoolExpr(ParseState *pstate, BoolExpr *a)
+ {
+     List       *args = NIL;
+     const char *opname;
+     ListCell   *lc;
+
+     switch (a->boolop)
+     {
+         case AND_EXPR:
+             opname = "AND";
+             break;
+         case OR_EXPR:
+             opname = "OR";
+             break;
+         case NOT_EXPR:
+             opname = "NOT";
+             break;
+         default:
+             elog(ERROR, "unrecognized boolop: %d", (int) a->boolop);
+             opname = NULL;        /* keep compiler quiet */
+             break;
+     }
+
+     foreach(lc, a->args)
+     {
+         Node       *arg = (Node *) lfirst(lc);
+
+         arg = transformExprRecurse(pstate, arg);
+         arg = coerce_to_boolean(pstate, arg, opname);
+         args = lappend(args, arg);
+     }
+
+     return (Node *) makeBoolExpr(a->boolop, args, a->location);
+ }
+
+ static Node *
  transformFuncCall(ParseState *pstate, FuncCall *fn)
  {
      List       *targs;
*************** make_row_comparison_op(ParseState *pstat
*** 2428,2437 ****
      /*
       * For = and <> cases, we just combine the pairwise operators with AND or
       * OR respectively.
-      *
-      * Note: this is presently the only place where the parser generates
-      * BoolExpr with more than two arguments.  Should be OK since the rest of
-      * the system thinks BoolExpr is N-argument anyway.
       */
      if (rctype == ROWCOMPARE_EQ)
          return (Node *) makeBoolExpr(AND_EXPR, opexprs, location);
--- 2416,2421 ----
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 18d4991..59916eb 100644
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct ParamRef
*** 225,233 ****
  typedef enum A_Expr_Kind
  {
      AEXPR_OP,                    /* normal operator */
-     AEXPR_AND,                    /* booleans - name field is unused */
-     AEXPR_OR,
-     AEXPR_NOT,
      AEXPR_OP_ANY,                /* scalar op ANY (array) */
      AEXPR_OP_ALL,                /* scalar op ALL (array) */
      AEXPR_DISTINCT,                /* IS DISTINCT FROM - name must be "=" */
--- 225,230 ----
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 9cce60b..ed73109 100644
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
*************** typedef struct ScalarArrayOpExpr
*** 458,469 ****
   * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
   *
   * Notice the arguments are given as a List.  For NOT, of course the list
!  * must always have exactly one element.  For AND and OR, the executor can
!  * handle any number of arguments.    The parser generally treats AND and OR
!  * as binary and so it typically only produces two-element lists, but the
!  * optimizer will flatten trees of AND and OR nodes to produce longer lists
!  * when possible.  There are also a few special cases where more arguments
!  * can appear before optimization.
   */
  typedef enum BoolExprType
  {
--- 458,465 ----
   * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT
   *
   * Notice the arguments are given as a List.  For NOT, of course the list
!  * must always have exactly one element.  For AND and OR, there can be two
!  * or more arguments.
   */
  typedef enum BoolExprType
  {
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6c51d0d..c3188a7 100644
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
*************** shoe_ready| SELECT rsh.shoename,
*** 2117,2123 ****
      int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
     FROM shoe rsh,
      shoelace rsl
!   WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace| SELECT s.sl_name,
      s.sl_avail,
      s.sl_color,
--- 2117,2123 ----
      int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail
     FROM shoe rsh,
      shoelace rsl
!   WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm));
  shoelace| SELECT s.sl_name,
      s.sl_avail,
      s.sl_color,
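
The rules.out change above follows from the node shape. Assuming a deparser that wraps each AND/OR node in one pair of parentheses, two nested binary ANDs print with an extra level of parentheses, while a single three-argument AND prints without it. A toy deparser shows the difference; the PNode type and the deparse function are hypothetical and are not taken from ruleutils.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy node for illustration; NOT PostgreSQL's BoolExpr. */
typedef struct PNode
{
    const char                *text;   /* non-NULL for a leaf condition */
    const char                *op;     /* "AND" or "OR" for an operator node */
    const struct PNode *const *args;   /* operands of an operator node */
    int                        nargs;
} PNode;

/* Append s to the NUL-terminated string in buf, truncating if needed. */
static void
put(char *buf, size_t size, const char *s)
{
    size_t len = strlen(buf);

    if (len + 1 < size)
        snprintf(buf + len, size - len, "%s", s);
}

/* Append the text form of n to buf; one pair of parentheses per operator node. */
void
deparse(const PNode *n, char *buf, size_t size)
{
    int i;

    if (n->text != NULL)
    {
        put(buf, size, n->text);
        return;
    }

    put(buf, size, "(");
    for (i = 0; i < n->nargs; i++)
    {
        if (i > 0)
        {
            put(buf, size, " ");
            put(buf, size, n->op);
            put(buf, size, " ");
        }
        deparse(n->args[i], buf, size);
    }
    put(buf, size, ")");
}
```

With leaves "(x = 1)", "(y >= 2)", and "(z <= 3)", the nested form deparses to "(((x = 1) AND (y >= 2)) AND (z <= 3))" and the flat form to "((x = 1) AND (y >= 2) AND (z <= 3))", mirroring the shape of the old and new expected output.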

Re: review: Non-recursive processing of AND/OR lists

From
Tom Lane
Date:
I wrote:
> Gurjeet Singh <gurjeet@singh.im> writes:
>> I tried to eliminate the 'pending' list, but I don't see a way around it.
>> We need temporary storage somewhere to store the branches encountered on
>> the right; in recursion case the call stack was serving that purpose.

> I still think we should fix this in the grammar, rather than introducing
> complicated logic to try to get rid of the recursion later.  For example,
> as attached.

I went looking for (and found) some additional obsoleted comments, and
convinced myself that ruleutils.c is okay as-is, and pushed this.
        regards, tom lane



Re: review: Non-recursive processing of AND/OR lists

From
Gurjeet Singh
Date:
Thanks!

On Mon, Jun 16, 2014 at 3:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Gurjeet Singh <gurjeet@singh.im> writes:
>>> I tried to eliminate the 'pending' list, but I don't see a way around it.
>>> We need temporary storage somewhere to store the branches encountered on
>>> the right; in recursion case the call stack was serving that purpose.
>
>> I still think we should fix this in the grammar, rather than introducing
>> complicated logic to try to get rid of the recursion later.  For example,
>> as attached.
>
> I went looking for (and found) some additional obsoleted comments, and
> convinced myself that ruleutils.c is okay as-is, and pushed this.
>
>                         regards, tom lane



-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com