Обсуждение: review: Non-recursive processing of AND/OR lists
Hello related to https://commitfest.postgresql.org/action/patch_view?id=1130 http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com * motivation: remove recursive procession of AND/OR list (hangs with 10062 and more subexpressions) * patch is short, clean and respect postgresql source code requirements * patch was applied cleanly without warnings * all regression tests was passed * I successfully evaluated expression with 100000 subexpressions * there is no significant slowdown possible improvements a = (A_Expr*) list_nth(pending, 0); a = (A_Expr*) linitial(pending); not well comment should be -- "If the right branch is also an SAME condition, append it to the" + /* + * If the right branch is also an AND condition, append it to the + * pending list, to be processed later. This allows us to walk even + * bushy trees, not just left-deep trees. + */ + if (IsA(a->rexpr, A_Expr) && ((A_Expr*)a->rexpr)->kind == root_kind) + { + pending = lappend(pending, a->rexpr); + } + else + { + expr = transformExprRecurse(pstate, a->rexpr); + expr = coerce_to_boolean(pstate, expr, root_kind == AEXPR_AND ? "AND" : "OR"); + exprs = lcons(expr, exprs); + } I don't see any other issues, so after fixing comments this patch is ready for commit Regards Pavel Stehule
Thanks for the review Pavel.<br /><br /><div class="gmail_quote">On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <span dir="ltr"><<ahref="mailto:pavel.stehule@gmail.com" target="_blank">pavel.stehule@gmail.com</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello<br /><br />related to<br /><br /><a href="https://commitfest.postgresql.org/action/patch_view?id=1130" target="_blank">https://commitfest.postgresql.org/action/patch_view?id=1130</a><br/><a href="http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com" target="_blank">http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com</a><br /><br/><br /> * motivation: remove recursive procession of AND/OR list (hangs with<br /> 10062 and more subexpressions)<br/><br /> * patch is short, clean and respect postgresql source code requirements<br /> * patch was appliedcleanly without warnings<br /> * all regression tests was passed<br /> * I successfully evaluated expression with100000 subexpressions<br /> * there is no significant slowdown<br /><br /> possible improvements<br /><br /> a = (A_Expr*)list_nth(pending, 0);<br /><br /> a = (A_Expr*) linitial(pending);<br /><br /> not well comment<br /><br /> shouldbe -- "If the right branch is also an SAME condition, append it to the"<br /><br /> + /*<br />+ * If the right branch is also an AND condition, append it to the<br /> + * pending list, to be processed later. This allows us to walk even<br /> + * bushy trees, not justleft-deep trees.<br /> + */<br /> + if (IsA(a->rexpr, A_Expr) &&((A_Expr*)a->rexpr)->kind == root_kind)<br /> + {<br /> + pending = lappend(pending, a->rexpr);<br /> + }<br /> + else<br /> + {<br /> + expr = transformExprRecurse(pstate, a->rexpr);<br /> + expr = coerce_to_boolean(pstate, expr, root_kind == AEXPR_AND ?<br /> "AND" : "OR");<br />+ exprs = lcons(expr, exprs);<br /> + }<br /><br /> I don't see anyother issues, so after fixing comments this patch is<br /> ready for commit<br /><br /> Regards<br /><span class="HOEnZb"><fontcolor="#888888"><br /> Pavel Stehule<br /></font></span></blockquote></div><br /><br clear="all" /><br/>-- <br /><div dir="ltr">Gurjeet Singh<br /><br /><a href="http://gurjeet.singh.im/" target="_blank">http://gurjeet.singh.im/</a><br/><br />EnterpriseDB Inc.<br /></div>
On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I made that change, hesitantly. The comments above definition of linitial() macro describe the confusion that API causes. I wanted to avoid that confusion for new code, so I used the newer API which makes the intention quite clear. But looking at that code closely, list_nth() causes at least 2 function calls, and that's pretty heavy compared to the linitiali() macro.
I moved that comment above the outer bock, so that the intention of the whole do-while code block is described in one place.
Thanks for the review Pavel.
Attached is the updated patch, v4. It has the above edits, and a few code improvements, like not repeating the (root_kind == AEPR_AND ? .. : ..) ternary expression.
Best regards,
--
related to
https://commitfest.postgresql.org/action/patch_view?id=1130
http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com
* motivation: remove recursive procession of AND/OR list (hangs with
10062 and more subexpressions)
* patch is short, clean and respect postgresql source code requirements
* patch was applied cleanly without warnings
* all regression tests was passed
* I successfully evaluated expression with 100000 subexpressions
* there is no significant slowdown
possible improvements
a = (A_Expr*) list_nth(pending, 0);
a = (A_Expr*) linitial(pending);
I made that change, hesitantly. The comments above definition of linitial() macro describe the confusion that API causes. I wanted to avoid that confusion for new code, so I used the newer API which makes the intention quite clear. But looking at that code closely, list_nth() causes at least 2 function calls, and that's pretty heavy compared to the linitiali() macro.
not well comment
should be -- "If the right branch is also an SAME condition, append it to the"
I moved that comment above the outer bock, so that the intention of the whole do-while code block is described in one place.
I don't see any other issues, so after fixing comments this patch is
ready for commit
Thanks for the review Pavel.
Attached is the updated patch, v4. It has the above edits, and a few code improvements, like not repeating the (root_kind == AEPR_AND ? .. : ..) ternary expression.
Best regards,
Вложения
Hello just one small notices I dislike a name "root_bool_expr", because, there is not a expression, but expression type. Can you use "root_bool_expr_type" instead? It is little bit longer, but more correct. Same not best name is "root_char", maybe "root_bool_op_name" or root_expr_type and root_op_name ??? Have no other comments Regards Pavel 2013/6/30 Gurjeet Singh <gurjeet@singh.im>: > On Tue, Jun 18, 2013 at 3:01 PM, Pavel Stehule <pavel.stehule@gmail.com> > wrote: >> >> >> related to >> >> https://commitfest.postgresql.org/action/patch_view?id=1130 >> >> http://www.postgresql.org/message-id/CABwTF4V9rsjiBWE+87pK83Mmm7ACdrG7sZ08RQ-4qYMe8jvhbw@mail.gmail.com >> >> >> * motivation: remove recursive procession of AND/OR list (hangs with >> 10062 and more subexpressions) >> >> * patch is short, clean and respect postgresql source code requirements >> * patch was applied cleanly without warnings >> * all regression tests was passed >> * I successfully evaluated expression with 100000 subexpressions >> * there is no significant slowdown >> >> possible improvements >> >> a = (A_Expr*) list_nth(pending, 0); >> >> a = (A_Expr*) linitial(pending); > > > I made that change, hesitantly. The comments above definition of linitial() > macro describe the confusion that API causes. I wanted to avoid that > confusion for new code, so I used the newer API which makes the intention > quite clear. But looking at that code closely, list_nth() causes at least 2 > function calls, and that's pretty heavy compared to the linitiali() macro. > >> >> >> not well comment >> >> should be -- "If the right branch is also an SAME condition, append it to >> the" > > > I moved that comment above the outer bock, so that the intention of the > whole do-while code block is described in one place. > >> I don't see any other issues, so after fixing comments this patch is >> ready for commit > > > Thanks for the review Pavel. > > Attached is the updated patch, v4. It has the above edits, and a few code > improvements, like not repeating the (root_kind == AEPR_AND ? .. : ..) > ternary expression. > > Best regards, > -- > Gurjeet Singh > > http://gurjeet.singh.im/ > > EnterpriseDB Inc.
On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
How about naming those 3 variables as follows:
root_expr_kind
root_expr_name
root_bool_expr_type
Hello
just one small notices
I dislike a name "root_bool_expr", because, there is not a expression,
but expression type. Can you use "root_bool_expr_type" instead? It is
little bit longer, but more correct. Same not best name is
"root_char", maybe "root_bool_op_name"
or root_expr_type and root_op_name ???
How about naming those 3 variables as follows:
root_expr_kind
root_expr_name
root_bool_expr_type
--
2013/6/30 Gurjeet Singh <gurjeet@singh.im>: > On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com> > wrote: >> >> Hello >> >> just one small notices >> >> I dislike a name "root_bool_expr", because, there is not a expression, >> but expression type. Can you use "root_bool_expr_type" instead? It is >> little bit longer, but more correct. Same not best name is >> "root_char", maybe "root_bool_op_name" >> >> or root_expr_type and root_op_name ??? > > > How about naming those 3 variables as follows: > > root_expr_kind > root_expr_name > root_bool_expr_type +1 Pavel > > > -- > Gurjeet Singh > > http://gurjeet.singh.im/ > > EnterpriseDB Inc.
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Thanks. Attached is the patch with that change. I'll update the commitfest entry with a link to this email.
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:> On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule <pavel.stehule@gmail.com>+1
> wrote:
>
> How about naming those 3 variables as follows:
>
> root_expr_kind
> root_expr_name
> root_bool_expr_type
Thanks. Attached is the patch with that change. I'll update the commitfest entry with a link to this email.
--
On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
2013/6/30 Gurjeet Singh <gurjeet@singh.im>:+1> How about naming those 3 variables as follows:
>
> root_expr_kind
> root_expr_name
> root_bool_expr_type
Thanks. Attached is the patch with that change. I'll update the commitfest entry with a link to this email.
--
Вложения
2013/6/30 Gurjeet Singh <gurjeet@singh.im>: > On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> > wrote: >> >> 2013/6/30 Gurjeet Singh <gurjeet@singh.im>: >> > On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule >> > <pavel.stehule@gmail.com> >> > wrote: >> > >> > How about naming those 3 variables as follows: >> > >> > root_expr_kind >> > root_expr_name >> > root_bool_expr_type >> >> +1 > > > Thanks. Attached is the patch with that change. I'll update the commitfest > entry with a link to this email. ok I chechecked it - patched without warnings, all tests passed It is ready for commit Regards Pavel > > -- > Gurjeet Singh > > http://gurjeet.singh.im/ > > EnterpriseDB Inc.
On Sun, Jun 30, 2013 at 1:08 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2013/6/30 Gurjeet Singh <gurjeet@singh.im>: >> On Sun, Jun 30, 2013 at 11:46 AM, Pavel Stehule <pavel.stehule@gmail.com> >> wrote: >>> >>> 2013/6/30 Gurjeet Singh <gurjeet@singh.im>: >>> > On Sun, Jun 30, 2013 at 11:13 AM, Pavel Stehule >>> > <pavel.stehule@gmail.com> >>> > wrote: >>> > >>> > How about naming those 3 variables as follows: >>> > >>> > root_expr_kind >>> > root_expr_name >>> > root_bool_expr_type >>> >>> +1 >> >> >> Thanks. Attached is the patch with that change. I'll update the commitfest >> entry with a link to this email. > > ok > > I chechecked it - patched without warnings, all tests passed > > It is ready for commit I think it's a waste of code to try to handle bushy trees. A list is not a particularly efficient representation of the pending list; this will probably be slower than recusing in the common case. I'd suggest keeping the logic to handle left-deep trees, which I find rather elegant, but ditching the pending list. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> I think it's a waste of code to try to handle bushy trees. A list is > not a particularly efficient representation of the pending list; this > will probably be slower than recusing in the common case. I'd suggest > keeping the logic to handle left-deep trees, which I find rather > elegant, but ditching the pending list. Is there going to be further discussion of this patch, or do I return it? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote: >> I think it's a waste of code to try to handle bushy trees. A list is >> not a particularly efficient representation of the pending list; this >> will probably be slower than recusing in the common case. I'd suggest >> keeping the logic to handle left-deep trees, which I find rather >> elegant, but ditching the pending list. > > Is there going to be further discussion of this patch, or do I return it? Considering it's not been updated, nor my comments responded to, in almost two weeks, I think we return it at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Somehow I find it hard to believe that recursing would be more efficient than processing the items right there. The recursion is not direct either; transformExprRecurse() is going to call this function again, but after a few more switch-case comparisons.
Agreed that there's overhead in allocating list items, but is it more overhead than pushing functions on the call stack? Not sure, so I leave it to others who understand such things better than I do.
If by common-case you mean a list of just one logical AND/OR operator, then I agree that creating and destroying a list may incur overhead that is relatively very expensive. To that end, I have altered the patch, attached, to not build a pending list until we encounter a node with root_expr_kind in a right branch.
We're getting bushy-tree processing with very little extra code, but if you deem it not worthwhile or adding complexity, please feel free to rip it out.
Sorry, I didn't notice that this patch was put back in 'Waiting on Author' state.
On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I think it's a waste of code to try to handle bushy trees. A list is
>> not a particularly efficient representation of the pending list; this
>> will probably be slower than recusing in the common case. I'd suggest
>> keeping the logic to handle left-deep trees, which I find rather
>> elegant, but ditching the pending list.
Somehow I find it hard to believe that recursing would be more efficient than processing the items right there. The recursion is not direct either; transformExprRecurse() is going to call this function again, but after a few more switch-case comparisons.
Agreed that there's overhead in allocating list items, but is it more overhead than pushing functions on the call stack? Not sure, so I leave it to others who understand such things better than I do.
If by common-case you mean a list of just one logical AND/OR operator, then I agree that creating and destroying a list may incur overhead that is relatively very expensive. To that end, I have altered the patch, attached, to not build a pending list until we encounter a node with root_expr_kind in a right branch.
We're getting bushy-tree processing with very little extra code, but if you deem it not worthwhile or adding complexity, please feel free to rip it out.
>Considering it's not been updated, nor my comments responded to, in
> Is there going to be further discussion of this patch, or do I return it?
almost two weeks, I think we return it at this point.
Sorry, I didn't notice that this patch was put back in 'Waiting on Author' state.
Best regards,
--
Вложения
Hello 2013/7/15 Gurjeet Singh <gurjeet@singh.im>: > On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus <josh@agliodbs.com> wrote: >> >> I think it's a waste of code to try to handle bushy trees. A list is >> >> not a particularly efficient representation of the pending list; this >> >> will probably be slower than recusing in the common case. I'd suggest >> >> keeping the logic to handle left-deep trees, which I find rather >> >> elegant, but ditching the pending list. > > > Somehow I find it hard to believe that recursing would be more efficient > than processing the items right there. The recursion is not direct either; > transformExprRecurse() is going to call this function again, but after a few > more switch-case comparisons. > > Agreed that there's overhead in allocating list items, but is it more > overhead than pushing functions on the call stack? Not sure, so I leave it > to others who understand such things better than I do. > > If by common-case you mean a list of just one logical AND/OR operator, then > I agree that creating and destroying a list may incur overhead that is > relatively very expensive. To that end, I have altered the patch, attached, > to not build a pending list until we encounter a node with root_expr_kind in > a right branch. > > We're getting bushy-tree processing with very little extra code, but if you > deem it not worthwhile or adding complexity, please feel free to rip it out. > >> >> > >> > Is there going to be further discussion of this patch, or do I return >> > it? >> >> Considering it's not been updated, nor my comments responded to, in >> almost two weeks, I think we return it at this point. > > > Sorry, I didn't notice that this patch was put back in 'Waiting on Author' > state. > I did a some performance tests of v5 and v6 version and there v5 is little bit faster than v6, and v6 has significantly higher stddev but I am not sure, if my test is correct - I tested a speed of EXPLAIN statement - result was forwarded to /dev/null Result of this test is probably related to tested pattern of expressions - in this case "expr or expr or expr or expr or expr ... " 10 000 exprs (ms) v | avg | stddev ---+---------+--------5 | 1839.14 | 13.686 | 1871.77 | 48.02 ==v5 profile== 209064 43.5354 postgres equal 207849 43.2824 postgres process_equivalence 37453 7.7992 postgres datumIsEqual 3178 0.6618 postgres SearchCatCache 2350 0.4894 postgres AllocSetAlloc ==v6 profile== 193251 45.3998 postgres process_equivalence 178183 41.8599 postgres equal 30430 7.1488 postgres datumIsEqual 2819 0.6623 postgres SearchCatCache 1951 0.4583 postgres AllocSetAlloc I found so 9.4 planner is about 1% slower (for test that sent by Gurjeet), that than 9.2 planner, but it is not related to this patch v6 is clean and all regression tests was passed Regards Pavel > Best regards, > > -- > Gurjeet Singh > > http://gurjeet.singh.im/ > > EnterpriseDB Inc.
On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Thanks Pavel.
The difference in average seems negligible, but stddev is interesting because v6 does less work than v5 in common cases and in the test that I had shared.
The current commitfest (2013-06) is marked as 'In Progress', so is it okay to just mark the patch as 'Ready for Committer' or should I move it to the next commitfest (2013-09).
What's the procedure of moving a patch to the next commitfest? Do I make a fresh submission there with a link to current submission, or is the move doable somehow in the application itself.
Best regards,
-- I did a some performance tests of v5 and v6 version and there v5 is
little bit faster than v6, and v6 has significantly higher stddev
Thanks Pavel.
The difference in average seems negligible, but stddev is interesting because v6 does less work than v5 in common cases and in the test that I had shared.
The current commitfest (2013-06) is marked as 'In Progress', so is it okay to just mark the patch as 'Ready for Committer' or should I move it to the next commitfest (2013-09).
What's the procedure of moving a patch to the next commitfest? Do I make a fresh submission there with a link to current submission, or is the move doable somehow in the application itself.
Best regards,
On Wed, Jul 17, 2013 at 8:21 AM, Gurjeet Singh <gurjeet@singh.im> wrote:
Never mind, I see an email from Josh B. regarding this on my corporate account.
What's the procedure of moving a patch to the next commitfest?
Never mind, I see an email from Josh B. regarding this on my corporate account.
Best regards,
--
On 07/17/2013 05:21 AM, Gurjeet Singh wrote: > On Tue, Jul 16, 2013 at 4:04 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote: > >> I did a some performance tests of v5 and v6 version and there v5 is >> little bit faster than v6, and v6 has significantly higher stddev >> > > Thanks Pavel. > > The difference in average seems negligible, but stddev is interesting > because v6 does less work than v5 in common cases and in the test that I > had shared. > > The current commitfest (2013-06) is marked as 'In Progress', so is it okay > to just mark the patch as 'Ready for Committer' or should I move it to the > next commitfest (2013-09). If this is actually "ready for committer", I'll mark it as such. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote: > Agreed that there's overhead in allocating list items, but is it more > overhead than pushing functions on the call stack? Not sure, so I leave it > to others who understand such things better than I do. If you think that a palloc can ever be cheaper that pushing a frame on the callstack, you're wrong. palloc is not some kind of an atomic primitive. It's implemented by the AllocSetAlloc function, and you're going to have to push that function on the call stack, too, in order to run it. My main point here is that if the user writes a = 1 and b = 1 and c = 1 and d = 1, they're not going to end up with a bushy tree. They're going to end up with a tree that's only deep in one direction (left, I guess) and that's the case we might want to consider optimizing. To obtain a bushy tree, they're going to have to write a = 1 and (b = 1 and c = 1) and d = 1, or something like that, and I don't see why we should stress out about that case. It will be rare in practice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jul 17, 2013 at 1:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Agreed. I take my objection back. Even if AllocSetAlloc() reuses memory that was pfree'd earlier, it'll still be at least as expensive as recursing.
In v6 of the patch, I have deferred the 'pending' list initialization to until we actually hit a candidate right-branch. So in the common case the pending list will never be populated, and if we find a bushy or right-deep tree (for some reason an ORM/tool may choose to build AND/OR lists that may end being right-deep when in Postgres), then the pending list will be used to process them iteratively.
Does that alleviate your concern about 'pending' list management causing an overhead.
Agreed that bushy/right-deep trees are a remote corner case, but we are addressing a remote corner case in the first place (insanely long AND lists) and why not handle another remote corner case right now if it doesn't cause an overhead for common case.
On Mon, Jul 15, 2013 at 12:45 AM, Gurjeet Singh <gurjeet@singh.im> wrote:If you think that a palloc can ever be cheaper that pushing a frame on
> Agreed that there's overhead in allocating list items, but is it more
> overhead than pushing functions on the call stack? Not sure, so I leave it
> to others who understand such things better than I do.
the callstack, you're wrong. palloc is not some kind of an atomic
primitive. It's implemented by the AllocSetAlloc function, and you're
going to have to push that function on the call stack, too, in order
to run it.
Agreed. I take my objection back. Even if AllocSetAlloc() reuses memory that was pfree'd earlier, it'll still be at least as expensive as recursing.
My main point here is that if the user writes a = 1 and b = 1 and c =
1 and d = 1, they're not going to end up with a bushy tree. They're
going to end up with a tree that's only deep in one direction (left, I
guess) and that's the case we might want to consider optimizing. To
obtain a bushy tree, they're going to have to write a = 1 and (b = 1
and c = 1) and d = 1, or something like that, and I don't see why we
should stress out about that case. It will be rare in practice.
In v6 of the patch, I have deferred the 'pending' list initialization to until we actually hit a candidate right-branch. So in the common case the pending list will never be populated, and if we find a bushy or right-deep tree (for some reason an ORM/tool may choose to build AND/OR lists that may end being right-deep when in Postgres), then the pending list will be used to process them iteratively.
Does that alleviate your concern about 'pending' list management causing an overhead.
Agreed that bushy/right-deep trees are a remote corner case, but we are addressing a remote corner case in the first place (insanely long AND lists) and why not handle another remote corner case right now if it doesn't cause an overhead for common case.
Best regards,
--
On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote: > In v6 of the patch, I have deferred the 'pending' list initialization to > until we actually hit a candidate right-branch. So in the common case the > pending list will never be populated, and if we find a bushy or right-deep > tree (for some reason an ORM/tool may choose to build AND/OR lists that may > end being right-deep when in Postgres), then the pending list will be used > to process them iteratively. > > Does that alleviate your concern about 'pending' list management causing an > overhead. > > Agreed that bushy/right-deep trees are a remote corner case, but we are > addressing a remote corner case in the first place (insanely long AND lists) > and why not handle another remote corner case right now if it doesn't cause > an overhead for common case. Because simpler code is less likely to have bugs and is easier to maintain. It's worth noting that the change you're proposing is in fact user-visible, as demonstrated by the fact that you had to update the regression test output: - | WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm)); + | WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm)); Now, I think that change is actually an improvement, because here's what that WHERE clause looked like when it was entered: WHERE rsl.sl_color = rsh.slcolor AND rsl.sl_len_cm >= rsh.slminlen_cm AND rsl.sl_len_cm <= rsh.slmaxlen_cm; But flattening a = 1 AND (b = 1 AND c = 1 AND d = 1) AND e = 1 to a flat list doesn't have any of the same advantages. At the end of the day, this is a judgement call, and I'm giving you mine. If somebody else wants to weigh in, that's fine. I can't think of anything that would actually be outright broken under your proposed approach, but my personal feeling is that it's better to only add the amount of code that we know is needed to solve the problem actually observed in practice, and no more. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet@singh.im> wrote: >> Agreed that bushy/right-deep trees are a remote corner case, but we are >> addressing a remote corner case in the first place (insanely long AND lists) >> and why not handle another remote corner case right now if it doesn't cause >> an overhead for common case. > Because simpler code is less likely to have bugs and is easier to > maintain. I agree with that point, but one should also remember Polya's Inventor's Paradox: the more general problem may be easier to solve. That is, if done right, code that fully flattens an AND tree might actually be simpler than code that does just a subset of the transformation. The current patch fails to meet this expectation, but maybe you just haven't thought about it the right way. My concerns about this patch have little to do with that, though, and much more to do with the likelihood that it breaks some other piece of code that is expecting AND/OR to be strictly binary operators, which is what they've always been in parsetrees that haven't reached the planner. It doesn't appear to me that you've done any research on that point whatsoever --- you have not even updated the comment for BoolExpr (in primnodes.h) that this patch falsifies. > It's worth noting that the change you're proposing is in > fact user-visible, as demonstrated by the fact that you had to update > the regression test output: The point to worry about here is whether rule dump and reload is still safe. In particular, the logic in ruleutils.c for deciding whether it's safe to omit parentheses has only really been thought about/tested for the binary AND/OR case. Although that code can dump N-way AND/OR because it's also used to print post-planner expression trees in EXPLAIN, that case has never been held to the standard of "is the parser guaranteed to interpret this expression the same as before?". Perhaps it's fine, but has anyone looked at that issue? regards, tom lane
On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The current patch does completely flatten any type of tree (left/right-deep or bushy) without recursing, and right-deep and bushy tree processing is what Robert is recommending to defer to recursive processing. Maybe I haven't considered a case where it doesn't flatten the tree; do you have an example in mind.
No, I haven't, and I might not be able to research it for a few more weeks.
I will fix that.
Best regards,
-- > Because simpler code is less likely to have bugs and is easier toI agree with that point, but one should also remember Polya's Inventor's
> maintain.
Paradox: the more general problem may be easier to solve. That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation. The
current patch fails to meet this expectation,
The current patch does completely flatten any type of tree (left/right-deep or bushy) without recursing, and right-deep and bushy tree processing is what Robert is recommending to defer to recursive processing. Maybe I haven't considered a case where it doesn't flatten the tree; do you have an example in mind.
but maybe you just haven't
thought about it the right way.
My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner. It doesn't appear to me that you've done any research on that
point whatsoever
No, I haven't, and I might not be able to research it for a few more weeks.
you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.
I will fix that.
Best regards,
On Thu, Jul 18, 2013 at 1:54 PM, Gurjeet Singh <gurjeet@singh.im> wrote:
On Thu, Jul 18, 2013 at 10:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:> Because simpler code is less likely to have bugs and is easier toI agree with that point, but one should also remember Polya's Inventor's
> maintain.
Paradox: the more general problem may be easier to solve. That is, if
done right, code that fully flattens an AND tree might actually be
simpler than code that does just a subset of the transformation. The
current patch fails to meet this expectation,
The current patch does completely flatten any type of tree (left/right-deep or bushy) without recursing, and right-deep and bushy tree processing is what Robert is recommending to defer to recursive processing. Maybe I haven't considered a case where it doesn't flatten the tree; do you have an example in mind.
but maybe you just haven't
thought about it the right way.
I tried to eliminate the 'pending' list, but I don't see a way around it. We need temporary storage somewhere to store the branches encountered on the right; in recursion case the call stack was serving that purpose.
My concerns about this patch have little to do with that, though, and
much more to do with the likelihood that it breaks some other piece of
code that is expecting AND/OR to be strictly binary operators, which
is what they've always been in parsetrees that haven't reached the
planner. It doesn't appear to me that you've done any research on that
point whatsoever
No, I haven't, and I might not be able to research it for a few more weeks.
There are about 30 files (including optimizer and executor) that match case-insensitive search for BoolExpr, and I scanned those for the usage of the member 'args'. All the instances where BoolExpr.args is being accessed, it's being treated as a null-terminated list. There's one exception that I could find, and it was in context of NOT expression: not_clause() in clauses.c.
you have not even updated the comment for BoolExpr
(in primnodes.h) that this patch falsifies.
I will fix that.
I think this line in that comment already covers the fact that in some "special" cases a BoolExpr may have more than 2 arguments.
"There are also a few special cases where more arguments can appear before optimization."
"There are also a few special cases where more arguments can appear before optimization."
I have updated the comment nevertheless, and removed another comment in parse_expr.c that claimed to be the only place where a BoolExpr with more than 2 args is generated.
I have isolated the code for right-deep and bushy tree processing via the macro PROCESS_BUSHY_TREES. Also, I have shortened some variable names while retaining their meaning.
Please find the updated patch attached (based on master).
Best regards,
--
Вложения
Gurjeet Singh <gurjeet@singh.im> writes: > I tried to eliminate the 'pending' list, but I don't see a way around it. > We need temporary storage somewhere to store the branches encountered on > the right; in recursion case the call stack was serving that purpose. I still think we should fix this in the grammar, rather than introducing complicated logic to try to get rid of the recursion later. For example, as attached. The existing A_Expr representation of raw AND/OR nodes isn't conducive to this, but it's not that hard to change it. The attached patch chooses to use BoolExpr as both the raw and transformed representation of AND/OR/NOT; we could alternatively invent some new raw-parsetree node type, but I don't see any advantage in that. I continue to think that more thought is needed about downstream processing. For instance, at least the comment at the head of prepqual.c is wrong now, and it's worth wondering whether the planner still needs to worry about AND/OR flattening at all. (It probably does, to deal with view-flattening cases for example; but it's worth considering whether anything could be saved if we stopped doing that.) regards, tom lane diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 1e48a7f..95f5dd2 100644 *** a/src/backend/nodes/nodeFuncs.c --- b/src/backend/nodes/nodeFuncs.c *************** raw_expression_tree_walker(Node *node, *** 3047,3052 **** --- 3047,3060 ---- /* operator name is deemed uninteresting */ } break; + case T_BoolExpr: + { + BoolExpr *expr = (BoolExpr *) node; + + if (walker(expr->args, context)) + return true; + } + break; case T_ColumnRef: /* we assume the fields contain nothing interesting */ break; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 10e8139..cd4bce1 100644 *** a/src/backend/nodes/outfuncs.c --- b/src/backend/nodes/outfuncs.c *************** _outAExpr(StringInfo str, const A_Expr * *** 2437,2451 **** appendStringInfoChar(str, ' '); WRITE_NODE_FIELD(name); break; - case AEXPR_AND: - appendStringInfoString(str, " AND"); - break; - case AEXPR_OR: - appendStringInfoString(str, " OR"); - break; - case AEXPR_NOT: - appendStringInfoString(str, " NOT"); - break; case AEXPR_OP_ANY: appendStringInfoChar(str, ' '); WRITE_NODE_FIELD(name); --- 2437,2442 ---- diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 7b9895d..dd04b1a 100644 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *************** static void insertSelectOptions(SelectSt *** 151,156 **** --- 151,159 ---- static Node *makeSetOp(SetOperation op, bool all, Node *larg, Node *rarg); static Node *doNegate(Node *n, int location); static void doNegateFloat(Value *v); + static Node *makeAndExpr(Node *lexpr, Node *rexpr, int location); + static Node *makeOrExpr(Node *lexpr, Node *rexpr, int location); + static Node *makeNotExpr(Node *expr, int location); static Node *makeAArrayExpr(List *elements, int location); static Node *makeXmlExpr(XmlExprOp op, char *name, List *named_args, List *args, int location); *************** a_expr: c_expr { $$ = $1; } *** 10849,10859 **** { $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); } | a_expr AND a_expr ! { $$ = (Node *) makeA_Expr(AEXPR_AND, NIL, $1, $3, @2); } | a_expr OR a_expr ! { $$ = (Node *) makeA_Expr(AEXPR_OR, NIL, $1, $3, @2); } | NOT a_expr ! { $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, $2, @1); } | a_expr LIKE a_expr { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); } --- 10852,10862 ---- { $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); } | a_expr AND a_expr ! { $$ = makeAndExpr($1, $3, @2); } | a_expr OR a_expr ! { $$ = makeOrExpr($1, $3, @2); } | NOT a_expr ! { $$ = makeNotExpr($2, @1); } | a_expr LIKE a_expr { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~~", $1, $3, @2); } *************** a_expr: c_expr { $$ = $1; } *** 11022,11032 **** } | a_expr IS NOT DISTINCT FROM a_expr %prec IS { ! $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, ! (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, ! "=", $1, $6, @2), ! @2); ! } | a_expr IS OF '(' type_list ')' %prec IS { --- 11025,11033 ---- } | a_expr IS NOT DISTINCT FROM a_expr %prec IS { ! $$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT, ! "=", $1, $6, @2), ! @2); } | a_expr IS OF '(' type_list ')' %prec IS { *************** a_expr: c_expr { $$ = $1; } *** 11044,11086 **** */ | a_expr BETWEEN opt_asymmetric b_expr AND b_expr %prec BETWEEN { ! $$ = (Node *) makeA_Expr(AEXPR_AND, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2), ! @2); } | a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr %prec BETWEEN { ! $$ = (Node *) makeA_Expr(AEXPR_OR, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2), ! @2); } | a_expr BETWEEN SYMMETRIC b_expr AND b_expr %prec BETWEEN { ! $$ = (Node *) makeA_Expr(AEXPR_OR, NIL, ! (Node *) makeA_Expr(AEXPR_AND, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2), ! @2), ! (Node *) makeA_Expr(AEXPR_AND, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2), ! @2), ! @2); } | a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr %prec BETWEEN { ! $$ = (Node *) makeA_Expr(AEXPR_AND, NIL, ! (Node *) makeA_Expr(AEXPR_OR, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2), ! @2), ! (Node *) makeA_Expr(AEXPR_OR, NIL, (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2), ! @2), ! @2); } | a_expr IN_P in_expr { --- 11045,11087 ---- */ | a_expr BETWEEN opt_asymmetric b_expr AND b_expr %prec BETWEEN { ! $$ = makeAndExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2), ! @2); } | a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr %prec BETWEEN { ! $$ = makeOrExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2), ! @2); } | a_expr BETWEEN SYMMETRIC b_expr AND b_expr %prec BETWEEN { ! $$ = makeOrExpr( ! makeAndExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $4, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $6, @2), ! @2), ! makeAndExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, ">=", $1, $6, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, "<=", $1, $4, @2), ! @2), ! @2); } | a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr %prec BETWEEN { ! $$ = makeAndExpr( ! makeOrExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $5, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $7, @2), ! @2), ! makeOrExpr( (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $7, @2), (Node *) makeSimpleA_Expr(AEXPR_OP, ">", $1, $5, @2), ! @2), ! @2); } | a_expr IN_P in_expr { *************** a_expr: c_expr { $$ = $1; } *** 11114,11120 **** n->operName = list_make1(makeString("=")); n->location = @3; /* Stick a NOT on top */ ! $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, (Node *) n, @2); } else { --- 11115,11121 ---- n->operName = list_make1(makeString("=")); n->location = @3; /* Stick a NOT on top */ ! $$ = makeNotExpr((Node *) n, @2); } else { *************** a_expr: c_expr { $$ = $1; } *** 11162,11171 **** } | a_expr IS NOT DOCUMENT_P %prec IS { ! $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, ! makeXmlExpr(IS_DOCUMENT, NULL, NIL, ! list_make1($1), @2), ! @2); } ; --- 11163,11171 ---- } | a_expr IS NOT DOCUMENT_P %prec IS { ! $$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL, ! list_make1($1), @2), ! @2); } ; *************** b_expr: c_expr *** 11216,11223 **** } | b_expr IS NOT DISTINCT FROM b_expr %prec IS { ! $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, ! NULL, (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $6, @2), @2); } | b_expr IS OF '(' type_list ')' %prec IS { --- 11216,11224 ---- } | b_expr IS NOT DISTINCT FROM b_expr %prec IS { ! $$ = makeNotExpr((Node *) makeSimpleA_Expr(AEXPR_DISTINCT, ! "=", $1, $6, @2), ! @2); } | b_expr IS OF '(' type_list ')' %prec IS { *************** b_expr: c_expr *** 11234,11243 **** } | b_expr IS NOT DOCUMENT_P %prec IS { ! $$ = (Node *) makeA_Expr(AEXPR_NOT, NIL, NULL, ! makeXmlExpr(IS_DOCUMENT, NULL, NIL, ! list_make1($1), @2), ! @2); } ; --- 11235,11243 ---- } | b_expr IS NOT DOCUMENT_P %prec IS { ! $$ = makeNotExpr(makeXmlExpr(IS_DOCUMENT, NULL, NIL, ! list_make1($1), @2), ! @2); } ; *************** doNegateFloat(Value *v) *** 13693,13698 **** --- 13693,13738 ---- } static Node * + makeAndExpr(Node *lexpr, Node *rexpr, int location) + { + /* Flatten "a AND b AND c ..." to a single BoolExpr on sight */ + if (IsA(lexpr, BoolExpr)) + { + BoolExpr *blexpr = (BoolExpr *) lexpr; + + if (blexpr->boolop == AND_EXPR) + { + blexpr->args = lappend(blexpr->args, rexpr); + return (Node *) blexpr; + } + } + return (Node *) makeBoolExpr(AND_EXPR, list_make2(lexpr, rexpr), location); + } + + static Node * + makeOrExpr(Node *lexpr, Node *rexpr, int location) + { + /* Flatten "a OR b OR c ..." to a single BoolExpr on sight */ + if (IsA(lexpr, BoolExpr)) + { + BoolExpr *blexpr = (BoolExpr *) lexpr; + + if (blexpr->boolop == OR_EXPR) + { + blexpr->args = lappend(blexpr->args, rexpr); + return (Node *) blexpr; + } + } + return (Node *) makeBoolExpr(OR_EXPR, list_make2(lexpr, rexpr), location); + } + + static Node * + makeNotExpr(Node *expr, int location) + { + return (Node *) makeBoolExpr(NOT_EXPR, list_make1(expr), location); + } + + static Node * makeAArrayExpr(List *elements, int location) { A_ArrayExpr *n = makeNode(A_ArrayExpr); diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index aa704bb..a71f611 100644 *** a/src/backend/parser/parse_clause.c --- b/src/backend/parser/parse_clause.c *************** transformJoinUsingClause(ParseState *pst *** 332,338 **** RangeTblEntry *leftRTE, RangeTblEntry *rightRTE, List *leftVars, List *rightVars) { ! Node *result = NULL; ListCell *lvars, *rvars; --- 332,339 ---- RangeTblEntry *leftRTE, RangeTblEntry *rightRTE, List *leftVars, List *rightVars) { ! Node *result; ! List *andargs = NIL; ListCell *lvars, *rvars; *************** transformJoinUsingClause(ParseState *pst *** 358,375 **** copyObject(lvar), copyObject(rvar), -1); ! /* And combine into an AND clause, if multiple join columns */ ! if (result == NULL) ! result = (Node *) e; ! else ! { ! A_Expr *a; ! ! a = makeA_Expr(AEXPR_AND, NIL, result, (Node *) e, -1); ! result = (Node *) a; ! } } /* * Since the references are already Vars, and are certainly from the input * relations, we don't have to go through the same pushups that --- 359,374 ---- copyObject(lvar), copyObject(rvar), -1); ! /* Prepare to combine into an AND clause, if multiple join columns */ ! andargs = lappend(andargs, e); } + /* Only need an AND if there's more than one join column */ + if (list_length(andargs) == 1) + result = (Node *) linitial(andargs); + else + result = (Node *) makeBoolExpr(AND_EXPR, andargs, -1); + /* * Since the references are already Vars, and are certainly from the input * relations, we don't have to go through the same pushups that diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 81c9338..cb76133 100644 *** a/src/backend/parser/parse_expr.c --- b/src/backend/parser/parse_expr.c *************** bool Transform_null_equals = false; *** 41,55 **** static Node *transformExprRecurse(ParseState *pstate, Node *expr); static Node *transformParamRef(ParseState *pstate, ParamRef *pref); static Node *transformAExprOp(ParseState *pstate, A_Expr *a); - static Node *transformAExprAnd(ParseState *pstate, A_Expr *a); - static Node *transformAExprOr(ParseState *pstate, A_Expr *a); - static Node *transformAExprNot(ParseState *pstate, A_Expr *a); static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a); static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a); static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a); static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a); static Node *transformAExprOf(ParseState *pstate, A_Expr *a); static Node *transformAExprIn(ParseState *pstate, A_Expr *a); static Node *transformFuncCall(ParseState *pstate, FuncCall *fn); static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c); static Node *transformSubLink(ParseState *pstate, SubLink *sublink); --- 41,53 ---- static Node *transformExprRecurse(ParseState *pstate, Node *expr); static Node *transformParamRef(ParseState *pstate, ParamRef *pref); static Node *transformAExprOp(ParseState *pstate, A_Expr *a); static Node *transformAExprOpAny(ParseState *pstate, A_Expr *a); static Node *transformAExprOpAll(ParseState *pstate, A_Expr *a); static Node *transformAExprDistinct(ParseState *pstate, A_Expr *a); static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a); static Node *transformAExprOf(ParseState *pstate, A_Expr *a); static Node *transformAExprIn(ParseState *pstate, A_Expr *a); + static Node *transformBoolExpr(ParseState *pstate, BoolExpr *a); static Node *transformFuncCall(ParseState *pstate, FuncCall *fn); static Node *transformCaseExpr(ParseState *pstate, CaseExpr *c); static Node *transformSubLink(ParseState *pstate, SubLink *sublink); *************** transformExprRecurse(ParseState *pstate, *** 223,237 **** case AEXPR_OP: result = transformAExprOp(pstate, a); break; - case AEXPR_AND: - result = transformAExprAnd(pstate, a); - break; - case AEXPR_OR: - result = transformAExprOr(pstate, a); - break; - case AEXPR_NOT: - result = transformAExprNot(pstate, a); - break; case AEXPR_OP_ANY: result = transformAExprOpAny(pstate, a); break; --- 221,226 ---- *************** transformExprRecurse(ParseState *pstate, *** 258,263 **** --- 247,256 ---- break; } + case T_BoolExpr: + result = transformBoolExpr(pstate, (BoolExpr *) expr); + break; + case T_FuncCall: result = transformFuncCall(pstate, (FuncCall *) expr); break; *************** transformExprRecurse(ParseState *pstate, *** 337,343 **** case T_DistinctExpr: case T_NullIfExpr: case T_ScalarArrayOpExpr: - case T_BoolExpr: case T_FieldSelect: case T_FieldStore: case T_RelabelType: --- 330,335 ---- *************** transformAExprOp(ParseState *pstate, A_E *** 919,964 **** } static Node * - transformAExprAnd(ParseState *pstate, A_Expr *a) - { - Node *lexpr = transformExprRecurse(pstate, a->lexpr); - Node *rexpr = transformExprRecurse(pstate, a->rexpr); - - lexpr = coerce_to_boolean(pstate, lexpr, "AND"); - rexpr = coerce_to_boolean(pstate, rexpr, "AND"); - - return (Node *) makeBoolExpr(AND_EXPR, - list_make2(lexpr, rexpr), - a->location); - } - - static Node * - transformAExprOr(ParseState *pstate, A_Expr *a) - { - Node *lexpr = transformExprRecurse(pstate, a->lexpr); - Node *rexpr = transformExprRecurse(pstate, a->rexpr); - - lexpr = coerce_to_boolean(pstate, lexpr, "OR"); - rexpr = coerce_to_boolean(pstate, rexpr, "OR"); - - return (Node *) makeBoolExpr(OR_EXPR, - list_make2(lexpr, rexpr), - a->location); - } - - static Node * - transformAExprNot(ParseState *pstate, A_Expr *a) - { - Node *rexpr = transformExprRecurse(pstate, a->rexpr); - - rexpr = coerce_to_boolean(pstate, rexpr, "NOT"); - - return (Node *) makeBoolExpr(NOT_EXPR, - list_make1(rexpr), - a->location); - } - - static Node * transformAExprOpAny(ParseState *pstate, A_Expr *a) { Node *lexpr = transformExprRecurse(pstate, a->lexpr); --- 911,916 ---- *************** transformAExprIn(ParseState *pstate, A_E *** 1238,1243 **** --- 1190,1231 ---- } static Node * + transformBoolExpr(ParseState *pstate, BoolExpr *a) + { + List *args = NIL; + const char *opname; + ListCell *lc; + + switch (a->boolop) + { + case AND_EXPR: + opname = "AND"; + break; + case OR_EXPR: + opname = "OR"; + break; + case NOT_EXPR: + opname = "NOT"; + break; + default: + elog(ERROR, "unrecognized boolop: %d", (int) a->boolop); + opname = NULL; /* keep compiler quiet */ + break; + } + + foreach(lc, a->args) + { + Node *arg = (Node *) lfirst(lc); + + arg = transformExprRecurse(pstate, arg); + arg = coerce_to_boolean(pstate, arg, opname); + args = lappend(args, arg); + } + + return (Node *) makeBoolExpr(a->boolop, args, a->location); + } + + static Node * transformFuncCall(ParseState *pstate, FuncCall *fn) { List *targs; *************** make_row_comparison_op(ParseState *pstat *** 2428,2437 **** /* * For = and <> cases, we just combine the pairwise operators with AND or * OR respectively. - * - * Note: this is presently the only place where the parser generates - * BoolExpr with more than two arguments. Should be OK since the rest of - * the system thinks BoolExpr is N-argument anyway. */ if (rctype == ROWCOMPARE_EQ) return (Node *) makeBoolExpr(AND_EXPR, opexprs, location); --- 2416,2421 ---- diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 18d4991..59916eb 100644 *** a/src/include/nodes/parsenodes.h --- b/src/include/nodes/parsenodes.h *************** typedef struct ParamRef *** 225,233 **** typedef enum A_Expr_Kind { AEXPR_OP, /* normal operator */ - AEXPR_AND, /* booleans - name field is unused */ - AEXPR_OR, - AEXPR_NOT, AEXPR_OP_ANY, /* scalar op ANY (array) */ AEXPR_OP_ALL, /* scalar op ALL (array) */ AEXPR_DISTINCT, /* IS DISTINCT FROM - name must be "=" */ --- 225,230 ---- diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index 9cce60b..ed73109 100644 *** a/src/include/nodes/primnodes.h --- b/src/include/nodes/primnodes.h *************** typedef struct ScalarArrayOpExpr *** 458,469 **** * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT * * Notice the arguments are given as a List. For NOT, of course the list ! * must always have exactly one element. For AND and OR, the executor can ! * handle any number of arguments. The parser generally treats AND and OR ! * as binary and so it typically only produces two-element lists, but the ! * optimizer will flatten trees of AND and OR nodes to produce longer lists ! * when possible. There are also a few special cases where more arguments ! * can appear before optimization. */ typedef enum BoolExprType { --- 458,465 ---- * BoolExpr - expression node for the basic Boolean operators AND, OR, NOT * * Notice the arguments are given as a List. For NOT, of course the list ! * must always have exactly one element. For AND and OR, there can be two ! * or more arguments. */ typedef enum BoolExprType { diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 6c51d0d..c3188a7 100644 *** a/src/test/regress/expected/rules.out --- b/src/test/regress/expected/rules.out *************** shoe_ready| SELECT rsh.shoename, *** 2117,2123 **** int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail FROM shoe rsh, shoelace rsl ! WHERE (((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm)); shoelace| SELECT s.sl_name, s.sl_avail, s.sl_color, --- 2117,2123 ---- int4smaller(rsh.sh_avail, rsl.sl_avail) AS total_avail FROM shoe rsh, shoelace rsl ! WHERE ((rsl.sl_color = rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm <= rsh.slmaxlen_cm)); shoelace| SELECT s.sl_name, s.sl_avail, s.sl_color,
I wrote: > Gurjeet Singh <gurjeet@singh.im> writes: >> I tried to eliminate the 'pending' list, but I don't see a way around it. >> We need temporary storage somewhere to store the branches encountered on >> the right; in recursion case the call stack was serving that purpose. > I still think we should fix this in the grammar, rather than introducing > complicated logic to try to get rid of the recursion later. For example, > as attached. I went looking for (and found) some additional obsoleted comments, and convinced myself that ruleutils.c is okay as-is, and pushed this. regards, tom lane
Thanks! On Mon, Jun 16, 2014 at 3:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Gurjeet Singh <gurjeet@singh.im> writes: >>> I tried to eliminate the 'pending' list, but I don't see a way around it. >>> We need temporary storage somewhere to store the branches encountered on >>> the right; in recursion case the call stack was serving that purpose. > >> I still think we should fix this in the grammar, rather than introducing >> complicated logic to try to get rid of the recursion later. For example, >> as attached. > > I went looking for (and found) some additional obsoleted comments, and > convinced myself that ruleutils.c is okay as-is, and pushed this. > > regards, tom lane -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com