Обсуждение: GSets: Fix bug involving GROUPING and HAVING together
Hi,
I have observed some fishy behavior related to GROUPING in HAVING clause
and when we have only one element in GROUPING SETS.
Basically, when we have only one element in GROUING SETS, we are assuming
it as a simple GROUP BY with one column. Due to which we are ending up with
this error.
If we have ROLLUP/CUBE or GROUPING SETS with multiple elements, then we are
not getting this error.
Here are some examples:
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten) having grouping(ten) >= 0
postgres-# order by 2,1;
ERROR: parent of GROUPING is not Agg node
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten, four) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
(0 rows)fix_bug_related_to_grouping_v1.patch
postgres=# select ten, grouping(ten) from onek
postgres-# group by rollup(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)
postgres=# select ten, grouping(ten) from onek
postgres-# group by cube(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)
I had a look over relevant code and found that contain_agg_clause_walker()
is not considering GroupingFunc as an aggregate node, due to which it is
failing to consider it in a having clause in subquery_planner().
Fix this by adding GroupingFunc node in this walker. We do it correctly in
contain_aggs_of_level_walker() in which we have handling for GroupingFunc
there.
Attached patch to fix this.
The side effect is that, if we have plain group by clause, then too we can
use GROUPING in HAVING clause. But I guess it is fine.
Let me know if I missed anything to consider here.
Thanks
--
I have observed some fishy behavior related to GROUPING in HAVING clause
and when we have only one element in GROUPING SETS.
Basically, when we have only one element in GROUING SETS, we are assuming
it as a simple GROUP BY with one column. Due to which we are ending up with
this error.
If we have ROLLUP/CUBE or GROUPING SETS with multiple elements, then we are
not getting this error.
Here are some examples:
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten) having grouping(ten) >= 0
postgres-# order by 2,1;
ERROR: parent of GROUPING is not Agg node
postgres=# select ten, grouping(ten) from onek
postgres-# group by grouping sets(ten, four) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
(0 rows)fix_bug_related_to_grouping_v1.patch
postgres=# select ten, grouping(ten) from onek
postgres-# group by rollup(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)
postgres=# select ten, grouping(ten) from onek
postgres-# group by cube(ten) having grouping(ten) > 0
postgres-# order by 2,1;
ten | grouping
-----+----------
| 1
(1 row)
I had a look over relevant code and found that contain_agg_clause_walker()
is not considering GroupingFunc as an aggregate node, due to which it is
failing to consider it in a having clause in subquery_planner().
Fix this by adding GroupingFunc node in this walker. We do it correctly in
contain_aggs_of_level_walker() in which we have handling for GroupingFunc
there.
Attached patch to fix this.
The side effect is that, if we have plain group by clause, then too we can
use GROUPING in HAVING clause. But I guess it is fine.
Let me know if I missed anything to consider here.
Thanks
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Вложения
>>>>> "Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes: Jeevan> Basically, when we have only one element in GROUING SETS, weJeevan> are assuming it as a simple GROUP BY with onecolumn. Due toJeevan> which we are ending up with this error. Jeevan> If we have ROLLUP/CUBE or GROUPING SETS with multiple elements,Jeevan> then we are not getting this error. There's two issues here. One you correctly identified, which is that contain_agg_clause() should be true for GroupingFuncs too. The other is that in subquery_planner, the optimization of converting HAVING clauses to WHERE clauses is suppressed if parse->groupingSets isn't empty. (It is empty if there's either no group by clause at all, or if there's exactly one grouping set, which must not be empty, however derived.) This is costing us some optimizations, especially in the case of an explicit GROUP BY () clause; I'll look into this. In the meantime your patch looks OK (and necessary) to me. Jeevan> The side effect is that, if we have plain group by clause, thenJeevan> too we can use GROUPING in HAVING clause.But I guess it isJeevan> fine. GROUPING is, per spec, explicitly allowed anywhere you could have an aggregate call, whether the group by clause is plain or not. -- Andrew (irc:RhodiumToad)
On Tue, Jul 14, 2015 at 4:23 PM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
OK. Thanks
>>>>> "Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:
Jeevan> Basically, when we have only one element in GROUING SETS, we
Jeevan> are assuming it as a simple GROUP BY with one column. Due to
Jeevan> which we are ending up with this error.
Jeevan> If we have ROLLUP/CUBE or GROUPING SETS with multiple elements,
Jeevan> then we are not getting this error.
There's two issues here. One you correctly identified, which is that
contain_agg_clause() should be true for GroupingFuncs too.
The other is that in subquery_planner, the optimization of converting
HAVING clauses to WHERE clauses is suppressed if parse->groupingSets
isn't empty. (It is empty if there's either no group by clause at all,
or if there's exactly one grouping set, which must not be empty, however
derived.) This is costing us some optimizations, especially in the case
of an explicit GROUP BY () clause;
I have observed that. But I thought that it is indeed intentional.
As we don't want to choose code path for GSets when we have
only one column which is converted to plain group by and
thus setting parse->groupingSets to FALSE.I'll look into this.
In the meantime your patch looks OK (and necessary) to me.
Jeevan> The side effect is that, if we have plain group by clause, then
Jeevan> too we can use GROUPING in HAVING clause. But I guess it is
Jeevan> fine.
GROUPING is, per spec, explicitly allowed anywhere you could have an
aggregate call, whether the group by clause is plain or not.
--
Andrew (irc:RhodiumToad)
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> The other is that in subquery_planner, the optimization of Andrew> converting HAVING clauses to WHERE clauses is suppressed if Andrew> parse->groupingSets isn't empty. (It is empty if there's either Andrew> no group by clause at all, or if there's exactly one grouping Andrew> set, which must not be empty, however derived.) This is costing Andrew> us some optimizations, especially in the case of an explicit Andrew> GROUP BY () clause; I'll look into this. I'm inclined to go with the simplest approach here, which copies the quals if there are grouping sets. The only way we could safely move a qual without copying is if we can show that none of the grouping sets is empty, that is (), and at this point the grouping set list has not been expanded so it's not trivial to determine that. Patch attached. -- Andrew (irc:RhodiumToad)
Вложения
Hi,
This will fail too.we add that in group by list and set parse->groupingSets to NULL.
And hence it will have same issue.
However tests added in my patch failing too.
--
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
>>>>> "Jeevan" == Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes: Jeevan> Hi,Jeevan> This will fail too. This is in addition to your patch, not instead of it. -- Andrew (irc:RhodiumToad)
On 2015-07-14 14:51:09 +0530, Jeevan Chalke wrote: > Fix this by adding GroupingFunc node in this walker. We do it correctly in > contain_aggs_of_level_walker() in which we have handling for GroupingFunc > there. > > Attached patch to fix this. Pushed, thanks for fix!
On 2015-07-24 11:34:22 +0100, Andrew Gierth wrote: > >>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > > Andrew> The other is that in subquery_planner, the optimization of > Andrew> converting HAVING clauses to WHERE clauses is suppressed if > Andrew> parse->groupingSets isn't empty. (It is empty if there's either > Andrew> no group by clause at all, or if there's exactly one grouping > Andrew> set, which must not be empty, however derived.) This is costing > Andrew> us some optimizations, especially in the case of an explicit > Andrew> GROUP BY () clause; I'll look into this. > > I'm inclined to go with the simplest approach here, which copies the > quals if there are grouping sets. The only way we could safely move a > qual without copying is if we can show that none of the grouping sets is > empty, that is (), and at this point the grouping set list has not been > expanded so it's not trivial to determine that. I pushed this to both master and 9.5. While this isn't strictly a bugfix, it seemed better to keep the branches the same at this point.