Обсуждение: BUG #5025: Aggregate function with subquery in 8.3 and 8.4.
The following bug has been logged online:
Bug reference: 5025
Logged by: Sheng Y. Cheng
Email address: scheng@adconion.com
PostgreSQL version: 8.4.0 / 8.3.1
Operating system: Red Hat 4.1.1-52
Description: Aggregate function with subquery in 8.3 and 8.4.
Details:
Here are some facts and questions about the aggregate function with
subquery
in 8.3 and 8.4.
================= Question 1. ==================
I though the following query would give me the same results in 8.4.0 and
8.3.1.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN;
SELECT version();
CREATE TEMPORARY TABLE t1 (f1 text ) on commit drop ;
CREATE TEMPORARY TABLE t2 (f1 text ) on commit drop ;
INSERT INTO t1 (f1) VALUES ('aaa');
INSERT INTO t1 (f1) VALUES ('bbb');
INSERT INTO t1 (f1) VALUES ('ccc');
INSERT INTO t2 (f1) VALUES ('bbb');
SELECT t1.f1, COUNT(ts.*) FROM
t1
LEFT JOIN
(SELECT
CASE WHEN f1 = '111'
THEN '111'
ELSE
f1
END
FROM t2) AS ts
ON
t1.f1 = ts.f1
GROUP BY
t1.f1;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
However, In 8.3.1 I got the following.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN
version
----------------------------------------------------------------------------
-------------------------------
PostgreSQL 8.3.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)
CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
f1 | count
-----+-------
aaa | 0
bbb | 1
ccc | 0
(3 rows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Whereas, in 8.4.0 I got the following.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN
version
----------------------------------------------------------------------------
---------------------------------------
PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52), 64-bit
(1 row)
CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
f1 | count
-----+-------
aaa | 1
bbb | 1
ccc | 1
(3 rows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Session 4.2.7. Aggregate Expressions in 8.3 document at
http://www.postgresql.org/docs/8.3/static/sql-expressions.html states "The
last form invokes the aggregate once for each input row regardless of null
or non-null values." I am wondering if the result I saw from 8.4.0 is a bug
fix for 8.3.1?
================= Question 2. ==================
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN;
SELECT version();
CREATE TEMPORARY TABLE t1 (f1 text ) on commit drop ;
CREATE TEMPORARY TABLE t2 (f1 text ) on commit drop ;
INSERT INTO t1 (f1) VALUES ('aaa');
INSERT INTO t1 (f1) VALUES ('bbb');
INSERT INTO t1 (f1) VALUES ('ccc');
INSERT INTO t2 (f1) VALUES ('bbb');
SELECT t1.f1, COUNT(ts.*) FROM
t1
LEFT JOIN
(SELECT
f1
FROM t2) AS ts
ON
t1.f1 = ts.f1
GROUP BY
t1.f1;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I though the result of the above query would be the following.
f1 | count
-----+-------
aaa | 0
bbb | 1
ccc | 0
however, I got the following in both 8.4.0 and 8.3.1.
Result from 8.3.1.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN
version
----------------------------------------------------------------------------
-------------------------------
PostgreSQL 8.3.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52)
(1 row)
CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
f1 | count
-----+-------
aaa | 1
bbb | 1
ccc | 1
(3 rows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Result from 8.4.0.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
BEGIN
version
----------------------------------------------------------------------------
---------------------------------------
PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.1 20070105 (Red Hat 4.1.1-52), 64-bit
(1 row)
CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
f1 | count
-----+-------
aaa | 1
bbb | 1
ccc | 1
(3 rows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Is this how Postgres works for aggregate function?
Thank you,
Sheng
Sheng Y. Cheng wrote: > The Session 4.2.7. Aggregate Expressions in 8.3 document at > http://www.postgresql.org/docs/8.3/static/sql-expressions.html states "The > last form invokes the aggregate once for each input row regardless of null > or non-null values." I am wondering if the result I saw from 8.4.0 is a bug > fix for 8.3.1? Well, a COUNT(ts.*) is in fact not of the last form, but the first. "ts.*" is a whole-row reference to t2, like just "ts" would be (as in "COUNT(t2)"). But there indeed seems to be something wrong. The query can be reduced into: SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; With this you can reproduce the discrepancy in CVS HEAD alone - the query produces a different result if you remove the "OFFSET 0": postgres=# SELECT t1.f1, COUNT(ts) FROM t1 postgres-# LEFT JOIN postgres-# (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts postgres-# ON t1.f1 = ts.f1 postgres-# GROUP BY t1.f1; f1 | count -----+------- aaa | 0 bbb | 1 ccc | 0 (3 rows) postgres=# SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; f1 | count -----+------- aaa | 1 bbb | 1 ccc | 1 (3 rows) Without the OFFSET, the subquery is "pulled up" into the top query, and that optimization makes the difference. PostgreSQL 8.4 is more aggressive at that optimization, which is why you saw different results on 8.3 and 8.4. The "pullup" code transforms the "ts" reference into a ROW constructor: postgres=# explain verbose SELECT t1.f1, COUNT(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1 GROUP BY t1.f1; QUERY PLAN -------------------------------------------------------------------------------- GroupAggregate (cost=181.86..362.51 rows=200 width=64) Output: t1.f1, count(ROW(t2.f1)) ... That transformation isn't 100% accurate. A ROW expression with all NULL columns is not the same thing as a NULL whole-row expression when it comes to STRICT functions - a strict function is invoked with the former, but not the latter, even though both return true for an IS NULL test. That let's us write the test case as: CREATE FUNCTION stricttest (a anyelement) RETURNS boolean AS $$ SELECT true; $$ LANGUAGE SQL STRICT; postgres=# SELECT t1.f1, stricttest(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 OFFSET 0) AS ts ON t1.f1 = ts.f1; f1 | stricttest -----+------------ aaa | bbb | t ccc | (3 rows) postgres=# SELECT t1.f1, stricttest(ts) FROM t1 LEFT JOIN (SELECT f1 As f1 FROM t2 /* OFFSET 0 */) AS ts ON t1.f1 = ts.f1; f1 | stricttest -----+------------ aaa | t bbb | t ccc | t (3 rows) I can see two possible interpretations for this: 1. The subquery pull-up code is broken, the transformation of a whole-row reference to ROW(...) is not valid. 2. The semantics of STRICT with row arguments is broken. It should be made consistent with IS NULL. Strict function should not be called if the argument is a row value with all NULL columns. I'm not sure which interpretation is correct. Thoughts? The SQL spec probably has something to say about this. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
2009/9/1 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>: > I can see two possible interpretations for this: > > 1. The subquery pull-up code is broken, the transformation of a > whole-row reference to ROW(...) is not valid. > > 2. The semantics of STRICT with row arguments is broken. It should be > made consistent with IS NULL. Strict function should not be called if > the argument is a row value with all NULL columns. > > I'm not sure which interpretation is correct. Thoughts? The SQL spec > probably has something to say about this. I suppose ts.* that wasn't joined is NULL. Not "a row value with all NULL columns" but "a NULL row value". contrib_regression=# SELECT t1.f1, ts.* IS NULL, ts.* FROM t1 LEFT JOIN (SELECT f1 FROM t2 -- offset 0 ) AS ts ON t1.f1 = ts.f1 ; f1 | ?column? | f1 -----+----------+----- aaa | t | bbb | f | bbb ccc | t | (3 rows) So the 1. ROW(...) construction seems not valid. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs > Regards, -- Hitoshi Harada
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I can see two possible interpretations for this:
> 1. The subquery pull-up code is broken, the transformation of a
> whole-row reference to ROW(...) is not valid.
I think the problem is probably that we need a PlaceHolderVar wrapper
around the ROW() constructor.
> 2. The semantics of STRICT with row arguments is broken. It should be
> made consistent with IS NULL.
Well, that's a different argument. The immediate problem is that this
case doesn't behave consistently with pre-8.4 behavior, which was not
an intended change --- so I think we'd better make it work like before.
regards, tom lane
On Tue, Sep 1, 2009 at 4:11 AM, Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote: > 2. The semantics of STRICT with row arguments is broken. It should be > made consistent with IS NULL. Strict function should not be called if > the argument is a row value with all NULL columns. not just STRICT, but coalesce(), libpq 'is null' bit, plpgsql row variables, type input/output, joins, etc. see recent threads on -hackers and -bugs note that fixing this would break code in the field (like mine for example) :-). merlin
I wrote:
> I think the problem is probably that we need a PlaceHolderVar wrapper
> around the ROW() constructor.
I looked into this a bit more. The issue is that flattening of
subqueries that are inside an outer join first puts PlaceHolderVars
around non-nullable output expressions of the subquery, and then applies
ResolveNew() to substitute those expressions for upper references to the
subquery outputs. However, a whole-row Var referencing the subquery is
expanded by ResolveNew into a ROW() expression over the subquery
outputs. To preserve compatibility with pre-8.4 behavior, what we
really need here is to have a PlaceHolderVar around the ROW(), not on
the individual expressions inside it.
While it's not that hard to put in the PlaceHolderVar, the easy way to
do it would require passing the PlannerInfo "root" struct to ResolveNew,
which goes well beyond my threshold of pain from a modularity
standpoint --- ResolveNew isn't part of the planner and has no business
using that struct. ResolveNew's API is a study in ugliness already,
so I'm thinking it's time to bite the bullet and refactor it. The
idea that comes to mind is to provide a callback function and "void *"
context argument, which ResolveNew would call upon finding a Var that
needs substitution. The existing guts of ResolveNew would move into a
callback that is specific to rewriteHandler.c's uses, and we'd write a
different one for the planner. The PlannerInfo link would be hidden
within the "void *" argument so we'd avoid exposing it to rewriter code.
Comments?
I believe BTW that there are related issues in other places where we
expand composites into RowExprs. But the other places have been doing
that for awhile. I think that for 8.4 our goals should be limited to
not changing the behavior compared to prior releases. If any consensus
is reached on the general issue of how we want "row() is null" to
behave, then it'll be time to reconsider the other stuff.
regards, tom lane
I blithely opined:
> I believe BTW that there are related issues in other places where we
> expand composites into RowExprs. But the other places have been doing
> that for awhile. I think that for 8.4 our goals should be limited to
> not changing the behavior compared to prior releases.
So while I was testing my fix for this, I found out that that's more
complicated than I thought. Consider these examples in the regression
database:
select t1.q2, count(t2.*)
from int8_tbl t1 left join int8_tbl t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join (select * from int8_tbl offset 0) t2 on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
select t1.q2, count(t2.*)
from int8_tbl t1 left join
(select q1, case when q2=1 then 1 else q2 end as q2 from int8_tbl) t2
on (t1.q2 = t2.q1)
group by t1.q2 order by 1;
If you believe that "t2.*" should go to NULL in a join-extended row,
then the correct answer for all four of these is
q2 | count
-------------------+-------
-4567890123456789 | 0
123 | 2
456 | 0
4567890123456789 | 6
(4 rows)
However, the actual behavior of every release since 8.0 has been that
the second case gives
q2 | count
-------------------+-------
-4567890123456789 | 1
123 | 2
456 | 1
4567890123456789 | 6
(4 rows)
ie, t2.* fails to go to NULL because it's expanded as ROW(t2.q1,t2.q2).
The OFFSET 0 in the third case restores expected behavior by preventing
flattening of the subquery, and up till 8.4 the CASE expression in the
fourth case did too.
With the fix I was just about to apply, all four cases give the first
set of results. This clearly satisfies the principle of least
astonishment, at least more nearly than what we have; but it equally
clearly is *not* going to restore 8.4 to work just like 8.3.
I'm inclined to apply the patch to 8.4 anyway, because it seems like a
bug fix. I would consider patching further back except there's no
chance of making it work in older branches, at least not without
destabilizing them quite a bit (the PlaceHolderVar mechanism would have
to be back-ported).
It might be possible to fix the older branches by not flattening
subqueries that have whole-row references; but even that would take
nontrivial work, and it'd be sacrificing performance to fix a corner
case no one has previously complained about. So I'm leaning to patching
8.4 and leaving the older branches alone.
Thoughts?
regards, tom lane
Tom Lane wrote: > With the fix I was just about to apply, all four cases give the first > set of results. This clearly satisfies the principle of least > astonishment, at least more nearly than what we have; but it equally > clearly is *not* going to restore 8.4 to work just like 8.3. Right, 8.3 had the same underlying problem, 8.4 just makes it more visible as it's better at flattening subqueries. > I'm inclined to apply the patch to 8.4 anyway, because it seems like a > bug fix. I would consider patching further back except there's no > chance of making it work in older branches, at least not without > destabilizing them quite a bit (the PlaceHolderVar mechanism would have > to be back-ported). > > It might be possible to fix the older branches by not flattening > subqueries that have whole-row references; but even that would take > nontrivial work, and it'd be sacrificing performance to fix a corner > case no one has previously complained about. So I'm leaning to patching > 8.4 and leaving the older branches alone. > > Thoughts? Seems reasonable. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Tom Lane wrote:
>> With the fix I was just about to apply, all four cases give the first
>> set of results. This clearly satisfies the principle of least
>> astonishment, at least more nearly than what we have; but it equally
>> clearly is *not* going to restore 8.4 to work just like 8.3.
> Right, 8.3 had the same underlying problem, 8.4 just makes it more
> visible as it's better at flattening subqueries.
What is interesting is that the CASE in the OP's original submission
is apparently only there to dodge the visible-since-8.0 version of
the problem; at least I can't see that it does anything else useful.
The complaint apparently is not so much that 8.3 was right, as that
the workaround for its bug stopped working ...
>> ... So I'm leaning to patching
>> 8.4 and leaving the older branches alone.
>>
>> Thoughts?
> Seems reasonable.
Will apply fix shortly --- I thought of one minor improvement to
make (the code as it stands is generating redundant PlaceHolderVars).
regards, tom lane
Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > Tom Lane wrote: > >> With the fix I was just about to apply, all four cases give the first > >> set of results. This clearly satisfies the principle of least > >> astonishment, at least more nearly than what we have; but it equally > >> clearly is *not* going to restore 8.4 to work just like 8.3. > > > Right, 8.3 had the same underlying problem, 8.4 just makes it more > > visible as it's better at flattening subqueries. > > What is interesting is that the CASE in the OP's original submission > is apparently only there to dodge the visible-since-8.0 version of > the problem; at least I can't see that it does anything else useful. > The complaint apparently is not so much that 8.3 was right, as that > the workaround for its bug stopped working ... In that light, it probably doesn't make much sense to backport the fix further back, given that the people (person?) bitten by the bug surely must already be working around it. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> What is interesting is that the CASE in the OP's original submission
>> is apparently only there to dodge the visible-since-8.0 version of
>> the problem; at least I can't see that it does anything else useful.
>> The complaint apparently is not so much that 8.3 was right, as that
>> the workaround for its bug stopped working ...
> In that light, it probably doesn't make much sense to backport the fix
> further back, given that the people (person?) bitten by the bug surely
> must already be working around it.
Yeah, and there might even be people relying on the incorrect behavior.
I'm not too worried about changing it in 8.4 since that's such a young
release, but it's a bit harder to make a case for that in older
branches.
regards, tom lane