Обсуждение: Query with aggregate over subselects fails with ERROR: plan should not reference subplan's variable

Поиск
Список
Период
Сортировка

Query with aggregate over subselects fails with ERROR: plan should not reference subplan's variable

От
Daniel Grace
Дата:
The following nonsensical query causes PostgreSQL to fail with ERROR: plan
should not reference subplan's variable.  (This was stripped down from an
'useful' query that triggered the same bug).  First encountered on 8.3.4,
reproduced on 8.3.7

BEGIN;

CREATE SCHEMA bug_schema;
SET SEARCH_PATH='bug_schema';

CREATE FUNCTION AGG_GROUP_CONCAT_SFUNC(IN _state TEXT, IN _str TEXT, IN _sep
TEXT) RETURNS TEXT
SECURITY INVOKER
LANGUAGE PLPGSQL
IMMUTABLE
CALLED ON NULL INPUT
AS $PROC$
BEGIN
    IF _str IS NULL THEN RETURN _state; END IF;
    IF _state IS NULL THEN RETURN _str; END IF;
    RETURN _state || _sep || _str;
END;
$PROC$;


CREATE AGGREGATE GROUP_CONCAT(TEXT, TEXT) (
    STYPE = TEXT,
    SFUNC = AGG_GROUP_CONCAT_SFUNC
);

CREATE TABLE foo (
    id    serial        NOT NULL,
    fname    varchar(64)        NOT NULL,
    PRIMARY KEY (id)
);

-- Fails: ERROR:  plan should not reference subplan's variable
SELECT (SELECT GROUP_CONCAT((SELECT s2.fname FROM foo AS s2 WHERE
s2.id=s.idORDER BY fname), '; ')) AS foolist FROM foo AS s;

-- Also fails, same error
SELECT (SELECT MAX((SELECT s2.fname FROM foo AS s2 WHERE s2.id=s.id ORDER BY
fname))) AS foomaxFROM foo AS s;

ROLLBACK;
Daniel Grace <dgrace@wingsnw.com> writes:
> The following nonsensical query causes PostgreSQL to fail with ERROR: plan
> should not reference subplan's variable.  (This was stripped down from an
> 'useful' query that triggered the same bug).  First encountered on 8.3.4,
> reproduced on 8.3.7

Hmmm ... I guess something is getting confused about the level of query
nesting.  FWIW, you can avoid the bug in these two examples by omitting
the inner "SELECT" keyword, which is useless anyway.  Perhaps it is
needed in your real query though ...

            regards, tom lane
On Fri, Apr 24, 2009 at 5:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Daniel Grace <dgrace@wingsnw.com> writes:
> > The following nonsensical query causes PostgreSQL to fail with ERROR:
> plan
> > should not reference subplan's variable.  (This was stripped down from an
> > 'useful' query that triggered the same bug).  First encountered on 8.3.4,
> > reproduced on 8.3.7
>
> Hmmm ... I guess something is getting confused about the level of query
> nesting.  FWIW, you can avoid the bug in these two examples by omitting
> the inner "SELECT" keyword, which is useless anyway.  Perhaps it is
> needed in your real query though ...
>
>                        regards, tom lane
>
It's required in my case to force the aggregate function to evaluate its
inputs in a set order.  I'm trying to replace MySQL's GROUP_CONCAT function,
including the ORDER BY option.

I had another variation (that did not use sub-SELECTs, but instead joining
something along the lines of joining (SELECT * FROM foo ORDER BY fname) AS
foo that partially worked -- however, it had the side effect (due to the
nature of the query) of having some duplicate data and not the type that
could be fixed simply by adding DISTINCT.

I'm not going to spam the list with all of the table definitions for the
real query, but I will paste it by itself to give a better idea of what I
was originally attempting:


SELECT
    s.fid, c.flags, c.id, c.title, cs.ut_start, cs.ut_end, cr.full_name,
    cal.title AS cancel_reason,
    (SELECT GROUP_CONCAT((SELECT s2.fname FROM student AS s2 WHERE s2.id=
s.id ORDER BY fname), '; ')) AS students,
    (SELECT GROUP_CONCAT((SELECT p.gname FROM course_teacher AS ct INNER
JOIN person AS p ON ct.tid=p.id WHERE ct.cid=c.id ORDER BY p.gname), '; '))
AS teacher
FROM
    student AS s
    INNER JOIN student_course_session AS scs ON scs.sid=s.id
    INNER JOIN course_session AS cs ON cs.id=scs.csid
    INNER JOIN course AS c ON c.id=cs.cid
    LEFT JOIN course_room AS cr ON cr.id=c.room_id
    LEFT JOIN calendar AS cal ON cal.id=cs.cancelled_by
GROUP BY s.fid, cs.id, c.flags, c.id, c.title, cs.ut_start, cs.ut_end,
cr.full_name, cal.title, cs.tsstart
ORDER BY s.fid, cs.tsstart, c.title;

-- Daniel Grace
Daniel Grace <dgrace@wingsnw.com> writes:
> On Fri, Apr 24, 2009 at 5:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmmm ... I guess something is getting confused about the level of query
>> nesting.  FWIW, you can avoid the bug in these two examples by omitting
>> the inner "SELECT" keyword, which is useless anyway.  Perhaps it is
>> needed in your real query though ...

> It's required in my case to force the aggregate function to evaluate its
> inputs in a set order.

Really?  Your example doesn't show that.

> SELECT
>     s.fid, c.flags, c.id, c.title, cs.ut_start, cs.ut_end, cr.full_name,
>     cal.title AS cancel_reason,
>     (SELECT GROUP_CONCAT((SELECT s2.fname FROM student AS s2 WHERE s2.id=
       ------
> s.id ORDER BY fname), '; ')) AS students,

I'm talking about the underlined SELECT, not the one inside the
aggregate.  AFAICS this one is totally useless.  If it were useful,
we'd likely have found this issue long ago ...

            regards, tom lane
I wrote:
> I'm talking about the underlined SELECT, not the one inside the
> aggregate.  AFAICS this one is totally useless.

Oh, wait.  It is useless in the query as written, but now that I think
twice about what you're trying to accomplish, you do need three levels
of SELECT keywords.  Just not like that.  I think what you actually
want is

SELECT
    ...
    (SELECT GROUP_CONCAT(t.fname, '; ') FROM
      (SELECT s2.fname FROM student AS s2
       WHERE s2.id=s.id ORDER BY fname) AS t) AS students,
    ...
FROM
    student AS s

What you wrote instead is just wrong --- it would fail if there were
multiple students with the same id (can that actually happen?
Maybe there's more wrong with this query...), because what you
wrote is a scalar sub-SELECT inside an aggregate call that belongs
to the outermost query.

            regards, tom lane
No luck there either (runs, but with incorrect results), but since I know
this isn't a support list and is a bugs list I just would like to point out
that: Even though what I was doing that triggered the bug is apparently
incorrect and 'silly', it's still possible that some complicated legitimate
query might trigger the same problem -- so it may be worth looking into.

On Fri, Apr 24, 2009 at 7:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > I'm talking about the underlined SELECT, not the one inside the
> > aggregate.  AFAICS this one is totally useless.
>
> Oh, wait.  It is useless in the query as written, but now that I think
> twice about what you're trying to accomplish, you do need three levels
> of SELECT keywords.  Just not like that.  I think what you actually
> want is
>
> SELECT
>    ...
>    (SELECT GROUP_CONCAT(t.fname, '; ') FROM
>       (SELECT s2.fname FROM student AS s2
>        WHERE s2.id=s.id ORDER BY fname) AS t) AS students,
>    ...
> FROM
>    student AS s
>
> What you wrote instead is just wrong --- it would fail if there were
> multiple students with the same id (can that actually happen?
> Maybe there's more wrong with this query...), because what you
> wrote is a scalar sub-SELECT inside an aggregate call that belongs
> to the outermost query.
>
>                        regards, tom lane
>



--
Daniel Grace
AGE, LLC
System Administrator and Software Developer
dgrace@wingsnw.com // (425)327-0079 // www.wingsnw.com
Daniel Grace <dgrace@wingsnw.com> writes:
> No luck there either (runs, but with incorrect results), but since I know
> this isn't a support list and is a bugs list I just would like to point out
> that: Even though what I was doing that triggered the bug is apparently
> incorrect and 'silly', it's still possible that some complicated legitimate
> query might trigger the same problem -- so it may be worth looking into.

Yes, it's certainly a real bug, and we appreciate the bug report.  My
point is just that the reason it's gone undetected is that it's not a
very useful case in practice; and in particular I don't believe that
it's a case you need to solve your problem.

(To be concrete, the failure case as I understand it is a sub-SELECT
containing an aggregate call containing a sub-sub-SELECT, where the
aggregate call actually belongs to the outermost query level according
to the SQL nesting rules.  This is not what you want because the
aggregation is happening, or would happen except for the planner bug,
with entirely the wrong scope for what you want.)

            regards, tom lane
Daniel Grace <dgrace@wingsnw.com> writes:
> The following nonsensical query causes PostgreSQL to fail with ERROR: plan
> should not reference subplan's variable.  (This was stripped down from an
> 'useful' query that triggered the same bug).  First encountered on 8.3.4,
> reproduced on 8.3.7

Patch is here:
http://archives.postgresql.org/pgsql-committers/2009-04/msg00277.php

I still think that it won't affect you once you have the query logic
straight, though.

            regards, tom lane
On Sat, Apr 25, 2009 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Daniel Grace <dgrace@wingsnw.com> writes:
> > The following nonsensical query causes PostgreSQL to fail with ERROR:
> plan
> > should not reference subplan's variable.  (This was stripped down from an
> > 'useful' query that triggered the same bug).  First encountered on 8.3.4,
> > reproduced on 8.3.7
>
> Patch is here:
> http://archives.postgresql.org/pgsql-committers/2009-04/msg00277.php
>
> I still think that it won't affect you once you have the query logic
> straight, though.
>
Thanks for the info.  I've since fixed the query, so you are right in that
regard ;)

-- Daniel Grace