Обсуждение: PostgreSQL query with FULL OUTER JOIN and subquery using column of GROUP BY expression ERROR: subquery uses ungrouped column from outer query

Поиск
Список
Период
Сортировка
Hello,

I have 3 tables : t1, t2 and t3
t1(posx integer, posy integer, dat timestamp, val_grid numeric)
  with Primary Key=(posx, posy, dat)
t2(id integer, x integer, y integer)
  with Primary Key=(id)
t3(id integer, dat timestamp, c1 numeric)
  with Primary Key=(id, dat)

My (very simplified) query  that doesn't work:
SELECT
  id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint,
avg(val_grid), sum(c1),
  (SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)
FROM
  (SELECT * FROM t1 ) AS t1
  JOIN t2  ON (t1.posx=t2.x AND t1.posy=t2.y)
  FULL OUTER JOIN (SELECT * FROM t3   ) t3 USING(id, dat)
--WHERE ...
GROUP BY id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint
;
ERROR:  subquery uses ungrouped column "t2.id" from outer query

-> The problem is on the third line (the subquery) :
(SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)

However, If I replace "FULL OUTER" by "LEFT OUTER" or by "RIGHT OUTER", it works.

Could you please tell me how to correct the query, to fix this problem?

Regards
----- Météo-France -----
PALAYRET Jacques
DCSC/GDC
jacques.palayret@meteo.fr
Fixe : +33 561078319
On 2/25/26 3:14 AM, PALAYRET Jacques wrote:
> Hello,
> 
> I have 3 tables : t1, t2 and t3
> t1(posx integer, posy integer, dat timestamp, val_grid numeric)
>    with Primary Key=(posx, posy, dat)
> t2(id integer, x integer, y integer)
>    with Primary Key=(id)
> t3(id integer, dat timestamp, c1 numeric)
>    with Primary Key=(id, dat)
> 
> My (very simplified) query  that doesn't work:
> SELECT
>    id, to_char(dat, 'YYYYMM'), CASE 
> FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 
> THEN 2 ELSE 3 END::smallint,
> avg(val_grid), sum(c1),
> *(SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)*
> FROM
>    (SELECT * FROM t1 ) AS t1
> *JOIN* t2  ON (t1.posx=t2.x AND t1.posy=t2.y)
> *FULL OUTER JOIN* (SELECT * FROM t3   ) t3 USING(id, dat)
> --WHERE ...
> *GROUP BY* id, to_char(dat, 'YYYYMM'), CASE 
> FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 
> THEN 2 ELSE 3 END::smallint
> ;
> ERROR:  subquery uses ungrouped column "t2.id" from outer query
> 
> -> The problem is on the third line (the subquery) :
> (SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)
> 
> However, If I replace "FULL OUTER" by "*LEFT OUTER*" or by "*RIGHT 
> OUTER*", *it works*.
> 
> Could you please tell me how to correct the query, to fix this problem?

Questions:

1) Why FROM (SELECT * FROM t1 ) AS t1 instead of just FROM t1?
    Same for JOIN (SELECT * FROM t3   ) t3

2) Why are the field names not table qualified e.g. t2.id, t3.id, etc?

3) What is the desired outcome?
    
> 
> Regards
> ----- Météo-France -----
> PALAYRET Jacques
> DCSC/GDC
> jacques.palayret@meteo.fr
> Fixe : +33 561078319


-- 
Adrian Klaver
adrian.klaver@aklaver.com



hello again,

It seems that following can be a solution (adding sub-query " SELECT * FROM " after the main FROM clause)  :
SELECT
  id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint,
avg(val_grid), sum(c1),
  (SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)
FROM (
SELECT * FROM (
  (SELECT * FROM t1 ) AS t1
  JOIN t2  ON (t1.posx=t2.x AND t1.posy=t2.y)
  FULL OUTER JOIN (SELECT * FROM t3   ) t3 USING(id, dat)
) tb
) t
--WHERE ...
GROUP BY id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint
;

Regards

De: "PALAYRET Jacques" <jacques.palayret@meteo.fr>
À: pgsql-general@lists.postgresql.org
Envoyé: Mercredi 25 Février 2026 12:14:28
Objet: PostgreSQL query with FULL OUTER JOIN and subquery using column of GROUP BY expression ERROR:  subquery uses ungrouped column from outer query

Hello,

I have 3 tables : t1, t2 and t3
t1(posx integer, posy integer, dat timestamp, val_grid numeric)
  with Primary Key=(posx, posy, dat)
t2(id integer, x integer, y integer)
  with Primary Key=(id)
t3(id integer, dat timestamp, c1 numeric)
  with Primary Key=(id, dat)

My (very simplified) query  that doesn't work:
SELECT
  id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint,
avg(val_grid), sum(c1),
  (SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)
FROM
  (SELECT * FROM t1 ) AS t1
  JOIN t2  ON (t1.posx=t2.x AND t1.posy=t2.y)
  FULL OUTER JOIN (SELECT * FROM t3   ) t3 USING(id, dat)
--WHERE ...
GROUP BY id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3 END::smallint
;
ERROR:  subquery uses ungrouped column "t2.id" from outer query

-> The problem is on the third line (the subquery) :
(SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)

However, If I replace "FULL OUTER" by "LEFT OUTER" or by "RIGHT OUTER", it works.

Could you please tell me how to correct the query, to fix this problem?

Regards
----- Météo-France -----
PALAYRET Jacques
DCSC/GDC
jacques.palayret@meteo.fr
Fixe : +33 561078319

On Wed, Feb 25, 2026 at 9:00 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
1) Why FROM (SELECT * FROM t1 ) AS t1 instead of just FROM t1?
    Same for JOIN (SELECT * FROM t3   ) t3

Seems immaterial.  It's just a consequence of simplifying the original problem query. If it is consequential that would seem buggy.


2) Why are the field names not table qualified e.g. t2.id, t3.id, etc?

Because of the USING clause, those references shouldn't exist in the main query.


3) What is the desired outcome?

For the engine to realize the 'id' in the group by and the 'id' in the scalar subquery are the same 'id' - the one produced by the USING clause, not either of the t2 or t3 ids.

My first impression is that this is a bug.  Especially since the query apparently executes in both left-join and right-join modes.

David J.

Answers :
Note that, of course, the queries I provide are greatly simplified to give the ideas.
1) (SELECT * FROM t1 ) ---> It means the real SQL query for t1 comes from a subquery
2) Why are the field names not table qualified e.g. t2.id, t3.id, etc? ---> because of the USING
3) An exemple, with  sum(COALESCE(c1, val_grid))  in the select_list  :

TABLE t1 ;
 posx | posy |         dat         | val_grid
------+------+---------------------+----------
    1 |    2 | 2026-01-01 00:00:00 |      123
    2 |    1 | 2026-01-01 00:00:00 |       21
    2 |    1 | 2026-01-03 00:00:00 |       21
(3 lignes)

TABLE t2 ;
 id | x | y
----+---+---
 21 | 2 | 1
 12 | 1 | 2
(2 lignes)

TABLE t3 ;
 id |         dat         | c1  | c2
----+---------------------+-----+----
 12 | 2026-01-01 00:00:00 | 129 |  0
 21 | 2026-01-01 00:00:00 | 219 |  0
 21 | 2026-01-21 00:00:00 | 211 |  0
 21 | 2026-01-09 00:00:00 | 211 |  0
(4 lignes)

SELECT
  id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2 ELSE 3
END::smallint,
avg(val_grid), sum(COALESCE(c1, val_grid)),
  (SELECT col1 FROM another_table WHERE another_table.np=id LIMIT 1)
, count(*)
FROM (
SELECT * FROM (
  (SELECT * FROM t1 ) AS t1
  JOIN t2  ON (t1.posx=t2.x AND t1.posy=t2.y)
  FULL OUTER JOIN (SELECT * FROM t3   ) t3 USING(id, dat)
) tb
) t
--WHERE ...
GROUP BY id, to_char(dat, 'YYYYMM'), CASE FLOOR((TO_NUMBER(TO_CHAR(dat,'DD'),'99')-1)/10) WHEN 0 THEN 1 WHEN 1 THEN 2
ELSE3 END::smallint 
;
 id | to_char | int2 |         avg          | sum | rr1 | count
----+---------+------+----------------------+-----+-----+-------
 21 | 202601  |    3 |                      | 211 |     |     1
 21 | 202601  |    1 |  21.0000000000000000 | 451 |     |     3
 12 | 202601  |    1 | 123.0000000000000000 | 129 |     |     1
(3 lignes)



Regards
----- Mail original -----
De: "Adrian Klaver" <adrian.klaver@aklaver.com>
À: "PALAYRET Jacques" <jacques.palayret@meteo.fr>, pgsql-general@lists.postgresql.org
Envoyé: Mercredi 25 Février 2026 17:00:03
Objet: Re: PostgreSQL query with FULL OUTER JOIN and subquery using column of GROUP BY expression ERROR: subquery uses
ungroupedcolumn from outer query 

Questions:

1) Why FROM (SELECT * FROM t1 ) AS t1 instead of just FROM t1?
    Same for JOIN (SELECT * FROM t3   ) t3

2) Why are the field names not table qualified e.g. t2.id, t3.id, etc?

3) What is the desired outcome?



--
Adrian Klaver
adrian.klaver@aklaver.com



"David G. Johnston" <david.g.johnston@gmail.com> writes:
> My first impression is that this is a bug.  Especially since the query
> apparently executes in both left-join and right-join modes.

Well, it's a longstanding deficiency anyway.  The problem is that
the full-join-using merged column "id" is represented as
"COALESCE(t2.id, t3.id)" not as a single Var.  That should be okay,
because we can handle grouping by an expression, but
substitute_grouped_columns (and check_ungrouped_columns before it)
doesn't handle the case in subqueries:

 * NOTE: we recognize grouping expressions in the main query, but only
 * grouping Vars in subqueries.  For example, this will be rejected,
 * although it could be allowed:
 *        SELECT
 *            (SELECT x FROM bar where y = (foo.a + foo.b))
 *        FROM foo
 *        GROUP BY a + b;
 * The difficulty is the need to account for different sublevels_up.
 * This appears to require a whole custom version of equal(), which is
 * way more pain than the feature seems worth.

This commentary dates to 2003 (and it's from a patch that replaced an
older implementation with the same limitation; it doesn't look to me
like the case ever worked).  AFAIR, the number of complaints we've
gotten about this limitation in the past 30 years could be counted
without running out of thumbs.

Still, it is annoying.  I wonder if there's a way to do it without
either a large amount of new code or exponential time spent
trying useless subexpression matches...

            regards, tom lane



I wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
>> My first impression is that this is a bug.  Especially since the query
>> apparently executes in both left-join and right-join modes.

> Well, it's a longstanding deficiency anyway.
> ...
> Still, it is annoying.  I wonder if there's a way to do it without
> either a large amount of new code or exponential time spent
> trying useless subexpression matches...

I've posted a draft fix for that at

https://www.postgresql.org/message-id/flat/531183.1772058731%40sss.pgh.pa.us

            regards, tom lane