Обсуждение: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

Поиск
Список
Период
Сортировка

PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
PALAYRET Jacques
Дата:
Hello,

# Let's consider a table defined as follows :
CREATE TABLE weather_stations(
id integer,
name varchar(30),
elev integer
) ;

# After loading, the content :
    id    |         name         | elev
----------+----------------------+------
 31069001 | TOULOUSE-BLAGNAC     |  151
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50003001 | AGON-COUTAINVILLE    |    2
 50195001 | GATHEMO              |  330
(5 lignes)

### With CTE :
# I'm suprised by the following result, the behavior of  PostgreSQL ; is that a bug ? :
= Statement 1 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)
# According to me, the previous result is an error, because the parentheses are not taken into account.
The column id is not part of elev_Tlse_Blagnac.


# The same result as following, which is of course OK :
= Statement 2 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


### Same weird behavior with subquery in FROM clause :
# NOT OK (according to me), because the parentheses are not taken into account :
= Statement 3 : =
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM (SELECT elev FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)

# OK, the parentheses are taken into account because there is no confusion with the column id (elev_Tlse_Blagnac has a column named id) :
= Statement 4 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50195001 | GATHEMO              |  330
(3 lignes)

# OK (of course) :
= Statement 5 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


Is that a PostgreSQL bug or not, statement 1 or statement 3 (yes according to me) ?


Regards
----- Météo-France -----
PALAYRET JACQUES
DCSC/GDC
jacques.palayret@meteo.fr
Fixe : +33 561078319

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
Torsten Förtsch
Дата:
This is what happens:

WITH elev_Tlse_Blagnac AS (
   SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT w.id, w.name, w.elev
 FROM weather_stations AS w
WHERE elev > (SELECT x.elev
                FROM elev_Tlse_Blagnac AS x
               WHERE w.id BETWEEN 31000000 and 31999999);
   id    |         name         | elev  
----------+----------------------+------
31006001 | ALBIAC AGGLOMERATION |  289
31010001 | LUCHON-ANTIGNAC      |  599
(2 rows)

Note the use of aliases, w and x. You are using a correlated subquery.

On Tue, Mar 22, 2022 at 10:46 AM PALAYRET Jacques <jacques.palayret@meteo.fr> wrote:
Hello,

# Let's consider a table defined as follows :
CREATE TABLE weather_stations(
id integer,
name varchar(30),
elev integer
) ;

# After loading, the content :
    id    |         name         | elev
----------+----------------------+------
 31069001 | TOULOUSE-BLAGNAC     |  151
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50003001 | AGON-COUTAINVILLE    |    2
 50195001 | GATHEMO              |  330
(5 lignes)

### With CTE :
# I'm suprised by the following result, the behavior of  PostgreSQL ; is that a bug ? :
= Statement 1 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)
# According to me, the previous result is an error, because the parentheses are not taken into account.
The column id is not part of elev_Tlse_Blagnac.


# The same result as following, which is of course OK :
= Statement 2 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


### Same weird behavior with subquery in FROM clause :
# NOT OK (according to me), because the parentheses are not taken into account :
= Statement 3 : =
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM (SELECT elev FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)

# OK, the parentheses are taken into account because there is no confusion with the column id (elev_Tlse_Blagnac has a column named id) :
= Statement 4 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50195001 | GATHEMO              |  330
(3 lignes)

# OK (of course) :
= Statement 5 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


Is that a PostgreSQL bug or not, statement 1 or statement 3 (yes according to me) ?


Regards
----- Météo-France -----
PALAYRET JACQUES
DCSC/GDC
jacques.palayret@meteo.fr
Fixe : +33 561078319

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
Guillaume Lelarge
Дата:
Hi,

Le mar. 22 mars 2022 à 10:46, PALAYRET Jacques <jacques.palayret@meteo.fr> a écrit :
Hello,

# Let's consider a table defined as follows :
CREATE TABLE weather_stations(
id integer,
name varchar(30),
elev integer
) ;

# After loading, the content :
    id    |         name         | elev
----------+----------------------+------
 31069001 | TOULOUSE-BLAGNAC     |  151
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50003001 | AGON-COUTAINVILLE    |    2
 50195001 | GATHEMO              |  330
(5 lignes)

### With CTE :
# I'm suprised by the following result, the behavior of  PostgreSQL ; is that a bug ? :
= Statement 1 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)
# According to me, the previous result is an error, because the parentheses are not taken into account.
The column id is not part of elev_Tlse_Blagnac.


Not a bug, just following the SQL standard as far as I remember.
 

# The same result as following, which is of course OK :
= Statement 2 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


### Same weird behavior with subquery in FROM clause :
# NOT OK (according to me), because the parentheses are not taken into account :
= Statement 3 : =
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM (SELECT elev FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)

# OK, the parentheses are taken into account because there is no confusion with the column id (elev_Tlse_Blagnac has a column named id) :
= Statement 4 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50195001 | GATHEMO              |  330
(3 lignes)

# OK (of course) :
= Statement 5 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


Is that a PostgreSQL bug or not, statement 1 or statement 3 (yes according to me) ?

Not a bug, just following the SQL standard as far as I remember.


--

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
"David G. Johnston"
Дата:
On Tuesday, March 22, 2022, Torsten Förtsch <tfoertsch123@gmail.com> wrote:

Note the use of aliases, w and x. You are using a correlated subquery.

Yep.  There is an FAQ entry if you’d like a bit more exposition.

 
David J.

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
PALAYRET Jacques
Дата:
Thanks a lot.
Visibly, you are right.
It's a correlated statement, OK, right.

But in the subquery :
(
  SELECT x.elev
  FROM elev_Tlse_Blagnac AS x
  WHERE w.id BETWEEN 31000000 and 31999999
)
the WHERE clause (= w.id BETWEEN 31000000 and 31999999) is for the SELECT x.elev FROM elev_Tlse_Blagnac
which is the value 151 (one line, one value), correlation or not.
So, for me, it should NOT be a WHERE clause (a condition) for the MAIN statement.

According to me, there is only one condition in the main statement (SELECT w.id, w.name, w.elev FROM weather_stations ...)
and it is : elev > 151 (correlation ou not correlation).
In others words : for each line of table weather_stations), the only condition is : is the elev superior than the elev returned by the subquery, 151 ?

Visibly, the correlated statement adds one condition (w.id BETWEEN 31000000 and 31999999) in the main statement, but it's not logical for me, because of the parentheses.

From your point of view, it is the same statement than :
-----
WITH elev_Tlse_Blagnac AS (
   SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT w.id, w.name, w.elev
 FROM weather_stations AS w
WHERE elev > (SELECT x.elev
                FROM elev_Tlse_Blagnac AS x)
AND w.id BETWEEN 31000000 and 31999999;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)
-----

For me, it's weird, not logical.


Thanks again.
Regards

De: "Torsten Förtsch" <tfoertsch123@gmail.com>
À: "PALAYRET Jacques" <jacques.palayret@meteo.fr>
Cc: "PostgreSQL mailing lists" <pgsql-general@postgresql.org>
Envoyé: Mardi 22 Mars 2022 11:16:19
Objet: Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

This is what happens:
WITH elev_Tlse_Blagnac AS (
   SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT w.id, w.name, w.elev
 FROM weather_stations AS w
WHERE elev > (SELECT x.elev
                FROM elev_Tlse_Blagnac AS x
               WHERE w.id BETWEEN 31000000 and 31999999);
   id    |         name         | elev  
----------+----------------------+------
31006001 | ALBIAC AGGLOMERATION |  289
31010001 | LUCHON-ANTIGNAC      |  599
(2 rows)

Note the use of aliases, w and x. You are using a correlated subquery.

On Tue, Mar 22, 2022 at 10:46 AM PALAYRET Jacques <jacques.palayret@meteo.fr> wrote:
Hello,

# Let's consider a table defined as follows :
CREATE TABLE weather_stations(
id integer,
name varchar(30),
elev integer
) ;

# After loading, the content :
    id    |         name         | elev
----------+----------------------+------
 31069001 | TOULOUSE-BLAGNAC     |  151
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50003001 | AGON-COUTAINVILLE    |    2
 50195001 | GATHEMO              |  330
(5 lignes)

### With CTE :
# I'm suprised by the following result, the behavior of  PostgreSQL ; is that a bug ? :
= Statement 1 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)
# According to me, the previous result is an error, because the parentheses are not taken into account.
The column id is not part of elev_Tlse_Blagnac.


# The same result as following, which is of course OK :
= Statement 2 : =
WITH elev_Tlse_Blagnac AS (
SELECT elev FROM weather_stations WHERE id=31069001
)
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


### Same weird behavior with subquery in FROM clause :
# NOT OK (according to me), because the parentheses are not taken into account :
= Statement 3 : =
SELECT id, name, elev FROM weather_stations
WHERE elev > (
SELECT elev FROM (SELECT elev FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)

# OK, the parentheses are taken into account because there is no confusion with the column id (elev_Tlse_Blagnac has a column named id) :
= Statement 4 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac WHERE id BETWEEN 31000000 and 31999999
) ;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
 50195001 | GATHEMO              |  330
(3 lignes)

# OK (of course) :
= Statement 5 : =
SELECT id, name, elev FROM weather_stations WHERE elev > (
SELECT elev FROM (SELECT * FROM weather_stations WHERE id=31069001) elev_Tlse_Blagnac
)
AND id BETWEEN 31000000 and 31999999
;
    id    |         name         | elev
----------+----------------------+------
 31006001 | ALBIAC AGGLOMERATION |  289
 31010001 | LUCHON-ANTIGNAC      |  599
(2 lignes)


Is that a PostgreSQL bug or not, statement 1 or statement 3 (yes according to me) ?


Regards
----- Météo-France -----
PALAYRET JACQUES
DCSC/GDC
jacques.palayret@meteo.fr
Fixe : +33 561078319

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
"David G. Johnston"
Дата:
On Tuesday, March 22, 2022, PALAYRET Jacques <jacques.palayret@meteo.fr> wrote:

According to me, there is only one condition in the main statement (SELECT w.id, w.name, w.elev FROM weather_stations ...)
and it is : elev > 151 (correlation ou not correlation).
In others words : for each line of table weather_stations), the only condition is : is the elev superior than the elev returned by the subquery, 151 ?

Visibly, the correlated statement adds one condition (w.id BETWEEN 31000000 and 31999999) in the main statement, but it's not logical for me, because of the parentheses.

I agree your example query is written poorly and thus is confusing.  But it is not possible for the system to distinguish a poorly written query from a goodly written one that uses the same functionality.  As the functionality is useful, and the parentheses simply don’t isolate the subquery in the manner you ascribe to them, you’ll just need to adapt to reality.  Its doesn’t have to seem logical to you, but this is how it is defined to work and thus the observed behavior is not a bug.

David J.

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
PALAYRET Jacques
Дата:

Constrained and forced, I am obliged to accept the behavior which is a means of using useful functionalities (standard SQL a priori). It's instructive for me.

I don't know if it's the same behavior (more or less strict SQL standard) in other DBMS.
For example, in Oracle it is not possible to have a SELECT statement without a FROM clause (using DUAL table), so maybe " ambiguous " columns are not handled in the same way.

Anyway, thank you very much for the explanation.
Regards

De: "David G. Johnston" <david.g.johnston@gmail.com>
À: "PALAYRET Jacques" <jacques.palayret@meteo.fr>
Cc: "PostgreSQL mailing lists" <pgsql-general@postgresql.org>
Envoyé: Mardi 22 Mars 2022 14:04:32
Objet: Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

On Tuesday, March 22, 2022, PALAYRET Jacques <jacques.palayret@meteo.fr> wrote:

According to me, there is only one condition in the main statement (SELECT w.id, w.name, w.elev FROM weather_stations ...)
and it is : elev > 151 (correlation ou not correlation).
In others words : for each line of table weather_stations), the only condition is : is the elev superior than the elev returned by the subquery, 151 ?

Visibly, the correlated statement adds one condition (w.id BETWEEN 31000000 and 31999999) in the main statement, but it's not logical for me, because of the parentheses.

I agree your example query is written poorly and thus is confusing.  But it is not possible for the system to distinguish a poorly written query from a goodly written one that uses the same functionality.  As the functionality is useful, and the parentheses simply don’t isolate the subquery in the manner you ascribe to them, you’ll just need to adapt to reality.  Its doesn’t have to seem logical to you, but this is how it is defined to work and thus the observed behavior is not a bug.

David J.

Re: PostgreSQL : bug (ou not) in CTE Common Table Expressions or Subqueries in the FROM Clause

От
David Rowley
Дата:
On Wed, 23 Mar 2022 at 04:03, PALAYRET Jacques
<jacques.palayret@meteo.fr> wrote:
> For example, in Oracle it is not possible to have a SELECT statement without a FROM clause (using DUAL table), so
maybe" ambiguous " columns are not handled in the same way.
 

But if your original complaint, the column was not ambiguous. The CTE
had no column named "id".

I really highly recommend that you write queries giving each relation
a short alias then make a habit to always prefix your column names
with the alias.  If you don't do this then you can have all sorts of
problems when you one day want to start dropping old unused columns
out of your tables.  If you alias these then dropping the columns will
properly cause queries that reference these columns to ERROR.  Without
aliases, your queries might just start doing something you don't want
them to do and you might not realise that for a very long time.

David