Обсуждение: Strange result with "SELECT ... ORDER BY random() LIMIT 1" and JOINs

Поиск
Список
Период
Сортировка

Strange result with "SELECT ... ORDER BY random() LIMIT 1" and JOINs

От
Etienne Dube
Дата:
Hello,

Consider the following tables and data:

----------------------------------------
CREATE TABLE color (
     color_id integer PRIMARY KEY,
     color_name text
);

INSERT INTO color (color_id, color_name)
VALUES
     (1, 'red'),
     (2, 'blue'),
     (3, 'green'),
     (4, 'yellow'),
     (5, 'grey'),
     (6, 'brown'),
     (7, 'black'),
     (8, 'white'),
     (9, 'white with wooden panels')
;

CREATE TABLE car (
     car_id integer PRIMARY KEY,
     car_name text
);

INSERT INTO car (car_id, car_name)
VALUES
     (1, 'Toyota Matrix'),
     (2, 'Mazda 3'),
     (3, 'Honda Fit'),
     (4, 'Ford F-150'),
     (5, 'Chevrolet Volt'),
     (6, 'Audi A4'),
     (7, 'Hyundai Elantra'),
     (8, 'Nissan Versa'),
     (9, 'Buick Estate Wagon')
;
----------------------------------------

This query yields unexpected results (tested under 9.2.4):

SELECT
     s.car_id,
     s.color_id AS subquery_color_id,
     co.color_id AS join_color_id,
     co.color_name
FROM
     (
         SELECT
             ca.car_id,
             (
                 SELECT color_id
                 FROM color
                 WHERE ca.car_id = ca.car_id  -- dependency added to
avoid getting the same value for every row in the output
                 ORDER BY random()
                 LIMIT 1
             ) AS color_id
         FROM
             car ca
     ) s
     LEFT JOIN color co ON co.color_id = s.color_id;

We can see the equality defined in the LEFT JOIN does not hold true for
the subquery_color_id and join_color_id column aliases in the output.
EXPLAIN also shows that the subplan for the inner subquery used to pick
a random row from the color table appears twice.

I don't really understand what is going on there, the result appears
incorrect to me. The following page seems to offer some explanations as
to what is happening:
http://sql-info.de/postgresql/postgres-gotchas.html#1_6 . Indeed, adding
OFFSET 0 to the "s" subquery solves the issue.

Can somebody shed some light on this topic? Is this behaviour correct or
should it be considered a bug?

Thanks!
Etienne



Re: Strange result with "SELECT ... ORDER BY random() LIMIT 1" and JOINs

От
Adrian Klaver
Дата:
On 08/15/2013 11:46 AM, Etienne Dube wrote:
> Hello,

>
> I don't really understand what is going on there, the result appears
> incorrect to me. The following page seems to offer some explanations as
> to what is happening:
> http://sql-info.de/postgresql/postgres-gotchas.html#1_6 . Indeed, adding
> OFFSET 0 to the "s" subquery solves the issue.
>
> Can somebody shed some light on this topic? Is this behaviour correct or
> should it be considered a bug?

See here for explanation:

http://www.postgresql.org/message-id/8569.1128439517@sss.pgh.pa.us

>
> Thanks!
> Etienne
>
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: Strange result with "SELECT ... ORDER BY random() LIMIT 1" and JOINs

От
Tom Lane
Дата:
Etienne Dube <etdube@gmail.com> writes:
> This query yields unexpected results (tested under 9.2.4):

> SELECT
>      s.car_id,
>      s.color_id AS subquery_color_id,
>      co.color_id AS join_color_id,
>      co.color_name
> FROM
>      (
>          SELECT
>              ca.car_id,
>              (
>                  SELECT color_id
>                  FROM color
>                  WHERE ca.car_id = ca.car_id  -- dependency added to
> avoid getting the same value for every row in the output
>                  ORDER BY random()
>                  LIMIT 1
>              ) AS color_id
>          FROM
>              car ca
>      ) s
>      LEFT JOIN color co ON co.color_id = s.color_id;

> We can see the equality defined in the LEFT JOIN does not hold true for
> the subquery_color_id and join_color_id column aliases in the output.
> EXPLAIN also shows that the subplan for the inner subquery used to pick
> a random row from the color table appears twice.

I've committed patches to prevent duplication of subplans containing
volatile functions.  Thanks for the test case!

            regards, tom lane