Обсуждение: SELECT DISTINCT ON removes results

Поиск
Список
Период
Сортировка

SELECT DISTINCT ON removes results

От
Guyren Howe
Дата:
Using 9.5, this query:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
gives me null for awb. Removing the DISTINCT ON clause:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
where o.id = 2792;
gives me an awb. I'm confused about how this can be.

Re: SELECT DISTINCT ON removes results

От
"David G. Johnston"
Дата:
On Fri, Oct 28, 2016 at 1:39 PM, Guyren Howe <guyren@gmail.com> wrote:
Using 9.5, this query:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
gives me null for awb. Removing the DISTINCT ON clause:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
where o.id = 2792;
gives me an awb. I'm confused about how this can be.


​On its face the statement "DISTINCT ON removes results" is not at all surprising given its definition.

Given a self-contained query exhibiting the desired behavior I might be willing to figure out and explain exactly why its happening in that particular circumstance.

Oh, and on its face your DISTINCT ON query doesn't make any sense to me.  Using DISTINCT ON on one column but then joining on the discard-able ID column is...unusual.

And it also lacks an ORDER BY for deterministic discarding of duplicate rows.

David J.

Re: SELECT DISTINCT ON removes results

От
Guyren Howe
Дата:
On Oct 28, 2016, at 13:50 , David G. Johnston <david.g.johnston@gmail.com> wrote:

​On its face the statement "DISTINCT ON removes results" is not at all surprising given its definition.

What is surprising is that it removes *all* results…


Re: SELECT DISTINCT ON removes results

От
Guyren Howe
Дата:
On Oct 28, 2016, at 14:15 , Guyren Howe <guyren@gmail.com> wrote:

On Oct 28, 2016, at 13:50 , David G. Johnston <david.g.johnston@gmail.com> wrote:

​On its face the statement "DISTINCT ON removes results" is not at all surprising given its definition.

What is surprising is that it removes *all* results…

The inner query on its own appears to produce the same results, but the left join with it in either case produces different results.

Re: SELECT DISTINCT ON removes results

От
Guyren Howe
Дата:

On Oct 28, 2016, at 13:50 , David G. Johnston <david.g.johnston@gmail.com> wrote:

On Fri, Oct 28, 2016 at 1:39 PM, Guyren Howe <guyren@gmail.com> wrote:
Using 9.5, this query:

Unless I'm missing something, this ought to be impossible.

Two queries differing only in having a DISTINCT clause produce the same result, demonstrated by EXCEPT:
=> SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
-> string_agg(air_way_bills.number::text, ','::text) AS number,
-> air_way_bills.order_id
-> FROM pt.air_way_bills
-> where air_way_bills.order_id = 2792
-> GROUP BY air_way_bills.order_id
-> except
-> select string_agg(air_way_bills.number::text, ','::text) AS number,
-> air_way_bills.order_id
-> FROM pt.air_way_bills
-> where air_way_bills.order_id = 2792
-> GROUP BY air_way_bills.order_id;
number | order_id
--------+----------
(0 rows)

but joining with them produces different results:
=> SELECT o.id,
-> a.number AS awb
-> FROM pt.orders o
-> LEFT JOIN (
(> SELECT
(> string_agg(air_way_bills.number::text, ','::text) AS number,
(> air_way_bills.order_id
(> FROM pt.air_way_bills
(> GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
-> where o.id = 2792;
id | awb
------+----------
2792 | 91484540
(1 row)

=> SELECT o.id,
-> a.number AS awb
-> FROM pt.orders o
-> LEFT JOIN (
(> SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
(> string_agg(air_way_bills.number::text, ','::text) AS number,
(> air_way_bills.order_id
(> FROM pt.air_way_bills
(> GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
-> where o.id = 2792
  id  | awb 
------+-----
 2792 | 
(1 row)

Re: SELECT DISTINCT ON removes results

От
"David G. Johnston"
Дата:


On Fri, Oct 28, 2016 at 2:17 PM, Guyren Howe <guyren@gmail.com> wrote:
On Oct 28, 2016, at 14:15 , Guyren Howe <guyren@gmail.com> wrote:

On Oct 28, 2016, at 13:50 , David G. Johnston <david.g.johnston@gmail.com> wrote:

​On its face the statement "DISTINCT ON removes results" is not at all surprising given its definition.

What is surprising is that it removes *all* results…

The inner query on its own appears to produce the same results, but the left join with it in either case produces different results.

​Then maybe you can help us and create reproducible example that demonstrates the bug so that it can be fixed.

David J.
 

Re: SELECT DISTINCT ON removes results

От
Geoff Winkless
Дата:
On 28 October 2016 at 21:39, Guyren Howe <guyren@gmail.com> wrote:
Using 9.5, this query:

SELECT o.id,
a.number AS awb
FROM pt.orders o
LEFT JOIN (
SELECT DISTINCT ON ((string_agg(air_way_bills.number::text, ','::text)))
string_agg(air_way_bills.number::text, ','::text) AS number,
air_way_bills.order_id
FROM pt.air_way_bills
GROUP BY air_way_bills.order_id) a ON a.order_id = o.id
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
 

​Does the problem go away if you do that?​

​Geoff​