Re: FIRST_VALUE: need to group by argument?

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: FIRST_VALUE: need to group by argument?
Дата
Msg-id CAKFQuwbGdu39FtWfD0itQbfDz9zWCMjcy+H+wcu1qBXiFRbzFQ@mail.gmail.com
обсуждение исходный текст
Ответ на FIRST_VALUE: need to group by argument?  (Guyren Howe <guyren@gmail.com>)
Ответы Re: FIRST_VALUE: need to group by argument?  (Manuel Gómez <targen@gmail.com>)
Список pgsql-general
On Mon, May 23, 2016 at 11:48 PM, Guyren Howe <guyren@gmail.com> wrote:
I am missing something here.

I have two tables:

orders
id

delivery_route_segments
id,
order_id,
position,
completed

I want to find the first uncompleted deliver_route_segment for each order, by position. Seems to me I ought to be able to do this:
SELECT
o.id,
FIRST_VALUE(drs.id)
FROM
orders o JOIN
delivery_route_segments drs ON (drs.order_id = o.id AND NOT drs.completed)
GROUP BY
o.id
but I'm told I need an over clause.



So I try this:
SELECT
o.id,
FIRST_VALUE(drs.id) OVER (ORDER BY position ASC)
FROM
orders o JOIN
delivery_route_segments drs ON (drs.order_id = o.id AND NOT drs.completed)
GROUP BY
o.id
here I'm told "drs.id must appear in the GROUP BY clause". This doesn't make sense to me; I shouldn't need to group by a value that's inside an aggregate function.



Tried this.
SELECT
o.id,
FIRST_VALUE(drs.id) OVER (PARTITION BY o.id ORDER BY position ASC)
FROM
orders o JOIN
delivery_route_segments drs ON (drs.order_id = o.id AND NOT drs.completed)
GROUP BY
o.id
but it has the same problem.

I can solve this with a subquery, but:
- I'd still like to know what's wrong; and
- I expect the subquery to be slower (yes?)
​Window functions don't provoke grouping.  That it their blessing.  The docs on them are pretty good.  first_value is strictly a window function (I may have missed this point when you first asked the question a few days ago)​.

SELECT i, sum(i) OVER () FROM ( VALUES (a,1), (a,2), (b,3) ) val (x,i)
yields
1, 6
2, 6
3, 6

You can see how all three rows are still present in the output - thus no grouping.  

SELECT x, sum(i), sum(sum(i)) OVER (PARTITION BY x) FROM ( VALUES (a,1), (a,2), (b,3) ) val (x,i) GROUP BY x
yields
a, 3, 6
b, 3, 6

Here you have to write sum(sum(i)) in the window expression since i is not in the group by and must first be aggregated somehow to satisfy the group by and then the corresponding window evaluation sums those aggregated i's.

The query you are looking for doesn't involve aggregate functions at all.

SELECT DISTINCT ON (order_id) order_id, drs.id
FROM delivery_route_segments drs
WHERE NOT drs.completed
ORDER BY order_id, position ASC

A semantically equivalent query using a window function would be.

WITH ordered_routes AS (
SELECT drs.order_id, drs.id AS drsid, row_number() OVER (PARTITION BY order_id ORDER BY position ASC) AS order_row
FROM delivery_route_segments drs
)
​SELECT order_id, drsid
FROM ordered_routes
WHERE order_row = 1;

As I mentioned in the other thread if you are using group by to simply remove duplicates introduced by a join, like you are here, you are probably doing something wrong.​

David J.

В списке pgsql-general по дате отправления:

Предыдущее
От: "Rader, David"
Дата:
Сообщение: Re: Postgresql-fdw
Следующее
От: Umair Shahid
Дата:
Сообщение: Members in the Middle East?