[Question] Window Function Results without ORDER BY Clause
От | Zhang Mingli |
---|---|
Тема | [Question] Window Function Results without ORDER BY Clause |
Дата | |
Msg-id | fbb5c0d7-4a96-4dd1-9a26-5dfccfac667a@Spark обсуждение исходный текст |
Ответы |
Re: [Question] Window Function Results without ORDER BY Clause
|
Список | pgsql-hackers |
Hi,
I am reaching out to discuss the behavior of window functions in Postgres, specifically regarding the use of the
In our recent tests, we observed that the results can be unstable.
For example, when executing the following query:
SELECT sum(unique1) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING),
unique1, four
FROM tenk1
WHERE unique1 < 10;
The case is in window.sql of regression.
explain(costs off) SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
QUERY PLAN
--------------------------------------------------------------------
WindowAgg
Window: w1 AS (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
-> Index Scan using tenk1_unique1 on tenk1
Index Cond: (unique1 < 10)
(4 rows)
regression=# SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
sum | unique1 | four
-----+---------+------
45 | 0 | 0
45 | 1 | 1
44 | 2 | 2
42 | 3 | 3
39 | 4 | 0
35 | 5 | 1
30 | 6 | 2
24 | 7 | 3
17 | 8 | 0
9 | 9 | 1
(10 rows)
However, after setting
regression=# set enable_indexscan = off;
SET
regression=# SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
sum | unique1 | four
-----+---------+------
45 | 4 | 0
41 | 2 | 2
39 | 1 | 1
38 | 6 | 2
32 | 9 | 1
23 | 8 | 0
15 | 5 | 1
10 | 3 | 3
7 | 7 | 3
0 | 0 | 0
(10 rows)
regression=# explain(costs off) SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
QUERY PLAN
--------------------------------------------------------------------
WindowAgg
Window: w1 AS (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
-> Seq Scan on tenk1
Filter: (unique1 < 10)
(4 rows)
Referring to the SQL 2011 standard, it states that if
While using a window function without
So, are both result sets technically correct given the absence of an
--
Zhang Mingli
HashData
I am reaching out to discuss the behavior of window functions in Postgres, specifically regarding the use of the
OVER()
clause without an ORDER BY
specification.In our recent tests, we observed that the results can be unstable.
For example, when executing the following query:
SELECT sum(unique1) OVER (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING),
unique1, four
FROM tenk1
WHERE unique1 < 10;
The case is in window.sql of regression.
explain(costs off) SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
QUERY PLAN
--------------------------------------------------------------------
WindowAgg
Window: w1 AS (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
-> Index Scan using tenk1_unique1 on tenk1
Index Cond: (unique1 < 10)
(4 rows)
regression=# SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
sum | unique1 | four
-----+---------+------
45 | 0 | 0
45 | 1 | 1
44 | 2 | 2
42 | 3 | 3
39 | 4 | 0
35 | 5 | 1
30 | 6 | 2
24 | 7 | 3
17 | 8 | 0
9 | 9 | 1
(10 rows)
However, after setting
enable_indexscan = off
, the results changed:regression=# set enable_indexscan = off;
SET
regression=# SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
sum | unique1 | four
-----+---------+------
45 | 4 | 0
41 | 2 | 2
39 | 1 | 1
38 | 6 | 2
32 | 9 | 1
23 | 8 | 0
15 | 5 | 1
10 | 3 | 3
7 | 7 | 3
0 | 0 | 0
(10 rows)
regression=# explain(costs off) SELECT sum(unique1) over (rows between current row and unbounded following),
unique1, four
FROM tenk1 WHERE unique1 < 10;
QUERY PLAN
--------------------------------------------------------------------
WindowAgg
Window: w1 AS (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
-> Seq Scan on tenk1
Filter: (unique1 < 10)
(4 rows)
Referring to the SQL 2011 standard, it states that if
ORDER BY
is omitted, the order of rows in the partition is undefined. While using a window function without
ORDER BY
is valid, the resulting output seems unpredictable.So, are both result sets technically correct given the absence of an
ORDER BY
clause?--
Zhang Mingli
HashData
В списке pgsql-hackers по дате отправления: