Обсуждение: Sorting Discrepancy in PostgreSQL 14.13
==============================================
POSTGRESQL BUG REPORT TEMPLATE
==============================================
Your name : minji-kim
Your email address : hzuiw33@gmail.com
# System Configuration:
---------------------
Architecture (example: Intel Pentium) : Intel(R) Core(TM) Ultra 7 155H
Operating System (example: Linux 2.4.18) : VMware Workstation Pro (Ubuntu-22.04)
PostgreSQL version (example: PostgreSQL 9.6.6) : PostgreSQL 14.13
Compiler used (example: gcc 3.3.5) : X (sudo apt install postgresql postgresql-contrib)
# Please enter a FULL description of your problem:
------------------------------------------------
Sorting Discrepancy in PostgreSQL 14.13
When running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
The result is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
15 | 10
20 | 10
-25 | 10
-5 | 5
(5 rows)
```
However, in other DBMS (SQLite, MySQL, OracleDB) and PostgreSQL version 17.0, the output is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
-25 | 10
-5 | 5
15 | 5
20 | 5
(5 rows)
```
This discrepency is due to different string sorting orders.
The minimized PoC is:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0 FROM t0 ORDER BY c0;
```
In PostgreSQL 14.13, the order is incorrect:
```
-10
15
20
-25
-5
```
While the correct order should be:
```
-10
-25
-5
10
20
```
as '-' is smaller than '1', or '2' in ascii.
I'm doubtful this is a collation issue, as most collations basically respect ASCII order.
Even if this issue appears to be related to collation, no warnings are provided when migrating from this version.
# Please describe a way to repeat the problem.
# Please try to provide a concise reproducible example, if at all possible:
----------------------------------------------------------------------
Running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
Results in the incorrect sorting behavior.
If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
Using the correct sort criteria will solve this problem.
POSTGRESQL BUG REPORT TEMPLATE
==============================================
Your name : minji-kim
Your email address : hzuiw33@gmail.com
# System Configuration:
---------------------
Architecture (example: Intel Pentium) : Intel(R) Core(TM) Ultra 7 155H
Operating System (example: Linux 2.4.18) : VMware Workstation Pro (Ubuntu-22.04)
PostgreSQL version (example: PostgreSQL 9.6.6) : PostgreSQL 14.13
Compiler used (example: gcc 3.3.5) : X (sudo apt install postgresql postgresql-contrib)
# Please enter a FULL description of your problem:
------------------------------------------------
Sorting Discrepancy in PostgreSQL 14.13
When running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
The result is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
15 | 10
20 | 10
-25 | 10
-5 | 5
(5 rows)
```
However, in other DBMS (SQLite, MySQL, OracleDB) and PostgreSQL version 17.0, the output is:
```
c0 | min_function_cast
-----+-------------------
-10 | 10
-25 | 10
-5 | 5
15 | 5
20 | 5
(5 rows)
```
This discrepency is due to different string sorting orders.
The minimized PoC is:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0 FROM t0 ORDER BY c0;
```
In PostgreSQL 14.13, the order is incorrect:
```
-10
15
20
-25
-5
```
While the correct order should be:
```
-10
-25
-5
10
20
```
as '-' is smaller than '1', or '2' in ascii.
I'm doubtful this is a collation issue, as most collations basically respect ASCII order.
Even if this issue appears to be related to collation, no warnings are provided when migrating from this version.
# Please describe a way to repeat the problem.
# Please try to provide a concise reproducible example, if at all possible:
----------------------------------------------------------------------
Running the following command in PostgreSQL 14.13:
```sql
CREATE TABLE t0 (c0 TEXT);
INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
SELECT c0, MIN(ABS(CAST(c0 AS BIGINT))) OVER (ORDER BY c0 NULLS FIRST) AS min_function_cast FROM t0;
DROP TABLE IF EXISTS t0;
```
Results in the incorrect sorting behavior.
If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
Using the correct sort criteria will solve this problem.
Hi,
On 11/14/24 13:49, [3반]김민지_4904 wrote:
>
> ...
>
> The minimized PoC is:
>
> ```sql
> CREATE TABLE t0 (c0 TEXT);
> INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
> SELECT c0 FROM t0 ORDER BY c0;
> ```
>
> In PostgreSQL 14.13, the order is incorrect:
> ```
> -10
> 15
> 20
> -25
> -5
> ```
>
> While the correct order should be:
>
> ```
> -10
> -25
> -5
> 10
> 20
> ```
>
> as '-' is smaller than '1', or '2' in ascii.
>
> I'm doubtful this is a collation issue, as most collations basically
> respect ASCII order.
>
This is 99.999% due to the collation, so which collations are being used
on these systems? Also, I don't get this "incorrect" behavior on 14.13,
it behaves the same as 17 for me, producing the expected result.
regards
--
Tomas Vondra
Tomas Vondra <tomas@vondra.me> writes:
> On 11/14/24 13:49, [3반]김민지_4904 wrote:
>> I'm doubtful this is a collation issue, as most collations basically
>> respect ASCII order.
> This is 99.999% due to the collation, so which collations are being used
> on these systems? Also, I don't get this "incorrect" behavior on 14.13,
> it behaves the same as 17 for me, producing the expected result.
It surely is a collation issue. Using a glibc-based system, I get
u8=# CREATE TABLE t0 (c0 TEXT);
CREATE TABLE
u8=# INSERT INTO t0 (c0) VALUES ('-10'), ('20'), ('-5'), ('15'), ('-25');
INSERT 0 5
u8=# select * from t0 order by c0 collate "C";
c0
-----
-10
-25
-5
15
20
(5 rows)
u8=# select * from t0 order by c0 collate "en_US";
c0
-----
-10
15
20
-25
-5
(5 rows)
(In point of fact, most glibc collations do NOT "respect ASCII order".
They tend to ignore punctuation until it's needed as a tiebreaker.)
So this is surely down to the PG 14.13 installation having a different
default collation than whatever it's compared to, which most likely
is caused by having run initdb with a different locale environment.
regards, tom lane
On 14.11.24 13:49, [3반]김민지_4904 wrote: > as '-' is smaller than '1', or '2' in ascii. > > I'm doubtful this is a collation issue, as most collations basically > respect ASCII order. See also here for a possible explanation: https://peter.eisentraut.org/blog/2023/04/12/how-collation-of-punctuation-and-whitespace-works