Problem in 'ORDER BY' of a column using a created collation?

Поиск
Список
Период
Сортировка
От Nishant Sharma
Тема Problem in 'ORDER BY' of a column using a created collation?
Дата
Msg-id CADrsxdbyWg75xV6sBeVCw76UE38JTW=1GJha4GAWP7jB0AK1ow@mail.gmail.com
обсуждение исходный текст
Ответы Re: Problem in 'ORDER BY' of a column using a created collation?
Список pgsql-hackers
Hi,


We encountered a strange behavior in ordering a column using a
created collation. Here are the experiments details:


Experiment 1:-
SQL File : PG_Exp_1.sql

Actual Output : PG_Exp_1.out

Created COLLATION : CREATE COLLATION test_coll (
provider = icu, locale = 'ja-u-kr-latn-digit');

'SELECT' Queries :
SELECT * FROM test_table ORDER BY value1 COLLATE
test_coll, value2 COLLATE test_coll;
SELECT * FROM test_table ORDER BY value2 COLLATE
test_coll, value1 COLLATE test_coll;

Expectation : All alphabets should come before all digits.

Seen Behavior : Column 1 in the 'ORDER BY' i.e value1 of the first
'SELECT' and Column 1 in the 'ORDER BY' i.e value2 of the second
'SELECT' is giving the correct order. But Column 2 in the 'ORDER BY'
i.e value2 in the first 'SELECT' and Column 2 in the 'ORDER BY' i.e
value1 in the second 'SELECT' is NOT giving the correct order.


Experiment 2:-
SQL File : PG_Exp_2.sql

Actual Output : PG_Exp_2.out

Created 'COLLATION' : CREATE COLLATION test_coll (
provider = icu, locale = 'ja-u-kr-digit-latn');

'SELECT' Queries : Same as 'Experiment 1'.

Expectation : All digits should come before all alphabets.

Seen Behavior : Matching with expectation. Column 1 in the
'ORDER BY' i.e value1 of the first 'SELECT' and Column 1 in the
'ORDER BY' i.e value2 of the second 'SELECT' is giving the correct
order. And Column 2 in the 'ORDER BY' i.e value2 in the first
'SELECT' and Column 2 in the 'ORDER BY' i.e value1 in the
second 'SELECT' is giving the correct order.


We did debug 'Experiment 1' and we find that:-
Whatever is the Column 1 in 'ORDER BY' gets correctly ordered,
because it uses abbreviated sort optimization due to which its data
datum gets converted to abbreviated datum using
"varstr_abbrev_convert()" function, and then the comparator
function selected is
"ssup->comparator = ssup_datum_unsigned_cmp()"
for sorting operation. But in case of column 2 in 'ORDER BY' (which
is showing incorrect result for 'Experiment 1') does not use
abbreviated sort optimization and here comparator function selected
is "ssup->comparator = varlenafastcmp_locale" -->
"strncoll_icu_utf8()", which appears, uses the third party ICU
library function for comparison and does not work as expected.


Need help in confirming why 'Experiment 1' is behaving as mentioned
above -
1. If our expectation of 'Experiment 1' is wrong?
2. Bug in abbreviated sort optimization?
3. Bug in third party comparator function "strncoll_icu_utf8()"?
4. Any other aspects which we are missing?
5. Or everything appears good?

PFA, the experiment files.


Thanks & Regards,
Nishant Sharma.
EnterpriseDB, Pune, India.
Вложения

В списке pgsql-hackers по дате отправления: