Comparison semantics of CHAR data type

Поиск
Список
Период
Сортировка
От Thomas Fanghaenel
Тема Comparison semantics of CHAR data type
Дата
Msg-id CAK+WP1xdmyswEehMuetNztM4H199Z1w9KWRHVMKzyyFM+hV=zA@mail.gmail.com
обсуждение исходный текст
Список pgsql-sql
I was wondering about the proper semantics of CHAR comparisons in some corner cases that involve control characters with values that are less than 0x20 (space).

Consider the following testcase:

===
create table t (a int, b char(10));

insert into t values (1, 'foo');
insert into t values (2, 'foo ');
insert into t values (3, E'foo\t');
insert into t values (4, E'foo\n');
insert into t values (5, E'foo \n');
insert into t values (6, 'foobar');

select * from t order by b;
===

What's the proper order of these string values in the CHAR domain?  The way I interpret the SQL Standard (and assuming that \t and \n collate lower than a space), it's supposed to be this:

  (3) < (4) < (5) < (1) <= (2) < (6)

Postgres comes up with this:

  (1) <= (2) < (3) < (4) < (5) < (6)

The reason is that the bpchar functions that implement the relative comparison operators for CHAR(n) effectively strip trailing whitespaces before doing the comparison.  One might argue that doing this is not correct.  The standard seems to mandate that all CHAR(n) values are actually considered to be of width n, and that trailing spaces are indeed relevant for comparison.  In other words, stripping them would only be possible if it can be guaranteed that there are no characters in the character set that collate lower than a space.

Any thoughts on this?  I searched the mailing list archives, but couldn't find any relevant discussion.  There were plenty of threads that argue whether or not it's semantically correct to strip trailing spaces from CHAR(n) values, but the issue of characters collating below a space does not seem to have brought up in any of those discussions before.

Cheers,

-- Thomas

В списке pgsql-sql по дате отправления:

Предыдущее
От: James Sharrett
Дата:
Сообщение: the value of OLD on an initial row insert
Следующее
От: Luca Ferrari
Дата:
Сообщение: Re: the value of OLD on an initial row insert