Re: [SQL] Comparison semantics of CHAR data type

Поиск
Список
Период
Сортировка
От Kevin Grittner
Тема Re: [SQL] Comparison semantics of CHAR data type
Дата
Msg-id 1381525130.59803.YahooMailNeo@web162902.mail.bf1.yahoo.com
обсуждение исходный текст
Ответ на Re: [SQL] Comparison semantics of CHAR data type  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: [SQL] Comparison semantics of CHAR data type  (Thomas Fanghaenel <tfanghaenel@salesforce.com>)
Список pgsql-hackers
Bruce Momjian <bruce@momjian.us> wrote:
> Thomas Fanghaenel wrote:

>> I was wondering about the proper semantics of CHAR comparisons in some corner
>> cases that involve control characters with values that are less than 0x20
>> (space).

What matters in general isn't where the characters fall when
comparing individual bytes, but how the strings containing them
sort according to the applicable collation.  That said, my
recollection of the spec is that when two CHAR(n) values are
compared, the shorter should be blank-padded before making the
comparison.  *That* said, I think the general advice is to stay
away from CHAR(n) in favor or VARCHAR(n) or TEXT, and I think that
is good advice.

> I am sorry for this long email, but I would be interested to see what
> other hackers think about this issue.

Since we only have the CHAR(n) type to improve compliance with the
SQL specification, and we don't generally encourage its use, I
think we should fix any non-compliant behavior.  That seems to mean
that if you take two CHAR values and compare them, it should give
the same result as comparing the same two values as VARCHAR using
the same collation with the shorter value padded with spaces.

So this is correct:

test=# select 'ab'::char(3) collate "en_US" < E'ab\n'::char(3) collate "en_US";
 ?column?
----------
 t
(1 row)

... because it matches:

test=# select 'ab '::varchar(3) collate "en_US" < E'ab\n'::varchar(3) collate "en_US";
 ?column?
----------
 t
(1 row)

But this is incorrect:

test=# select 'ab'::char(3) collate "C" < E'ab\n'::char(3) collate "C";
 ?column?
----------
 t
(1 row)

... because it doesn't match:

test=# select 'ab '::varchar(3) collate "C" < E'ab\n'::varchar(3) collate "C";
 ?column?
----------
 f
(1 row)

Of course, I have no skin in the game, because it took me about two
weeks after my first time converting a database with CHAR columns
to PostgreSQL to change them all to VARCHAR, and do that as part of
all future conversions.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: Auto-tuning work_mem and maintenance_work_mem
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [COMMITTERS] pgsql: Replace duplicate_oids with Perl implementation