Re: Character sets (Re: Re: Big 7.1 open items)

From	Tatsuo Ishii
Subject	Re: Character sets (Re: Re: Big 7.1 open items)
Date	June 21, 2000, 02:15:21
Msg-id	20000621151917D.t-ishii@sra.co.jp
In reply to	Character sets (Re: Re: Big 7.1 open items) (Peter Eisentraut <peter_e@gmx.net>)
List	pgsql-hackers

> But how are you going to tell a genuine "type" from a character set? And
> you might have to have three types for each charset. There'd be a lot of
> redundancy and confusion regarding the input and output functions and
> other pg_type attributes. No doubt there's something to be learned from
> the type system, but character sets have different properties -- like
> characters(!), collation rules, encoding "translations" and what not.
> There is no doubt also a need for different error handling. So I think that
> just dumping every character set into pg_type is not a good idea. That's
> almost equivalent to having separate types for char(6), char(7), etc.
> 
> Instead, I'd suggest that character sets become separate objects. A
> character entity would carry around its character set in its header
> somehow. Consider a string concatenation function, being invoked with two
> arguments of the same exotic character set. Using the type system only
> you'd have to either provide a function signature for all combinations of
> character sets or you'd have to cast them up to SQL_TEXT, concatenate
> them and cast them back to the original charset. A smarter concatenation
> function instead might notice that both arguments are of the same
> character set and simply paste them together right there.

Interesting idea. But what about collations? SQL allows assigning a
collation different from the default one to a character set on the
fly. Should we make collations separate objects as well?
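
To make the quoted concatenation idea concrete, here is a minimal sketch
in Python. It is only an illustration under stated assumptions, not
backend code: TaggedText and concat are hypothetical names, Python codec
names stand in for character set identifiers, and Python's Unicode str
stands in for SQL_TEXT as the common repertoire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical character value that carries its character set in a header."""
    charset: str  # assumption: a Python codec name such as "euc_jp"
    data: bytes   # the text, encoded in that character set


def concat(a: TaggedText, b: TaggedText) -> TaggedText:
    """Concatenate two values, pasting bytes directly when the charsets match."""
    if a.charset == b.charset:
        # Fast path: same character set, so no conversion is needed.
        return TaggedText(a.charset, a.data + b.data)
    # Slow path: cast both up to a common repertoire (Unicode str here,
    # standing in for SQL_TEXT), concatenate, and cast back to a's charset.
    # Raises UnicodeEncodeError if b has characters a's charset cannot hold.
    text = a.data.decode(a.charset) + b.data.decode(b.charset)
    return TaggedText(a.charset, text.encode(a.charset))


if __name__ == "__main__":
    left = TaggedText("euc_jp", "日本".encode("euc_jp"))
    right = TaggedText("euc_jp", "語".encode("euc_jp"))
    joined = concat(left, right)
    print(joined.charset, joined.data.decode(joined.charset))
```

The sketch compares charset names as plain strings, so two aliases of the
same encoding would take the slow path. A collation is deliberately not
part of the header here, which is exactly the open question above.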

> Here are a couple of "items" I keep wondering about:
> 
> * To what extent would we be able to use the operating system's locale
> facilities? Besides the fact that some systems are deficient or broken one
> way or another, POSIX really doesn't provide much besides "given two
> strings, which one is greater", and then only on a per-process basis.
> We'd really need more than that, see also LIKE indexing issues, and indexing in
> general.

Correct. I'd suggest getting rid of the OS's locale completely.
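
As a rough picture of what comparison without the OS locale could look
like, the sketch below treats a collation as an ordinary object that is
passed to each operation, instead of per-process state. Collation,
CODEPOINT, CASE_INSENSITIVE and sort_with are hypothetical names invented
for this illustration, and the two orderings are toy rules, not real
linguistic collations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Collation:
    """Hypothetical collation object: a name plus a sort-key function."""
    name: str
    key: Callable[[str], object]


# Two toy collations; neither consults the process-wide OS locale.
CODEPOINT = Collation("codepoint", lambda s: s)
CASE_INSENSITIVE = Collation("case_insensitive", lambda s: s.casefold())


def sort_with(values: List[str], collation: Collation) -> List[str]:
    """Sort using the collation chosen for this one operation."""
    return sorted(values, key=collation.key)


if __name__ == "__main__":
    words = ["b", "A", "a", "B"]
    print(sort_with(words, CODEPOINT))
    print(sort_with(words, CASE_INSENSITIVE))
```

Because the collation is an argument rather than global state, two
queries in the same process can order the same strings differently, which
a single per-process locale setting does not allow.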

> * Client support: A lot of language environments provide pretty smooth
> Unicode support these days, e.g., Java, Perl 5.6, and I think that C99 has
> also made some strides. So while "we can store stuff in any character set
> you want" is great, it's really no good if it doesn't work transparently
> with the client interfaces. At least something to keep in mind.

Do you suggest that we should convert everything into Unicode and store
it in the DB?
--
Tatsuo Ishii
