Re: encoding names v2.

From Peter Eisentraut
Subject Re: encoding names v2.
Msg-id Pine.LNX.4.30.0108222124120.679-100000@peter.localdomain
In response to encoding names v2.  (Karel Zak <zakkr@zf.jcu.cz>)
Responses Re: encoding names v2.
Re: encoding names v2.
List pgsql-patches
Okay, here is some bad news:  I just looked into the SQL99 standard for
the predefined character set names, and here is the list:

SQL_CHARACTER
GRAPHIC_IRV or ASCII_GRAPHIC
LATIN1                <==== !!!
ISO8BIT or ASCII_FULL
UTF16
UTF8
UCS2
SQL_TEXT
SQL_IDENTIFIER

So perhaps we should keep the LATIN1 thing after all?  I don't like it,
but the rules...

Comments?


Karel Zak writes:

>  - getdatabaseencoding() is compatible with old versions, but
>    in the code is commented as deprecated.
>
>  - getdbencoding() is new function that return correct encoding names

See my other message about this.  I don't think this is a good choice of
names.

>  - all encoding names use '-'. I hope we will never see a problem with
>    it and some operator. Encoding names must be used as quoted string.

For SQL compliance we will need to access charset names as identifiers in
the future.  So the name normalization should take effect wherever a
charset name is expected.  I suppose this is what you did.
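
To make the normalization point concrete, here is a minimal sketch of
what such a comparison could look like.  It assumes that "normalization"
means folding ASCII letters to lower case and dropping everything that is
not a letter or digit, so that '-' and '_' spellings compare equal.  The
function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): copy `name` into `out`,
 * keeping only ASCII letters and digits and folding letters to lower
 * case.  Output is truncated to fit and always NUL-terminated.
 * Returns the length of the normalized name.
 */
static size_t
normalize_charset_name(const char *name, char *out, size_t out_size)
{
    size_t n = 0;

    assert(out_size > 0);
    for (; *name != '\0' && n + 1 < out_size; name++)
    {
        char c = *name;

        if (c >= 'A' && c <= 'Z')
            c = (char) (c - 'A' + 'a');
        if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            out[n++] = c;
    }
    out[n] = '\0';
    return n;
}

/* Two spellings match if their normalized forms are identical. */
static int
charset_names_match(const char *a, const char *b)
{
    char na[64];
    char nb[64];

    normalize_charset_name(a, na, sizeof(na));
    normalize_charset_name(b, nb, sizeof(nb));
    return strcmp(na, nb) == 0;
}
```

With a rule like this, charset_names_match("ISO-8859-1", "iso_8859_1")
is true whether the name arrives as a quoted string or as an identifier,
while "KOI8-R" and "KOI8-U" stay distinct.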

>    Only for SQL_ASCII is used '_', because I see that JDBC has hardcoded
>    "pg_encoding_to_char(1) = 'SQL_ASCII'" :-(((

This is okay, look at the list above for precedent.

>  - the ./configure.in:
>      * use new encoding names too for --enable-multibyte
>      * define MULTIBYTE that handle default encoding id

Where is this needed?

>      * define MULTIBYTE_NAME that handle default encoding name (needful
>        for initdb)

Can you rename this to something like DEFAULT_CHARACTER_SET?  There is
really nothing "multibyte" here.

>  - 'initdb' check if default template encoding is correct for backend DB.
>
>     In the old code it's in initdb very hardcoded. I add to pg_encoding
>     option '-b' that check if encoding is correct for backend DB (means
>     encoding is not client only). It's better than
>     if [ $MULTIBYTEID -gt 31 ]
>                           ^^^^^^
>     in scripts.

Good.
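
For illustration, the check could be expressed once in C instead of as
a magic number in the scripts.  This is only a sketch: the enum values
below are made up and do not match the real encoding ids, and the names
are hypothetical.  The only assumption is that backend-capable encodings
are numbered before the client-only ones (such as SJIS and BIG5).

```c
/*
 * Illustrative encoding ids (NOT the real numbering).  Everything up to
 * and including ENC_BE_LAST may be used as a database encoding; ids
 * after it are client-only.
 */
typedef enum
{
    ENC_SQL_ASCII = 0,
    ENC_EUC_JP,
    ENC_UTF8,
    ENC_LATIN1,
    ENC_BE_LAST = ENC_LATIN1,   /* last backend-capable encoding */
    ENC_SJIS,                   /* client-only */
    ENC_BIG5,                   /* client-only */
    ENC_COUNT                   /* number of known encodings */
} encoding_id;

/* Hypothetical check: returns 1 if `id` is a valid backend encoding. */
static int
encoding_is_backend_capable(int id)
{
    return id >= 0 && id <= ENC_BE_LAST;
}
```

A tool option like the proposed '-b' could then simply report the result
of this test, and the scripts would not need to know where the boundary
lies.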

> src/utils/mb/Unicode/KOI8_to_utf8.map  --> src/utils/mb/Unicode/KOI8R_to_utf8.map
> src/utils/mb/Unicode/WIN_to_utf8.map  --> src/utils/mb/Unicode/WIN1251_to_utf8.map
> src/utils/mb/Unicode/utf8_to_KOI8.map --> src/utils/mb/Unicode/utf8_to_KOI8R.map
> src/utils/mb/Unicode/utf8_to_WIN.map --> src/utils/mb/Unicode/utf8_to_WIN1251.map

Can you introduce some uniform capitalization (e.g., all lower case)?

>  Thanks for all suggestion.
>
>  New comments?

Don't worry, we'll get there. ;-)

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

