Re: [PATCHES] encoding names

From Tatsuo Ishii
Subject Re: [PATCHES] encoding names
Date
Msg-id 20010819110257J.t-ishii@sra.co.jp
Responses Re: [PATCHES] encoding names  (Karel Zak <zakkr@zf.jcu.cz>)
encoding: ODBC, createdb  (Karel Zak <zakkr@zf.jcu.cz>)
List pgsql-hackers
>  Hi,
> 
>  attached is patch with:
> 
> - new encoding names stuff with better performance (binary search
>   instead of for() and prevent some needless searching)
> 
> - possible is use synonyms for encoding (an example ISO-8859-1, 
>   Latin1, l1)
> 
> - implemented is Peter's idea about "encoding names clearing" 
>   (other chars than [A-Za-z0-9] are irrelevant -- 'ISO-8859-1' is 
>   same as 'iso8859_1' or iso-8-8-5-9-1 :-)
> 
> - share routines for this between FE and BE (never more define 
>   encoding names separate in FE and BE)
> 
> - add prefix PG_ to encoding identificator macros, something like 'ALT' 
>   is pretty dirty in source code, rather use PG_ALT.
> 
>  (Note: patch add new file mb/encname.c and remove mb/common.c)
> 
>                 Karel
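
To make the quoted description concrete, here is a minimal sketch of how "name clearing", synonyms, and a binary search could fit together. It is not the code from the patch: the names enc_alias, alias_tbl, clean_name, lookup_encoding and the DEMO_* ids are hypothetical, and the table is assumed to hold pre-normalized names kept in sorted order.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding ids, for illustration only. */
enum { DEMO_LATIN1 = 1, DEMO_KOI8 = 2 };

typedef struct
{
    const char *name;       /* normalized key: lowercase, [a-z0-9] only */
    int         encoding;   /* illustrative encoding id */
} enc_alias;

/* Synonyms share one id. Must stay sorted in strcmp() order for bsearch(). */
static const enc_alias alias_tbl[] = {
    {"iso88591", DEMO_LATIN1},
    {"koi8", DEMO_KOI8},
    {"l1", DEMO_LATIN1},
    {"latin1", DEMO_LATIN1},
};

/* Copy only [A-Za-z0-9] from "in" to "out", folding to lowercase. */
static void
clean_name(const char *in, char *out, size_t outlen)
{
    size_t j = 0;

    if (outlen == 0)
        return;
    for (; *in != '\0' && j + 1 < outlen; in++)
    {
        char c = *in;

        if (c >= 'A' && c <= 'Z')
            out[j++] = (char) (c - 'A' + 'a');
        else if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            out[j++] = c;
        /* anything else ('-', '_', spaces, ...) is skipped */
    }
    out[j] = '\0';
}

/* bsearch() comparator: key is a normalized string, elem a table entry. */
static int
cmp_alias(const void *key, const void *elem)
{
    return strcmp((const char *) key, ((const enc_alias *) elem)->name);
}

/* Returns the encoding id for "name", or -1 if the name is unknown. */
static int
lookup_encoding(const char *name)
{
    char             key[64];
    const enc_alias *hit;

    clean_name(name, key, sizeof(key));
    hit = bsearch(key, alias_tbl,
                  sizeof(alias_tbl) / sizeof(alias_tbl[0]),
                  sizeof(alias_tbl[0]), cmp_alias);
    return hit ? hit->encoding : -1;
}
```

With this table, lookup_encoding("ISO-8859-1"), lookup_encoding("iso8859_1") and lookup_encoding("l1") all resolve to DEMO_LATIN1, and an unknown name yields -1.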

Thanks for the patches, but...

1) There is a compiler error if --enable-unicode-conversion is not enabled.

2) The patches break createdb. createdb should raise an error if a client-only encoding such as SJIS is specified.

3) I don't like the following ugliness. Why not change all occurrences of SQL_ASCII in the sources?

  /*
   * A lot of PG stuff use 'SQL_ASCII' without prefix (dirty...)
   */
  #define SQL_ASCII    PG_SQL_ASCII

4) The "official" encoding names are inconsistent. Here are my suggested changes (referring to http://www.iana.org/assignments/character-sets, according to Peter's suggestion):

   ALT -> IBM866
   KOI8 -> KOI8_R
   UNICODE -> UTF_8 (Peter's suggestion)

   Also, I'm wondering why windows-1251, not windows_1251? Or ISO_8859_1, not ISO-8859-1? There seems to be some confusion about the usage of "_" and "-".

pg_enc2name pg_enc2name_tbl[] =
{
    { "SQL_ASCII",     PG_SQL_ASCII },
    { "EUC_JP",        PG_EUC_JP },
    { "EUC_CN",        PG_EUC_CN },
    { "EUC_KR",        PG_EUC_KR },
    { "EUC_TW",        PG_EUC_TW },
    { "UNICODE",       PG_UNICODE },
    { "MULE_INTERNAL", PG_MULE_INTERNAL },
    { "ISO_8859_1",    PG_LATIN1 },
    { "ISO_8859_2",    PG_LATIN2 },
    { "ISO_8859_3",    PG_LATIN3 },
    { "ISO_8859_4",    PG_LATIN4 },
    { "ISO_8859_5",    PG_LATIN5 },
    { "KOI8",          PG_KOI8 },
    { "window-1251",   PG_WIN1251 },
    { "ALT",           PG_ALT },
    { "Shift_JIS",     PG_SJIS },
    { "Big5",          PG_BIG5 },
    { "window-1250",   PG_WIN1251 }
};
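
For point 2 above, the kind of check createdb needs might look roughly like the sketch below. Everything in it is an assumption made for illustration: the demo_* names, the encoding ids, and the error text are hypothetical and are not PostgreSQL's actual code or messages. Only SJIS is marked client-only here because it is the one example named in this message.

```c
#include <stdio.h>

/* Hypothetical encoding ids for illustration; not PostgreSQL's real values. */
typedef enum
{
    DEMO_SQL_ASCII,
    DEMO_EUC_JP,
    DEMO_SJIS
} demo_encoding;

/* Returns 1 if the encoding may be used as a database (server) encoding. */
static int
demo_valid_server_encoding(demo_encoding enc)
{
    switch (enc)
    {
        case DEMO_SJIS:         /* client-only encoding */
            return 0;
        default:
            return 1;
    }
}

/*
 * Returns 0 if the encoding is acceptable for a new database,
 * or -1 after reporting an error (illustrative message text).
 */
static int
demo_check_createdb_encoding(demo_encoding enc, const char *name)
{
    if (!demo_valid_server_encoding(enc))
    {
        fprintf(stderr, "createdb: \"%s\" is not a valid server encoding\n",
                name);
        return -1;
    }
    return 0;
}
```

The point of keeping the server/client distinction in one predicate is that the frontend and backend can share it, in the same spirit as sharing the name routines.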


