Re: multibyte-character aware support for function "downcase_truncate_identifier()"

From Andrew Dunstan
Subject Re: multibyte-character aware support for function "downcase_truncate_identifier()"
Date
Msg-id 4CE9A9B7.1080707@dunslane.net
In response to Re: multibyte-character aware support for function "downcase_truncate_identifier()"  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: multibyte-character aware support for function "downcase_truncate_identifier()"  (Robert Haas <robertmhaas@gmail.com>)
Re: multibyte-character aware support for function "downcase_truncate_identifier()"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
<br /><br /> On 11/21/2010 06:09 PM, Robert Haas wrote: <blockquote
cite="mid:AANLkTikweY9M4vfR0KmKwZiit-w8siSgsSk3x6iuj8Rz@mail.gmail.com" type="cite"><pre wrap="">I think that's fair.  It actually doesn't seem like it should be that
hard if we knew that the server encoding were UTF8 - it's just a big
translation table somewhere, no? </pre></blockquote><br /> No, it's far more complex. See for example <a
class="moz-txt-link-rfc2396E"
href="http://unicode.org/reports/tr21/tr21-3.html">&lt;http://unicode.org/reports/tr21/tr21-3.html&gt;</a>, which
says:<br/><blockquote><p>There are a number of complications to case mappings that occur once the repertoire of
characters is expanded beyond ASCII.
<ul><li>Because of the inclusion of certain composite characters for compatibility,
such as 01F1 "Ǳ" <i>capital dz</i>, there is a third case, called <i>titlecase</i>, which is used where the first
letter of a word is to be capitalized (e.g. Titlecase, vs. UPPERCASE, or lowercase).
<ul><li>For example, the title case of the example character is 01F2 "ǲ" <i>capital d with small z</i>.</ul>
<li>Case mappings may produce strings of different length than the original.
<ul><li>For example, the German character 00DF "ß" <i>small letter sharp s</i>
expands when uppercased to the sequence of two characters "SS". This also occurs where there is no precomposed character
corresponding to a case mapping, such as with 0149 "ŉ" <i>latin small letter n preceded by
apostrophe.</i></ul>
<li>Characters may also have different case mappings, depending on the context.
<ul><li>For example, 03A3 "Σ" <i>capital sigma</i> lowercases to 03C3 "σ" <i>small sigma</i> if it is followed by another letter, but
lowercases to 03C2 "ς" <i>small final sigma</i> if it is not.</ul>
<li>Characters may have case mappings that depend on the locale.
<ul><li>For example, in Turkish the letter 0049 "I" <i>capital letter i</i> lowercases to 0131 "ı" <i>small
dotless i</i>.</ul>
<li>Case mappings are not, in general, reversible.
<ul><li>For example, once the string "McGowan" has
been uppercased, lowercased or titlecased, the original cannot be recovered by applying another uppercase, lowercase, or
titlecase operation.</ul></ul></blockquote><br /> cheers<br /><br /> andrew<br />
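Editorial illustration (not part of the original message): the complications quoted from the Unicode report can be observed with a few lines of Python, whose built-in string methods apply the locale-independent Unicode case mappings. This is only a sketch of the behaviour described above; it says nothing about how PostgreSQL's `downcase_truncate_identifier()` works or should work, and the function name `case_mapping_examples` is made up for this example.

```python
def case_mapping_examples():
    """Collect a few results of Python's built-in Unicode case mappings.

    Hypothetical helper for illustration only; Python's str methods are
    locale-independent, so locale-specific rules (e.g. Turkish) are NOT applied.
    """
    return {
        # U+01F1 has a distinct titlecase form, U+01F2.
        "dz_title": "\u01f1".title(),
        # U+00DF (sharp s) uppercases to two characters.
        "sharp_s_upper": "\u00df".upper(),
        # U+0149 has no precomposed uppercase, so it also expands.
        "n_apostrophe_upper_len": len("\u0149".upper()),
        # Capital sigma lowercases differently in final and non-final position.
        "sigma_word": "\u03a3\u0391\u03a3".lower(),
        # Without locale information, "I" lowercases to "i", not dotless U+0131.
        "capital_i_lower": "I".lower(),
        # Case mapping is lossy: the original mixed case cannot be recovered.
        "round_trip": "McGowan".upper().lower(),
    }


if __name__ == "__main__":
    for name, value in case_mapping_examples().items():
        print(f"{name}: {value!r}")
```

Note that a single "translation table" keyed on one character could not express several of these results: two of them change the string length, and the sigma case depends on neighbouring characters.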
