Re: proposal: UTF8 to_ascii function

From	Andrew Dunstan
Subject	Re: proposal: UTF8 to_ascii function
Date	August 11, 2008, 10:42:43
Msg-id	48A041CC.1090703@dunslane.net
In reply to	Re: proposal: UTF8 to_ascii function (Jan Urbański <j.urbanski@students.mimuw.edu.pl>)
Responses	Re: proposal: UTF8 to_ascii function Re: proposal: UTF8 to_ascii function
List	pgsql-hackers



Jan Urbański wrote:
> Andrew Dunstan wrote:
>>
>>
>> Pavel Stehule wrote:
>>>
>>>
>>> One note: convert_to is correct. But we have to use to_ascii without
>>> decode functions. It has the same behavior: it converts from bytea to
>>> text. Text in an "incorrect" encoding is de facto bytea. So the correct
>>> to_ascii function prototypes are:
>>>
>>> to_ascii(text)
>>> to_ascii(bytea, integer);
>>> to_ascii(bytea, name);
>>>
>>
>> What you have not said is how you propose to convert UTF8 to ASCII.
>>
>> Currently to_ascii() converts a small number of single byte charsets 
>> to ASCII by folding the chars with high bits set, so what we get is a 
>> pure ASCII result which is safe in any server encoding, as they are 
>> all ASCII supersets.
>>
>> But what conversion rule will you use for the gazillions of Unicode 
>> characters?
>>
>> I honestly do not understand the use case for this at all.
>
> I do. Often clients want their searches to be insensitive to accents
> and language-specific letters, so that searching for 'łódź' also
> returns 'lodz'. So the use case is there (in fact, the lack of such a
> facility made me consider not upgrading a particular client to 8.3...).
> Or maybe there's a better way to do it?

Well, my first question would be "Why aren't you using a database 
encoding that supports to_ascii()?"

However, I suppose that your use case would support this signature:
   to_ascii(bytea, name)

where it would just error out if the encoding name were something other 
than LATIN1, LATIN2, LATIN9, or WIN1250.
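To make the behavior of that signature concrete, here is a rough sketch in Python rather than the server's C. It is not how PostgreSQL implements to_ascii(); the codec-name mapping, the EXTRA_FOLDS table, and the '?' fallback are assumptions made for illustration only. It also shows why a conversion rule is the hard part: 'ł' has no Unicode decomposition, so it needs an explicit entry.

```python
import unicodedata

# PostgreSQL encoding names mapped to Python codec names.
# This mapping is an assumption made for the sketch.
SUPPORTED = {
    "LATIN1": "iso8859_1",
    "LATIN2": "iso8859_2",
    "LATIN9": "iso8859_15",
    "WIN1250": "cp1250",
}

# Letters that have no Unicode decomposition need explicit rules.
# Hypothetical, deliberately incomplete table.
EXTRA_FOLDS = str.maketrans({
    "ł": "l", "Ł": "L",
    "ø": "o", "Ø": "O",
    "đ": "d", "Đ": "D",
    "ß": "ss",
})


def to_ascii(data: bytes, encoding: str) -> str:
    """Fold single-byte encoded text to plain ASCII, or raise an error."""
    codec = SUPPORTED.get(encoding.upper())
    if codec is None:
        # Error out for any other encoding name, as proposed above.
        raise ValueError(f"encoding {encoding!r} is not supported by to_ascii")
    text = data.decode(codec).translate(EXTRA_FOLDS)
    # Split accented letters into base letter + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII becomes '?' (an arbitrary choice here).
    return stripped.encode("ascii", "replace").decode("ascii")


print(to_ascii("łódź".encode("iso8859_2"), "LATIN2"))  # lodz
```

Calling to_ascii(b"abc", "UTF8") in this sketch raises ValueError, which mirrors the "just error out" behavior described above.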

But what would be the meaning of this?:
   to_ascii(bytea, integer)


cheers

andrew
