Jan Urbański wrote:
> Andrew Dunstan wrote:
>>
>>
>> Pavel Stehule wrote:
>>>
>>>
>>> One note - convert_to is correct. But we have to use to_ascii without
>>> decode functions. It behaves the same way: it converts from bytea to
>>> text. Text in an "incorrect" encoding is de facto bytea. So the correct
>>> to_ascii function prototypes are:
>>>
>>> to_ascii(text);
>>> to_ascii(bytea, integer);
>>> to_ascii(bytea, name);
>>>
>>>
>>>>
>>
>> What you have not said is how you propose to convert UTF8 to ASCII.
>>
>> Currently to_ascii() converts a small number of single byte charsets
>> to ASCII by folding the chars with high bits set, so what we get is a
>> pure ASCII result which is safe in any server encoding, as they are
>> all ASCII supersets.
>>
>> But what conversion rule will you use for the gazillions of Unicode
>> characters?
>>
>> I honestly do not understand the use case for this at all.
>
> I do. Clients often want their searches to be insensitive to accents
> and language-specific letters, so that searching for 'łódź' also
> returns 'lodz'. So the use case is there (in fact, the lack of such a
> facility made me consider not upgrading a particular client to
> 8.3...).
> Or maybe there's a better way to do it?

Well, my first question would be "Why aren't you using a database
encoding that supports to_ascii()?"

However, I suppose that your use case would support this signature:

to_ascii(bytea, name)

where it would just error out if the encoding name were something other
than LATIN1, LATIN2, LATIN9, or WIN1250.
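
To make that signature concrete, here is a rough Python sketch of how such a to_ascii(bytea, name) could behave. It is not how the server implements to_ascii(). The Python codec names, the EXTRA_FOLDS table, the error text, and the use of Unicode decomposition to strip accents are all assumptions made for illustration.

```python
import unicodedata

# Encoding names from the message, mapped to Python codec names
# (the mapping is an assumption for this sketch).
SUPPORTED = {
    "LATIN1": "iso8859_1",
    "LATIN2": "iso8859_2",
    "LATIN9": "iso8859_15",
    "WIN1250": "cp1250",
}

# Letters such as 'ł' have no Unicode decomposition, so they need an
# explicit fold. This table is illustrative, not exhaustive.
EXTRA_FOLDS = str.maketrans(
    {"ł": "l", "Ł": "L", "đ": "d", "Đ": "D", "ø": "o", "Ø": "O"}
)


def to_ascii(data: bytes, encoding: str) -> str:
    """Fold bytes in a supported single-byte encoding to plain ASCII."""
    codec = SUPPORTED.get(encoding.upper())
    if codec is None:
        # Error out for anything other than the four supported names.
        raise ValueError(f"conversion from {encoding} to ASCII is not supported")
    text = data.decode(codec).translate(EXTRA_FOLDS)
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII becomes '?'.
    return stripped.encode("ascii", errors="replace").decode("ascii")
```

With this sketch, to_ascii('łódź'.encode('iso8859_2'), 'LATIN2') gives 'lodz', while passing 'UTF8' as the encoding name raises an error.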

But what would be the meaning of this?:

to_ascii(bytea, integer)

cheers
andrew