Re: [SPAM]-D] How to find broken UTF-8 characters ?

From: Jasen Betts
Subject: Re: [SPAM]-D] How to find broken UTF-8 characters ?
Msg-id: hreauf$cma$1@reversiblemaps.ath.cx
In reply to: How to find broken UTF-8 characters ? (Andreas <maps.on@gmx.net>)
List: pgsql-sql
On 2010-04-29, Andreas <maps.on@gmx.net> wrote:
> Hi,
>
> while writing the reply below I found it sounds like being OT but it's
> actually not.
> I just need a way to check if a column contains values that CAN NOT be
> converted from UTF8 to Latin1.
> I tried:
> Select convert_to (my_column::text, 'LATIN1') from my_table;
>
> It raises an error that, translated, says:
> ERROR:  character 0xe28093 in encoding »UTF8« has no equivalent in »LATIN1«

Use a regular expression.
ISO 8859-1 is easy: all of its characters are grouped together in Unicode
(code points 1 to 255), so the regular expression consists of a single
negated range class:
SELECT pkey FROM tabname WHERE ( textfield || textfield2 || textfield3 ) ~ ('[^'||chr(1)||'-'||chr(255)||']');
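
If you want to sanity-check the idea outside the database, the same negated range class can be tried client-side. This is only a rough Python sketch, not part of the query above: the helper name non_latin1_chars and the sample rows are made up for illustration, and it assumes the values have already been fetched as Unicode strings.

```python
import re

# Negated range class: matches any character outside U+0001..U+00FF,
# i.e. anything that has no place in ISO 8859-1 (Latin-1).
NON_LATIN1 = re.compile("[^\u0001-\u00ff]")

def non_latin1_chars(text):
    """Return the characters in text that fall outside U+0001..U+00FF."""
    return NON_LATIN1.findall(text)

# Hypothetical sample data standing in for (pkey, textfield) rows.
rows = {
    1: "plain ASCII",
    2: "caf\u00e9",        # e with acute accent, U+00E9, fits in Latin-1
    3: "2009\u20132010",   # en dash, U+2013, UTF-8 bytes e2 80 93
}

# Keys of the rows that would fail a conversion to Latin-1.
bad_keys = [pkey for pkey, value in rows.items() if non_latin1_chars(value)]
print(bad_keys)
```

The en dash in row 3 is the same character as in the quoted error message (0xe28093 in UTF-8), so that row is the only one reported. Unlike the SQL version, this sketch would also flag a NUL character, which a PostgreSQL text value cannot contain anyway.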


