Discussion: Help with High value unicode characters

Help with High value unicode characters

From:
"Chris Hoover"
Date:
We need some help. We have what we believe are high value unicode characters (Unicode 0x2) in our data.  How can we search for and replace these?  We are storing this data in a text field, and having this unicode value in the data violates the XML rules our application uses and causes abends in the application.

Obviously our developers will have to fix this in the application, but how do we fix it in our database?  We need to clean up the existing data so our customers can view the data that has been imported up to this point.

Thanks for any advice,

Chris

Re: Help with High value unicode characters

From:
Michael Fuhr
Date:
On Tue, Aug 07, 2007 at 05:09:35PM -0400, Chris Hoover wrote:
> We need some help, we have some what we believe are high value unicode
> characters (Unicode 0x2).

What do you mean by "high value unicode characters (Unicode 0x2)"?
Characters with code points in a plane other than Plane 0 (BMP,
Basic Multilingual Plane), i.e., with a code point greater than
U+FFFF?

> How can you search and replace for these?  We are storing this data
> in a text field, and having the data contain this unicode value is
> violating our xml rules the application uses and causing abends in
> our application.

If I understand what you're asking then you should be able to use
regexp_replace (8.1 and later) to fix the data.  Example:

UPDATE tablename
   SET columnname = regexp_replace(columnname, E'[\\U00010000-\\U0010FFFF]+', '', 'g')
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';
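
Before running the UPDATE, it may be worth previewing which rows match and what they would look like afterwards. A minimal sketch, reusing the same placeholder names (tablename, columnname) and the same character class as the UPDATE above, and assuming a UTF8-encoded database:

```sql
-- Preview only: nothing is modified.
-- tablename and columnname are the same placeholders as in the UPDATE above.
SELECT columnname AS before_cleanup,
       regexp_replace(columnname, E'[\\U00010000-\\U0010FFFF]+', '', 'g') AS after_cleanup
  FROM tablename
 WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';
```

If the before and after columns look right, the UPDATE can be run inside a transaction so it can still be rolled back.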

If that doesn't help then please clarify the problem.
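
One other possible reading, offered only as a guess: if "Unicode 0x2" means the control character U+0002 rather than a code point above U+FFFF, the same approach might work with a different character class. XML 1.0 allows only tab, line feed, and carriage return among the characters below U+0020, so a sketch under that assumption (again with placeholder names, and assuming a UTF8 database) could look like this:

```sql
-- Assumption: the offending characters are control characters that XML 1.0
-- does not allow (U+0001-U+0008, U+000B, U+000C, U+000E-U+001F),
-- not characters above U+FFFF.
UPDATE tablename
   SET columnname = regexp_replace(columnname, E'[\\u0001-\\u0008\\u000B\\u000C\\u000E-\\u001F]+', '', 'g')
 WHERE columnname ~ E'[\\u0001-\\u0008\\u000B\\u000C\\u000E-\\u001F]';
```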

--
Michael Fuhr