Re: finding bogus UTF-8

From Marko Kreen
Subject Re: finding bogus UTF-8
Date
Msg-id AANLkTi==bd28k_J2=Dg0kcLD_mMTrLByCGrS+PHk1U-s@mail.gmail.com
In reply to finding bogus UTF-8  (Scott Ribe <scott_ribe@elevated-dev.com>)
List pgsql-general
On Thu, Feb 10, 2011 at 9:02 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors
> (AMACPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such
> values?

CREATE OR REPLACE FUNCTION is_utf8(text)
RETURNS bool AS $$
# Return true if the argument decodes cleanly as UTF-8, false otherwise.
try:
    args[0].decode('utf8')
    return True
except UnicodeDecodeError:
    return False
$$ LANGUAGE plpythonu STRICT;

--
marko
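
The same check can be tried outside the database before installing the function. The following is only a minimal sketch in standalone Python 3, not part of the original post: the name is_valid_utf8 and the sample byte strings are hypothetical, and it assumes the suspect values are available as raw bytes.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample values standing in for column contents.
samples = [
    b"plain ascii",
    "caf\u00e9".encode("utf-8"),  # correctly encoded e-acute
    b"caf\xe9",                   # Latin-1 e-acute, not valid UTF-8
    b"\xc3\x28",                  # lead byte followed by a non-continuation byte
]

# Keep only the values that fail to decode.
bad = [s for s in samples if not is_valid_utf8(s)]
print(bad)
```

Inside the database, the is_utf8 function from the post would presumably be used as a filter, for example SELECT id FROM some_table WHERE NOT is_utf8(some_column), where some_table, id, and some_column are placeholder names.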
