Re: finding bogus UTF-8

Поиск
Список
Период
Сортировка
От dennis jenkins
Тема Re: finding bogus UTF-8
Дата
Msg-id AANLkTikW0Yb8ZyULMVLgdLsv-A+6O_TF3rHmtdxX8mUG@mail.gmail.com
обсуждение исходный текст
Ответ на finding bogus UTF-8  (Scott Ribe <scott_ribe@elevated-dev.com>)
Список pgsql-general
On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors
(AMACPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such
values?

If you know which table and column the data is in, you can also do
something like this:

(I typed this up without checking the syntax of it.  The basic idea is
to cast the column as bytea, encode with the 'escape' method, then
grep for back-slashes).

select * from bad_table where regexp_match (encode (bad_column::bytea,
'escape'), '\\\\'));

В списке pgsql-general по дате отправления:

Предыдущее
От: dennis jenkins
Дата:
Сообщение: Re: finding bogus UTF-8
Следующее
От: Andy Colson
Дата:
Сообщение: Re: Schema version control