Обсуждение: unable to restore 8.2.5

Поиск
Список
Период
Сортировка

unable to restore 8.2.5

От
Tore Halset
Дата:
Hello.

I am having some trouble restoring a dump from the production system
over to a test system. Both servers are 8.2.5 and both use UTF-8. The
vrc_error table have ~36 mill rows on the production system, but none
of those are restored to the test system.

pg_restore: [archiver (db)] Error from TOC entry 1665; 0 333256 TABLE
DATA vrc_error primar
pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence
for encoding "UTF8": 0x81
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".
CONTEXT:  COPY vrc_error, line 34500528

Looks like I have managed to insert an illegal character into the main
system that does not conform to UTF-8. Anything I can and should do to
work around this issue?

Btw, we also use wal shipping and warm standby for the main database
and that works perfect.

Regards,
  - Tore.

Re: unable to restore 8.2.5

От
"Mikel Lindsaar"
Дата:
On Fri, Sep 19, 2008 at 6:29 PM, Tore Halset <halset@pvv.ntnu.no> wrote:
> Looks like I have managed to insert an illegal character into the main
> system that does not conform to UTF-8. Anything I can and should do to work
> around this issue?

I have had the same problem previously and after a lot of help from
Tom Lane basically came up to the following...

You need to basically dump your table out (or a subset containing the
row ID and column that would have the bad data) in plain text and then
parse it with a script to detect invalid UTF-8 sequences, then find
what rows the bad data is in and go and fix it.

It is either that or you drop the data inserting some other character.
 But this has obvious drawbacks.

I wrote a short ruby script that goes through a dumped file line by
line and puts each line through Iconv to parse it from UTF-8 to UTF-8,
if it fails it dumps the offending line to a log file.

A ruby script that would just print the offending row would go
something like this:

require 'iconv'
File.read(ARGV[0]).each do |line|
  begin
    Iconv.iconv('UTF-8', 'UTF-8', line)
  rescue
    puts "Failed: #{line}"
  end
end

Save that in a file (find_invalid_utf8.rb) then run it with:

$ ruby find_invalid_utf8.rb my_dumped_table.csv

It's not pretty, and just dumps the raw output to the screen, but it
might do for you.

--
http://lindsaar.net/
Rails, RSpec and Life blog....

Re: unable to restore 8.2.5

От
Tore Halset
Дата:
Hello.

Thanks a lot! Using the script, I found the entries with invalid UTF-8
and fixed them in the database.

  - Tore.

On Sep 19, 2008, at 15:37 , Mikel Lindsaar wrote:

> On Fri, Sep 19, 2008 at 6:29 PM, Tore Halset <halset@pvv.ntnu.no>
> wrote:
>> Looks like I have managed to insert an illegal character into the
>> main
>> system that does not conform to UTF-8. Anything I can and should do
>> to work
>> around this issue?
>
> I have had the same problem previously and after a lot of help from
> Tom Lane basically came up to the following...
>
> You need to basically dump your table out (or a subset containing the
> row ID and column that would have the bad data) in plain text and then
> parse it with a script to detect invalid UTF-8 sequences, then find
> what rows the bad data is in and go and fix it.
>
> It is either that or you drop the data inserting some other character.
> But this has obvious drawbacks.
>
> I wrote a short ruby script that goes through a dumped file line by
> line and puts each line through Iconv to parse it from UTF-8 to UTF-8,
> if it fails it dumps the offending line to a log file.
>
> A ruby script that would just print the offending row would go
> something like this:
>
> require 'iconv'
> File.read(ARGV[0]).each do |line|
>  begin
>    Iconv.iconv('UTF-8', 'UTF-8', line)
>  rescue
>    puts "Failed: #{line}"
>  end
> end
>
> Save that in a file (find_invalid_utf8.rb) then run it with:
>
> $ ruby find_invalid_utf8.rb my_dumped_table.csv
>
> It's not pretty, and just dumps the raw output to the screen, but it
> might do for you.
>
> --
> http://lindsaar.net/
> Rails, RSpec and Life blog....
>


Re: unable to restore 8.2.5

От
"Mikel Lindsaar"
Дата:
On Fri, Sep 26, 2008 at 7:12 PM, Tore Halset <halset@pvv.ntnu.no> wrote:
> Thanks a lot! Using the script, I found the entries with invalid UTF-8 and
> fixed them in the database.

That's great news.  You are welcome.

Curious, how long did it take to run on your entire table?  You
mentioned it was fairly big.

Mikel