Re: Best practices for moving UTF8 databases

From: Phoenix Kiula
Subject: Re: Best practices for moving UTF8 databases
Msg-id: e373d31e0907181916s7be46a45mcac18b91df6f367e@mail.gmail.com
In reply to: Re: Best practices for moving UTF8 databases (Alvaro Herrera <alvherre@commandprompt.com>)
Responses: Re: Best practices for moving UTF8 databases
Re: Best practices for moving UTF8 databases
List: pgsql-general

On Tue, Jul 14, 2009 at 9:52 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Andres Freund wrote:
>> On Tuesday 14 July 2009 11:36:57 Jasen Betts wrote:
>
>> > if you do an ascii dump and the dump starts out "SET CLIENT ENCODING
>> > 'UTF8'" or similar but you still get errors.
>> Do you mean that a dump from SQL_ASCII can yield non-utf8 data? right. But
>> According to the OP his 8.3 database is UTF8...
>> So there should not be invalid data in there.
>
> I haven't followed this thread, but older PG versions had less strict
> checks on UTF8 data, which meant that some invalid data could creep in.



If so, how can I check for such invalid data in my old database, which
is 8.2.9? I'm moving to 8.3 first (and then to 8.4).

Really, PG absolutely needs a way to upgrade the database without so
much data-related downtime and all these silly woes. Several competing
database systems are a cinch to upgrade.

Anyway, this is the annoying error I always see:

  ERROR:  invalid byte sequence for encoding "UTF8": 0x80

I think my old DB is all UTF8. If a few characters are not, how can I
deal with them? I've done everything I can to take care of the
encoding and such. This command was used to initialize the cluster:

 initdb --locale=en_US.UTF-8 --encoding=UTF8

Locale environment variables are all "en_US.UTF-8" too.
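
One way such bytes could be located is to scan a plain-text dump of the
old database and try to decode each line as UTF-8. The following is only
a minimal sketch, assuming the dump is a plain-text file; the function
name find_invalid_utf8 is hypothetical, and Python's decoder is not
guaranteed to agree with the server's checks in every edge case.

```python
def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset, bad_byte) for lines that are not valid UTF-8.

    `lines` is any iterable of bytes objects, for example a dump file
    opened in binary mode. Only the first invalid byte of each line is
    reported; a line may contain more.
    """
    for line_number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that failed to decode.
            yield line_number, exc.start, raw[exc.start]
```

Passing it a file opened with open(path, "rb") reports the line number
and offset of each offending byte, which can then be matched against
the 0x80 in the error message.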

Thanks for any pointers!
