8.3.0, locales, and encodings

Поиск
Список
Период
Сортировка
От Dean Gibson (DB Administrator)
Тема 8.3.0, locales, and encodings
Дата
Msg-id 47AA6CEE.4010406@ultimeth.com
обсуждение исходный текст
Ответы Re: 8.3.0, locales, and encodings  (Tino Schwarze <postgresql@tisc.de>)
Список pgsql-admin
I got a small surprise when upgrading to 8.3.0 today (CentOS 4.4).  When
I went to restore my database, PSQL complained that my "CREATE
DATABASE... ENCODING 'UTF8';" wasn't valid (the new phpBB 3.0 software
enforces using UTF8).  Sure enough, a bit of research showed that the
supplied /etc/init.d/postgresql file had no encoding specified, which
meant that it defaulted to my operating system's 'en_US.ISO-8859-1'
encoding.  A quick reading of the PostgreSQL documentation revealed that
PostgreSQL can't really use an encoding different from that specified
during INITDB (caveat below), and apparently that is now enforced in v8.3.

OK, so I manually run INITDB w/ "-E UTF8", and now I can create/load one
of my databases.  Unfortunately, when I go to create another database
(where the data is 'ISO-8859-1'), now IT won't create.  A more detailed
reading of the PostgreSQL documentation revealed that PostgreSQL can't
use an encoding different from that specified during INITDB, UNLESS
INITDB is run w/ "-locale POSIX".

Ok, I rerun INITDB w/ "-locale POSIX  -E UTF8", and now both databases
create/load, apparently properly.  I also did an INITDB on another
database server, running INITDB w/ just "-locale POSIX" (apparently
meaning that the "template1" encoding defaulted to "SQL_ASCII"), and
successfully create/reload another database there (explicitly specifying
an ENCODING there).

Questions/suggestions:

1. The provided /etc/init.d/postgresql file seems to be part of this
trap.  Should it be changed to default the encoding to POSIX?
2. I want my installation to have the maximum flexibility.  Should I be
running INITDB w/ "-locale POSIX  -E UTF8", or just "-locale POSIX"
(implying "-E SQL_ASCII")?  Or does it make a difference, since I always
specify the desired database encoding when I "CREATE DATABASE...
ENCODING '...';"?
3. One of my databases is ISO-8859-1, because that's the format of the
data I get from the US gov't.  However, should I instead use the 'UTF8'
encoding for that database, and then use a client (PSQL) encoding of
"LATIN1", and let PostgreSQL convert the data?  Pros & cons?

Performance is a concern, but disk space is not.

I presume this ("admin") is the proper PostgreSQL list to post this to.
If not, let me know.

-- Dean Gibson

В списке pgsql-admin по дате отправления:

Предыдущее
От: antonio freitas
Дата:
Сообщение: password management
Следующее
От: "Mag Gam"
Дата:
Сообщение: Re: rename database