Re: [GENERAL] Changing collate & ctype for an existing database

Поиск
Список
Период
Сортировка
От Albe Laurenz
Тема Re: [GENERAL] Changing collate & ctype for an existing database
Дата
Msg-id A737B7A37273E048B164557ADEF4A58B53A81CA2@ntex2010i.host.magwien.gv.at
обсуждение исходный текст
Ответ на [GENERAL] Changing collate & ctype for an existing database  (rihad <rihad@mail.ru>)
Ответы Re: [GENERAL] Changing collate & ctype for an existing database
Список pgsql-general
rihad wrote:
> Hi there. We have a working database that was unfortunately created by
> initdb with default ("C") collation & ctype. All other locale specific
> settings have the value en_US.UTF-8 in postgresql.conf. The database
> itself is multilingual and all its data is stored in UTF-8. Sorting
> doesn't work correctly, though. To fix that, can I just do this:
> 
> 
> update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8'
> where datname='mydb';
> 
> 
> This does seem to work on a testing copy of the database, i.e. select
> lower('БлаБлаБла') now works correctly when connected to that database.
> 
> 
> Is there still any chance for corrupting data by doing this, or indexes
> stopping working etc?
> 
> p.s.: postgres 9.6.3

As explained, yes.  Indexes on string columns will be corrupted.

See this example:

test=# CREATE DATABASE breakme LC_COLLATE "C" LC_CTYPE "C" TEMPLATE template0;
test=# \c breakme
breakme=# CREATE TABLE sort(id integer PRIMARY KEY, val text NOT NULL);
breakme=# INSERT INTO sort VALUES (1, 'LITTLE'), (2, 'big'), (3, 'b-less');
breakme=# CREATE INDEX ON sort(val);
breakme=# SET enable_seqscan=off;  -- force index use
breakme=# SELECT * FROM sort ORDER BY val;
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  1 │ LITTLE │
│  3 │ b-less │
│  2 │ big    │
└────┴────────┘
(3 rows)

breakme=# UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='breakme';
breakme=# \c breakme
breakme=# SET enable_seqscan=off;  -- force index use
breakme=# SELECT * FROM sort ORDER BY val;
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  1 │ LITTLE │
│  3 │ b-less │
│  2 │ big    │
└────┴────────┘
(3 rows)

breakme=# SET enable_seqscan=on;  -- this and the following force sequential scan
breakme=# SET enable_bitmapscan=off;
breakme=# SET enable_indexscan=off;
breakme=# SET enable_indexonlyscan=off;
breakme=# SELECT * FROM sort ORDER BY val;  -- this returns the correct order
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  2 │ big    │
│  3 │ b-less │
│  1 │ LITTLE │
└────┴────────┘
(3 rows)

As you see, your index is still sorted according to the C collation
and scanning it returns wrong results.

Yours,
Laurenz Albe

В списке pgsql-general по дате отправления:

Предыдущее
От: Albe Laurenz
Дата:
Сообщение: Re: [GENERAL] loading file with en dash character into postgres9.6.1 database
Следующее
От: dpat
Дата:
Сообщение: [GENERAL] Manage slot in logical/pglogical replication