Re: Locales and Encodings

From: Gregory Stark
Subject: Re: Locales and Encodings
Date:
Msg-id: 87ve9cfpsc.fsf@oxford.xeocode.com
In reply to: Re: Locales and Encodings  (Peter Eisentraut <peter_e@gmx.net>)
Responses: Re: Locales and Encodings  (Martijn van Oosterhout <kleptog@svana.org>)
List: pgsql-hackers
"Peter Eisentraut" <peter_e@gmx.net> writes:

> On Friday, 12 October 2007, Gregory Stark wrote:
>> . when creating a new database from a template the new locale and encoding
>>   must be identical to the template database's encoding and locale. Unless
>> the template is template0 in which case we rebuild all indexes after
>> copying.
>
> Why would you restrict the index rebuilding only to this particular case?  It
> could be done for any database.

Well, there's no guarantee that other databases don't contain 8-bit data which
would be invalid in the new encoding. I think it's reasonable to assume
there's only 7-bit ASCII in template0, however.
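
To make the 7-bit versus 8-bit distinction concrete, here is a rough sketch in
Python rather than backend code. The helper names is_seven_bit and valid_in
are made up for illustration, and the three encodings are just examples of
ASCII-compatible encodings; none of this is how the server checks data.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


def valid_in(data: bytes, encoding: str) -> bool:
    """Return True if the bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


ascii_name = "pg_catalog".encode("ascii")   # pure 7-bit data
latin1_name = "café".encode("latin-1")      # contains the 8-bit byte 0xE9

# 7-bit data is valid, byte for byte, in each ASCII-compatible encoding tried.
for enc in ("latin-1", "utf-8", "euc_jp"):
    assert valid_in(ascii_name, enc)

# 8-bit LATIN1 data is not valid when reinterpreted as UTF-8.
assert not is_seven_bit(latin1_name)
assert not valid_in(latin1_name, "utf-8")
```

So a template containing only 7-bit data can be relabelled with any of these
encodings without touching the bytes, while 8-bit data cannot.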

An alternative would be introducing an ASCII7 encoding which template0 would
use; any other database in that encoding could then be used as a template for
any encoding. However, that would still require index rebuilds, which could
potentially take a long time. Another alternative would be recoding all the
data from the template database encoding to the new encoding and throwing an
error if a non-encodable character is found.
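
The recoding alternative could look roughly like the following Python sketch.
The function name recode and the use of ValueError are assumptions for
illustration only; the real thing would go through the backend's conversion
routines and report an ordinary error.

```python
def recode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from src to dst, raising ValueError if that fails."""
    try:
        text = data.decode(src)
    except UnicodeDecodeError as exc:
        raise ValueError(f"input is not valid {src}: {exc}") from exc
    try:
        return text.encode(dst)
    except UnicodeEncodeError as exc:
        bad = exc.object[exc.start]  # first character that cannot be encoded
        raise ValueError(f"character {bad!r} has no {dst} equivalent") from exc


# LATIN1 -> UTF8 always works: every LATIN1 character exists in Unicode.
assert recode("é".encode("latin-1"), "latin-1", "utf-8") == b"\xc3\xa9"

# UTF8 -> LATIN1 fails as soon as a character outside LATIN1 shows up.
try:
    recode("日本".encode("utf-8"), "utf-8", "latin-1")
    failed = False
except ValueError:
    failed = True
assert failed
```

Note this touches every string in the template, so it has the same cost
problem as the index rebuild.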

I think it's a lot simpler to just declare it a non-problem by saying there
won't be any non-ASCII text in template0.

> The other issue are shared catalogs.

This approach doesn't address that, but I don't think it makes the problems
there any worse either. That is, I think we already have these problems around
shared tables.

. If you have two databases with locales that don't agree then the indexes on
  those tables won't function properly.

. What happens if you create a user with an é in his username while connected
  to a latin1 database and then connect to a UTF8 database? That username is
  now an invalidly encoded UTF8 string.
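
Both points can be shown with a toy model in Python. Everything here is a
stand-in: the sorted list plays the role of a btree index, cmp_bytes mimics a
"C"-style ordering, cmp_nocase is a crude substitute for a locale that orders
case-insensitively, and the names are invented.

```python
from functools import cmp_to_key


def cmp_bytes(a: str, b: str) -> int:
    """Code point order, roughly what the C locale gives."""
    return (a > b) - (a < b)


def cmp_nocase(a: str, b: str) -> int:
    """Case-insensitive order, a stand-in for a different locale."""
    ka, kb = (a.lower(), a), (b.lower(), b)
    return (ka > kb) - (ka < kb)


def index_lookup(index, key, cmp) -> bool:
    """Binary search that trusts the index to be sorted according to cmp."""
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        c = cmp(index[mid], key)
        if c == 0:
            return True
        if c < 0:
            lo = mid + 1
        else:
            hi = mid
    return False


# The "index" is built by a database using code point order.
names = ["Zoe", "alice", "bob", "Carol", "dave"]
index = sorted(names, key=cmp_to_key(cmp_bytes))

# The same ordering finds the entry; a different ordering walks past it.
assert index_lookup(index, "Zoe", cmp_bytes)
assert not index_lookup(index, "Zoe", cmp_nocase)

# A name stored as LATIN1 bytes is not a valid UTF8 string.
stored = "rené".encode("latin-1")
try:
    stored.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
assert not valid_utf8
```

The entry is still in the list in the second lookup; the search simply looks
in the wrong half because it disagrees with the order the list was built in.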

Perhaps we should be using pattern_ops for the indexes on the shared tables?
Or using bytea with UTF8 encoded strings instead of name and text? That
actually sounds reasonable now that we have convert() functions which take and
generate bytea, at least for the text fields like in pltemplate -- less so for
the name columns.
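
A minimal sketch of the bytea idea, again in Python and purely illustrative:
to_shared and from_shared are hypothetical helpers playing roughly the role
that convert_to() and convert_from() would play, with the shared table always
holding UTF8 bytes regardless of which database wrote them.

```python
def to_shared(raw: bytes, db_encoding: str) -> bytes:
    """Convert a string in the database encoding to UTF8 bytes for storage."""
    return raw.decode(db_encoding).encode("utf-8")


def from_shared(stored: bytes, db_encoding: str) -> bytes:
    """Convert stored UTF8 bytes back to the current database encoding."""
    return stored.decode("utf-8").encode(db_encoding)


# The same name written from a LATIN1 and a UTF8 database stores identically.
from_latin1 = to_shared("rené".encode("latin-1"), "latin-1")
from_utf8 = to_shared("rené".encode("utf-8"), "utf-8")
assert from_latin1 == from_utf8 == b"ren\xc3\xa9"

# Reading it back in a LATIN1 database yields valid LATIN1 again.
assert from_shared(from_latin1, "latin-1") == b"ren\xe9"

# Bytewise comparison gives one ordering that no locale setting can change.
assert sorted([b"ren\xc3\xa9", b"rene", b"Rene"]) == [b"Rene", b"rene", b"ren\xc3\xa9"]
```

The catch is that reading back can fail if the current database encoding has
no representation for a stored character, so a fallback would be needed there.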

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

