Re: [HACKERS] Multibyte in autoconf

From Tatsuo Ishii
Subject Re: [HACKERS] Multibyte in autoconf
Date
Msg-id 19991208233152I.t-ishii@sra.co.jp
In response to Re: [HACKERS] Multibyte in autoconf  (Peter Eisentraut <e99re41@DoCS.UU.SE>)
Responses Re: [HACKERS] Multibyte in autoconf  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
> > >     If no --pgencoding, you get default (non-multibyte) coding even
> > >     if you compiled with --enable-mb.
> > 
> > Not agreed. I think it would be better to give an error if no default
> > encoding is specified when configured with --enable-mb.  Reasons:
> > 
> > 1) Users tend to use only one encoding rather than switching among
> > databases with different encodings. Thus the user's main encoding should
> > be properly set as the default.
> 
> Users also initdb only once, and that is the time to *choose* what they
> want. Then and only then. Once they're done with that they'll never have
> to worry about it again.
> 
> > 2) If a non-multibyte encoding such as SQL_ASCII is accidentally set as
> > the default, and a multibyte user creates a database with no encoding
> > argument, the result would be a disaster.
> 
> Huh, so if I compile my database with multibyte and then I choose
> to not have a default encoding in template1 but maybe I want to have the
> multibyte option available for some other database later on, that will be
> a disaster? Not so good.

First of all, it's not possible not to have a default encoding in
template1. Probably you mean that you choose SQL_ASCII (encoding no. 0)
as the default encoding. Anyway, I'm going to give an example scenario
of the disaster.

1) initdb is run with no encoding argument (suppose that SQL_ASCII is
set as the default encoding in template1).

2) A user creates a database with no encoding argument, believing
that the default encoding is EUC_JP.

3) He makes a table, then fills it with some Japanese data.

4) Later he pulls data from the table and finds that it is no longer
Japanese!
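
The effect in step 4 can be shown in miniature outside the database. The
following Python sketch is only an illustration, not what the backend does
internally: it assumes that code running under a single-byte encoding such
as SQL_ASCII treats text as a plain sequence of bytes, and the helper name
truncate_single_byte is hypothetical. It shows how an operation that is
harmless for single-byte text, cutting a value to a fixed number of bytes,
splits an EUC_JP character in half.

```python
# Illustrative sketch only: models a byte-oriented string operation,
# not actual PostgreSQL backend code.

def truncate_single_byte(data: bytes, width: int) -> bytes:
    """Cut a value to `width` bytes, as single-byte-only code would.

    Hypothetical helper; it knows nothing about multibyte characters.
    """
    return data[:width]


original = "日本語"                    # three Japanese characters
stored = original.encode("euc_jp")     # two bytes per character in EUC_JP

# Byte-oriented code sees six "characters", not three.
byte_length = len(stored)

# Truncating to three bytes cuts the second character in half.
cut = truncate_single_byte(stored, 3)

# Reading the value back as EUC_JP no longer yields the original text;
# the broken trailing byte is shown as a replacement character.
recovered = cut.decode("euc_jp", errors="replace")

print(byte_length, len(original))
print(recovered == original)
```

With a multibyte-aware default encoding the truncation would be done on
character boundaries, which is why the default matters for users who never
pass an encoding argument.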

> What I'm also thinking of is the package maintainer. They should be
> able to provide a "neutral" yet multibyte (and locale, and cyrillic)
> enabled package, and one should be able to use that even if one doesn't
> want to use the multibyte features right now or at all.

So you think a postgres package with the multibyte/locale/cyrillic options
enabled is a good thing for everyone? At least I don't like the locale
option. Not only is it useless for multibyte languages such as
Japanese, it also makes text comparison slow. I wouldn't say locale
is useless for everyone, however. I admit it is useful for single-byte
encodings.

I think it would be very hard to make a unified ideal package for
everyone.

> Also, it should not be initdb's job to verify that the encodings are
> correct, supported, etc. The backend should find that out itself. That
> eliminates duplication of the same logic, which the backend can do better
> anyway.

Actually that duplication can be eliminated by using the same
code. I think the pg_id command will do the job.
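
As a rough illustration of "using the same code", the Python sketch below
keeps one lookup table and one validation function that every caller
shares. All names here (ENCODINGS, encoding_id, initdb_check,
backend_check) are hypothetical and do not correspond to actual PostgreSQL
sources or to pg_id; only SQL_ASCII = 0 comes from this thread, and the
other table entry is a placeholder.

```python
# Hypothetical sketch: one table and one validator shared by all callers.
# Only SQL_ASCII = 0 is taken from the thread; EUC_JP's number is a placeholder.
ENCODINGS = {"SQL_ASCII": 0, "EUC_JP": 1}


def encoding_id(name: str) -> int:
    """Return the numeric id for an encoding name, or raise ValueError."""
    try:
        return ENCODINGS[name.upper()]
    except KeyError:
        raise ValueError(f"unsupported encoding: {name!r}") from None


def initdb_check(name: str) -> int:
    # Stand-in for the check made at initdb time.
    return encoding_id(name)


def backend_check(name: str) -> int:
    # Stand-in for the check made by the backend; same code path.
    return encoding_id(name)


print(initdb_check("sql_ascii"), backend_check("EUC_JP"))
```

Because both callers go through encoding_id, the list of supported
encodings lives in exactly one place.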

BTW, I don't think the current implementation of multibyte is
complete yet.  The next target would be NATIONAL CHARACTER support (not
sure it's for 7.0, though).  I would also like to find a solution for the
locale problem I stated above.
--
Tatsuo Ishii

