Thread: Please change default characterset for database cluster

Please change default characterset for database cluster

From
"CN"
Date:
Hi!
"initdb" uses SQL_ASCII as the default character set encoding when it is
not given the "-E" option and when it cannot derive one from the
locale. I suggest that "initdb" use UNICODE instead of SQL_ASCII in that
case, because UNICODE is far more useful than SQL_ASCII.

Not all webmasters are willing to spend time reading "initdb"
documentation. I have encountered a free web hosting service that provides
PhpPgAdmin, through which I can create my databases. The problem is that all
newly created databases use SQL_ASCII, which is completely useless to me.
Their PhpPgAdmin script does not support the "-E" switch for "createdb". As
a result, I have to abandon that service altogether. If "initdb" used
UNICODE as the default character set, everything would be perfect.

Regards,

CN

--
http://www.fastmail.fm - Same, same, but different


Re: Please change default characterset for database cluster

From
Carlos Moreno
Date:
CN wrote:
> Hi!
> "initdb" uses SQL_ASCII as the default character set encoding when it is
> not given the "-E" option and when it cannot derive one from the
> locale. I suggest that "initdb" use UNICODE instead of SQL_ASCII in that
> case, because UNICODE is far more useful than SQL_ASCII.
>
> Not all webmasters are willing to spend time reading "initdb"
> documentation. I have encountered a free web hosting service that provides
> PhpPgAdmin, through which I can create my databases. The problem is that all
> newly created databases use SQL_ASCII, which is completely useless to me.
> Their PhpPgAdmin script does not support the "-E" switch for "createdb". As
> a result, I have to abandon that service altogether. If "initdb" used
> UNICODE as the default character set, everything would be perfect.

In addition to the general comment that the world does not necessarily
revolve around you, and that you should not expect all software products
in the world to be customized to suit *your* needs, I have to highlight
how horrifying this is:

> Not all webmasters are willing to spend time reading "initdb"
> documentation.

This is truly horrifying --- well, fortunately, one could hope that it
is as wrong as the rest of your message; that dumb and lazy end users
and computer illiterate people are not willing to spend time reading
documentation or instructions is ok... But webmasters and database
administrators??? Do you *seriously* expect that some highly complex
software like a DB server should be handled by people who are not
willing to read documentation???? That's the most preposterous notion
I've read in the last few months!

Another detail to add --- for a lot of people, Unicode is a useless
feature that has a very important performance hit. For a *very large*
fraction of applications, I see it generally advised to use a database
with no encoding (which SQL_ASCII essentially is), and in the situations
where some locale-aware processing is needed, then the client
application can do it.
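
A minimal sketch of what "no encoding" means in practice, written in
Python purely for illustration. It only models the difference between
SQL_ASCII, which stores bytes above 127 without interpreting them, and
UTF8, which rejects byte sequences that are not valid UTF-8. The function
names are hypothetical and are not part of PostgreSQL.

```python
def accepted_by_sql_ascii(data: bytes) -> bool:
    """Model of SQL_ASCII: bytes 0-127 are ASCII, bytes 128-255 are kept
    as-is without any validation. Only the zero byte is refused."""
    return b"\x00" not in data


def accepted_by_utf8(data: bytes) -> bool:
    """Model of a UTF8 database: the input must be well-formed UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same word as sent by two differently configured clients.
latin1_bytes = "café".encode("latin-1")  # b'caf\xe9'
utf8_bytes = "café".encode("utf-8")      # b'caf\xc3\xa9'

for label, payload in (("latin-1", latin1_bytes), ("utf-8", utf8_bytes)):
    print(label, accepted_by_sql_ascii(payload), accepted_by_utf8(payload))
```

In this model SQL_ASCII accepts both payloads, so nothing stops two
clients from storing the same text as different bytes; keeping the data
consistent is left entirely to the client applications.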

Of course, there are also many many applications where a DB with
Unicode encoding is very useful. In those cases, the administrators
can create a database with Unicode encoding (you seem to be one of
those that are too busy to be willing to spend time reading the
documentation of *createdb*), regardless of what default encoding was
specified with initdb.
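
For reference, a sketch of the createdb invocation being alluded to,
built here as a Python argument list rather than executed. The "-E"
option is the one discussed in this thread; the helper name and the
database name are made up for the example. On later PostgreSQL releases
an additional "-T template0" may be needed when the requested encoding
differs from that of the template database.

```python
import shlex
from typing import List


def createdb_command(dbname: str, encoding: str = "UTF8") -> List[str]:
    """Build (but do not run) a createdb command line that overrides the
    cluster's default encoding for one database."""
    return ["createdb", "-E", encoding, dbname]


# Print the command as it would be typed in a shell.
print(shlex.join(createdb_command("mydb")))
```

The per-database override works regardless of the default chosen at
initdb time, which is why a control panel that hides "-E" leaves its
users stuck with the cluster default.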

Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
encoding for initdb is ..... Ta-daaaa!!! Unicode !!!

Carlos
--


Re: Please change default characterset for database cluster

From
Michael Fuhr
Date:
On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!

No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
it's because initdb is picking it up from your environment.
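
A rough sketch of the "derive it from the environment, else fall back"
rule, in Python. This is a simplified model and not initdb's actual
logic: the lookup table is a tiny hypothetical sample, and the parsing
assumes locale names of the form language_TERRITORY.codeset.

```python
# Hypothetical, deliberately tiny mapping from locale codesets to
# PostgreSQL encoding names; a real table would be much longer.
CODESET_TO_ENCODING = {
    "utf8": "UTF8",
    "iso88591": "LATIN1",
    "eucjp": "EUC_JP",
}


def default_encoding(locale_name: str) -> str:
    """Return the encoding implied by a locale name such as
    'en_US.UTF-8', or 'SQL_ASCII' when none can be derived."""
    _, _, codeset = locale_name.partition(".")
    codeset = codeset.split("@", 1)[0]  # drop modifiers such as '@euro'
    key = codeset.lower().replace("-", "").replace("_", "")
    return CODESET_TO_ENCODING.get(key, "SQL_ASCII")


for name in ("en_US.UTF-8", "de_DE.ISO-8859-1", "C", "zh_TW"):
    print(name, "->", default_encoding(name))
```

Under this model a server started with a locale of "C", or with a locale
name that carries no codeset, ends up with SQL_ASCII, which matches the
situation described by the original poster.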

http://www.postgresql.org/docs/8.2/interactive/app-initdb.html

"The default is derived from the locale, or SQL_ASCII if that does not work."

--
Michael Fuhr

Re: Please change default characterset for database cluster

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.

Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
one wonders why his default locale setting isn't using UTF8 ...

            regards, tom lane

Re: Please change default characterset for database cluster

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/28/07 21:12, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
>> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
>> it's because initdb is picking it up from your environment.
>
> Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
> one wonders why his default locale setting isn't using UTF8 ...

He uses Windows?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG/bXTS9HxQb37XmcRAjquAJ9EkSRbr4oHmQsFgKbSl7KJzZbqwgCfWp41
6ccK8ThWWoScU9yv3nCq3xQ=
=xcfs
-----END PGP SIGNATURE-----

Re: Please change default characterset for database cluster

From
Carlos Moreno
Date:
Michael Fuhr wrote:
> On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
>
>> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
>> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!
>>
>
> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.
>
> http://www.postgresql.org/docs/8.2/interactive/app-initdb.html
>
> "The default is derived from the locale, or SQL_ASCII if that does not work."
>

Right --- I made the "over-assumption" based on the fact that all the
systems where I've installed it (all Fedora flavors of Linux) use UTF8 as
system locale, and thus that one in a sense becomes the "default" ...
Not sure about other flavors of Unix, but certainly on the Windows world
all bets would be off (not like anyone would care, huh?  ;-) )

Carlos
--


Re: Please change default characterset for database cluster

From
Magnus Hagander
Date:
Ron Johnson wrote:
> On 09/28/07 21:12, Tom Lane wrote:
>> Michael Fuhr <mike@fuhr.org> writes:
>>> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
>>> it's because initdb is picking it up from your environment.
>> Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
>> one wonders why his default locale setting isn't using UTF8 ...
>
> He uses Windows?

Just FYI: The next version of the Windows installer will attempt to pick
up the locale from the environment. If that succeeds, it will use that
locale and UNICODE encoding. Only if that fails will it pick SQL_ASCII.

//Magnus

Re: Please change default characterset for database cluster

From
"CN"
Date:
On Fri, 28 Sep 2007 21:32:43 -0400, "Carlos Moreno"
<moreno_pg@mochima.com> said:
> CN wrote:
> > Hi!
> > "initdb" uses SQL_ASCII as the default character set encoding when it is
> > not given the "-E" option and when it cannot derive one from the
> > locale. I suggest that "initdb" use UNICODE instead of SQL_ASCII in that
> > case, because UNICODE is far more useful than SQL_ASCII.
> >
>
> In addition to the general comment that the world does not necessarily
> revolve around you, and that you should not expect all software products
> in the world to be customized to suit *your* needs, I have to highlight
> how horrifying this is:
>
> > Not all webmasters are willing to spend time reading "initdb"
> > documentation.
> This is truly horrifying --- well, fortunately, one could hope that it
> is as wrong as the rest of your message; that dumb and lazy end users
> and computer illiterate people are not willing to spend time reading
> documentation or instructions is ok... But webmasters and database
> administrators??? Do you *seriously* expect that some highly complex
> software like a DB server should be handled by people who are not
> willing to read documentation???? That's the most preposterous notion
> I've read in the last few months!
>
> Another detail to add --- for a lot of people, Unicode is a useless
> feature that has a very important performance hit. For a *very large*
> fraction of applications, I see it generally advised to use a database
> with no encoding (which SQL_ASCII essentially is), and in the situations
> where some locale-aware processing is needed, then the client
> application can do it.
>
> Of course, there are also many many applications where a DB with
> Unicode encoding is very useful. In those cases, the administrators
> can create a database with Unicode encoding (you seem to be one of
> those that are too busy to be willing to spend time reading the
> documentation of *createdb*), regardless of what default encoding was
> specified with initdb.
>
> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!
>
> Carlos

Various people have various perceptions. I don't feel that my suggestion
only serves to make PostgreSQL a software product fitting only
*myself*. On the contrary, I believe PostgreSQL will become suitable for
more novice users if initdb uses UNICODE as the default character set
when it is not given the "-E" option and when it cannot derive a
character set from the locale.

As I stated, not all webmasters or DBAs are advanced software
administrators. I suspect that many, many webmasters and DBAs in the
world try to set up their web sites using only a mouse, never touching
their keyboards or reading manuals. And I suspect this is one of the
reasons MyZql is so popular - so much more popular than PostgreSQL,
although it is far less powerful and has far fewer features than the
latter. I have been using PostgreSQL since 6.5.x. I chose it because I
noticed that PostgreSQL was the only open source DBMS that supported
subqueries and user-defined functions at that time. But how come MyZql
has become more popular than PostgreSQL today? I have my own answers to
this:

1. "MyZql" is easier to pronounce and remember than "PostgreSQL".
2. MyZql rolled out MyZql.exe earlier than PostgreSQL.

Answer 1 is a very important reason, but I don't intend to talk about it
here. I believe MyZql's success in terms of market share is largely
due to its Windowz product. Why? Because many (and perhaps most)
people started their businesses by using a mouse. They were obviously not
advanced DBAs or experts at the beginning. However, they felt they
successfully got their jobs done with only a mouse!

I feel PostgreSQL can also consider this marketing strategy:

Keep providing advanced features for advanced users, as it always has,
but also first help novices, who know only how to use a mouse, get
their jobs done.

Yes, UNICODE results in poorer performance than SQL_ASCII. However, this
is not a problem at all, because advanced users will use "-E" when they
only need SQL_ASCII. On the contrary, novices who actually need
UNICODE but get SQL_ASCII after PostgreSQL installation usually walk
away and embrace MyZql, which appears to always be able to help them
set up their first web site with a few mouse clicks and with all the
default values prompted by MyZql-install.exe.

As for my unhappy experience, the webmaster must have used initdb without
the "-E" option to initialize his database cluster. He also used a cPanel
which does not provide the "-E" option for createdb. I posted a request to
that site asking him to provide the "-E" option for createdb in his cPanel.
That webmaster said that he cannot program cPanel. Another user replied
to me by asking: "Why don't you simply use MyZql?".  The net result is
that I left his site and reduced the total number of PostgreSQL users
on his site.

Regards,

CN

--
http://www.fastmail.fm - The way an email service should be


Re: Please change default characterset for database cluster

From
brian
Date:
>>CN wrote:
> ... MyZql ... MyZql
>
> 1. "MyZql" is easier to pronounce and remember than "PostgreSQL".

Actually, that's *MySQL*.

> Yes, UNICODE results in poorer performance than SQL_ASCII. However, this
> is not a problem at all, because advanced users will use "-E" when they
> only need SQL_ASCII. On the contrary, novices who actually need
> UNICODE but get SQL_ASCII after PostgreSQL installation usually walk
> away and embrace MyZql, which appears to always be able to help them
> set up their first web site with a few mouse clicks and with all the
> default values prompted by MyZql-install.exe.

The default for MySQL is latin1 with swedish sorting.

> As for my unhappy experience, the webmaster must have used initdb without
> the "-E" option to initialize his database cluster. He also used a cPanel
> which does not provide the "-E" option for createdb. I posted a request to
> that site asking him to provide the "-E" option for createdb in his cPanel.
> That webmaster said that he cannot program cPanel. Another user replied
> to me by asking: "Why don't you simply use MyZql?".  The net result is
> that I left his site and reduced the total number of PostgreSQL users
> on his site.

You've reduced the total number of users, period. That's as it should
be. It's really quite simple: if they cannot give you the level of
service you require then go somewhere else. It's silly to suggest that
the Postgres developers should alter the default behaviour simply so
that you can continue paying money for service that is clearly
inadequate to your needs.

brian

Re: Please change default characterset for database cluster

From
Ron Johnson
Date:
On 09/30/07 10:31, brian wrote:
[snip]
>
> The default for MySQL is latin1 with swedish sorting.

Yorn desh born, der ritt de gitt der gue
Orn desh, dee born desh, de umn børk! børk! børk!

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: Please change default characterset for database cluster

From
Anton Andreev
Date:
Hi,

Get a VPS - Virtual Private Server. Mine is $29 and it is fine, with 480MB
of RAM and enough disk space.  I am a full admin on my server, so I install
and configure PostgreSQL without problems.

YES! I agree that the default encoding must be UTF-8. I started using
PostgreSQL because I had problems with encodings in MySQL! :)

P.S. I did have problems, though. On my VPS I am unable to change the language
for non-Unicode programs, so it is left as English (United States), which
bothers PostgreSQL with my database encoding WIN, but that is another story.

Ron Johnson wrote:
> On 09/30/07 10:31, brian wrote:
> [snip]
>
>> The default for MySQL is latin1 with swedish sorting.
>>
>
> Yorn desh born, der ritt de gitt der gue
> Orn desh, dee born desh, de umn børk! børk! børk!
>
>



Re: Please change default characterset for database cluster

From
Martijn van Oosterhout
Date:
On Sun, Sep 30, 2007 at 11:55:00AM +0800, CN wrote:
> Various people have various perceptions. I don't feel that my suggestion
> only serves to make PostgreSQL a software product fitting only
> *myself*. On the contrary, I believe PostgreSQL will become suitable for
> more novice users if initdb uses UNICODE as the default character set
> when it is not given the "-E" option and when it cannot derive a
> character set from the locale.

At the end of the day the problem is incompatibility between locale and
encoding. If the webmaster can't get the encoding right then they sure
as hell aren't going to get the locale right. And mismatched
encoding/locale will give you even worse problems.

It is a problem with postgres that it relies on the OS for these
things. Patches have been floated over the years and rejected for all
sorts of reasons. Maybe someday it will get fixed.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
