Discussion: Please change default characterset for database cluster
Hi!
"initdb" uses SQL_ASCII as the default characterset encoding when it is
not given option "-E" and when it can not correctly derive one from
locale. I suggest "initdb" use UNICODE instead of SQL_ASCII because
UNICODE is far more useful than SQL_ASCII.

Not all webmasters are willing to spend time reading "initdb"
documentation. I have encountered a free web hosting providing
PhpPgAdmin through which I can create my databases. Problem is that all
newly created databases use SQL_ASCII which is completely useless to me.
Their PhpPgAdmin script does not support "-E" switch for "createdb". As
a result, I have to abandon that service altogether. If "initdb"
used UNICODE as the default characterset, everything would be perfect.

Regards,
CN
--
http://www.fastmail.fm - Same, same, but different
CN wrote:
> Hi!
> "initdb" uses SQL_ASCII as the default characterset encoding when it is
> not given option "-E" and when it can not correctly derive one from
> locale. I suggest "initdb" use UNICODE instead of SQL_ASCII because
> UNICODE is far more useful than SQL_ASCII.
>
> Not all webmasters are willing to spend time reading "initdb"
> documentation. I have encountered a free web hosting providing
> PhpPgAdmin through which I can create my databases. Problem is that all
> newly created databases use SQL_ASCII which is completely useless to me.
> Their PhpPgAdmin script does not support "-E" switch for "createdb". As
> a result, I have to abandon that service altogether. If "initdb"
> used UNICODE as the default characterset, everything would be perfect.

In addition to the general comment that the world does not necessarily
revolve around you, and that you should not expect all software products
in the world to be customized to suit *your* needs, I have to highlight
how horrifying this is:

> Not all webmasters are willing to spend time reading "initdb"
> documentation.

This is truly horrifying --- well, fortunately, one could hope that it
is as wrong as the rest of your message; that dumb and lazy end users
and computer illiterate people are not willing to spend time reading
documentation or instructions is ok... But webmasters and database
administrators??? Do you *seriously* expect that some highly complex
software like a DB server should be handled by people who are not
willing to read documentation???? That's the most preposterous notion
I've read in the last few months!

Another detail to add --- for a lot of people, Unicode is a useless
feature that has a very important performance hit. For a *very large*
fraction of applications, I see it generally advised to use a database
with no encoding (which SQL_ASCII essentially is), and in the situations
where some locale-aware processing is needed, then the client
application can do it.

Of course, there are also many many applications where a DB with
Unicode encoding is very useful. In those cases, the administrators
can create a database with Unicode encoding (you seem to be one of
those that are too busy to be willing to spend time reading the
documentation of *createdb*), regardless of what default encoding was
specified with initdb.

Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
encoding for initdb is ..... Ta-daaaa!!! Unicode !!!

Carlos
--
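For readers following along, the per-database override described in the message above can be sketched roughly as below. This is only an illustration: the database name `mydb` and the helper `createdb_command` are hypothetical, and the only things taken from the thread are the `createdb` program, its "-E" switch, and the encoding name (UTF8, formerly UNICODE).

```python
import shlex
from typing import List


def createdb_command(dbname: str, encoding: str = "UTF8") -> List[str]:
    """Build the argument list for createdb with an explicit encoding.

    Hypothetical helper for illustration; it only assembles the command
    and does not run it.
    """
    # "-E" selects the encoding for the new database, independent of
    # whatever default the cluster was initialized with.
    return ["createdb", "-E", encoding, dbname]


# "mydb" is a made-up database name.
print(shlex.join(createdb_command("mydb")))
```

The point of the sketch is that the cluster-wide default chosen at initdb time only matters when nobody passes "-E" at database creation, which is exactly the situation the original poster's hosting panel put him in.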
On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!

No, it isn't. If you get UTF8 (formerly UNICODE) as a default then
it's because initdb is picking it up from your environment.

http://www.postgresql.org/docs/8.2/interactive/app-initdb.html

"The default is derived from the locale, or SQL_ASCII if that does not work."

--
Michael Fuhr
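The rule quoted from the documentation ("derived from the locale, or SQL_ASCII if that does not work") might be pictured with the sketch below. It is not initdb's actual code, which the thread does not show: the function name `default_encoding`, the approach of parsing the codeset out of the locale name, and the small mapping table are all assumptions made for illustration.

```python
# Illustrative subset of locale codesets mapped to PostgreSQL encoding
# names. This table is an assumption for the sketch, not initdb's own.
CODESET_TO_PG = {
    "UTF-8": "UTF8",
    "UTF8": "UTF8",
    "ISO-8859-1": "LATIN1",
    "ISO8859-1": "LATIN1",
}


def default_encoding(locale_name: str) -> str:
    """Pick an encoding from a locale name such as 'en_US.UTF-8'.

    Falls back to SQL_ASCII when the name carries no codeset or the
    codeset is not in the table.
    """
    # Drop a modifier such as "@euro" before looking for the codeset.
    base = locale_name.split("@", 1)[0]
    if "." not in base:
        # Locales like "C" or "POSIX" name no codeset at all.
        return "SQL_ASCII"
    codeset = base.split(".", 1)[1].upper()
    return CODESET_TO_PG.get(codeset, "SQL_ASCII")


for name in ("en_US.UTF-8", "de_DE.ISO-8859-1@euro", "C"):
    print(name, "->", default_encoding(name))
```

Under these assumptions, a system whose locale names a UTF-8 codeset ends up with UTF8 without anyone passing "-E", while a bare "C" locale falls through to SQL_ASCII.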
Michael Fuhr <mike@fuhr.org> writes:
> No, it isn't. If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.

Which initdb has done since 8.0. If the OP is such a rabid UTF8 fan,
one wonders why his default locale setting isn't using UTF8 ...

regards, tom lane
On 09/28/07 21:12, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
>> No, it isn't. If you get UTF8 (formerly UNICODE) as a default then
>> it's because initdb is picking it up from your environment.
>
> Which initdb has done since 8.0. If the OP is such a rabid UTF8 fan,
> one wonders why his default locale setting isn't using UTF8 ...

He uses Windows?

--
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
Michael Fuhr wrote:
> On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
>
>> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
>> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!
>>
>
> No, it isn't. If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.
>
> http://www.postgresql.org/docs/8.2/interactive/app-initdb.html
>
> "The default is derived from the locale, or SQL_ASCII if that does not work."
>

Right --- I made the "over-assumption" based on the fact that all the
systems where I've installed it (all Fedora flavors of Linux) use UTF8
as system locale, and thus that one in a sense becomes the "default" ...

Not sure about other flavors of Unix, but certainly in the Windows world
all bets would be off (not like anyone would care, huh? ;-) )

Carlos
--
Ron Johnson wrote:
> On 09/28/07 21:12, Tom Lane wrote:
>> Michael Fuhr <mike@fuhr.org> writes:
>>> No, it isn't. If you get UTF8 (formerly UNICODE) as a default then
>>> it's because initdb is picking it up from your environment.
>> Which initdb has done since 8.0. If the OP is such a rabid UTF8 fan,
>> one wonders why his default locale setting isn't using UTF8 ...
>
> He uses Windows?

Just FYI: The next version of the Windows installer will attempt to pick
up the locale from the environment. If that succeeds, it will use that
locale and UNICODE encoding. Only if that fails will it pick SQL_ASCII.

//Magnus
On Fri, 28 Sep 2007 21:32:43 -0400, "Carlos Moreno" <moreno_pg@mochima.com> said:
> CN wrote:
> > Hi!
> > "initdb" uses SQL_ASCII as the default characterset encoding when it is
> > not given option "-E" and when it can not correctly derive one from
> > locale. I suggest "initdb" use UNICODE instead of SQL_ASCII because
> > UNICODE is far more useful than SQL_ASCII.
> >
>
> In addition to the general comment that the world does not necessarily
> revolve around you, and that you should not expect all software products
> in the world to be customized to suit *your* needs, I have to highlight
> how horrifying this is:
>
> > Not all webmasters are willing to spend time reading "initdb"
> > documentation.
> This is truly horrifying --- well, fortunately, one could hope that it
> is as wrong as the rest of your message; that dumb and lazy end users
> and computer illiterate people are not willing to spend time reading
> documentation or instructions is ok... But webmasters and database
> administrators??? Do you *seriously* expect that some highly complex
> software like a DB server should be handled by people who are not
> willing to read documentation???? That's the most preposterous notion
> I've read in the last few months!
>
> Another detail to add --- for a lot of people, Unicode is a useless
> feature that has a very important performance hit. For a *very large*
> fraction of applications, I see it generally advised to use a database
> with no encoding (which SQL_ASCII essentially is), and in the situations
> where some locale-aware processing is needed, then the client
> application can do it.
>
> Of course, there are also many many applications where a DB with
> Unicode encoding is very useful. In those cases, the administrators
> can create a database with Unicode encoding (you seem to be one of
> those that are too busy to be willing to spend time reading the
> documentation of *createdb*), regardless of what default encoding was
> specified with initdb.
>
> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
> encoding for initdb is ..... Ta-daaaa!!! Unicode !!!
>
> Carlos

Various people have various perceptions. I don't feel that my suggestion
only serves to make PostgreSQL become a software product fitting only
*myself*. On the contrary, I believe PostgreSQL will become suitable for
more novice users if initdb will use UNICODE as the default characterset
when it is not given option "-E" and when it can not correctly derive a
characterset from locale.

As I stated, not all webmasters or DBAs are advanced software
administrators. I suspect there are many, many webmasters and DBAs in
the world who try to set up their web sites using only a mouse, never
touching their keyboards or reading manuals. And I suspect this is one
of the reasons MyZql is so popular, so much more popular than
PostgreSQL, although it is far less powerful and has far fewer features
than the latter.

I have been using PostgreSQL since 6.5.x. I chose it because I noticed
that PostgreSQL was the only open source DBMS that supported subqueries
and user-defined functions at that time. But how come MyZql has become
more popular than PostgreSQL today? I have my own answers to this:

1. "MyZql" is easier to pronounce and remember than "PostgreSQL".
2. MyZql rolled out MyZql.exe earlier than PostgreSQL.

Answer 1 is a very important reason but I don't intend to talk about it
here. I believe MyZql's success in terms of market share is largely
due to its Windowz product. Why? Because many (and perhaps most)
people started their businesses by using a mouse. They are obviously not
advanced DBAs or experts at the beginning. However, they felt they
successfully got their jobs done with only a mouse!

I feel PostgreSQL can also consider this marketing strategy: keep
providing advanced features for advanced users, as it always has, but
also first help novices, who know only how to use a mouse, get their
jobs done.
Yes, UNICODE results in poorer performance than SQL_ASCII. However, this
is not a problem at all because advanced users will use "-E" when they
only need SQL_ASCII. On the contrary, novices who actually need
UNICODE but get SQL_ASCII after PostgreSQL installation usually walk
away and embrace MyZql which appears to be able to always help them
set up their first web site with a few mouse clicks and with all the
default values prompted by MyZql-install.exe.

As in my unhappy experience, the webmaster must have used initdb without
"-E" option to initialize his database cluster. He also used a cPanel
which does not provide "-E" option for createdb. I posted a request to
that site asking for providing "-E" option for createdb by his cPanel.
That webmaster said that he can not program cPanel. Another user replied
by asking: "Why don't you simply use MyZql?". The net result is
that I left his site and reduced the total number of PostgreSQL users
from his site.

Regards,
CN
--
http://www.fastmail.fm - The way an email service should be

CN wrote:
> ... MyZql ... MyZql
>
> 1. "MyZql" is easier to pronounce and remember than "PostgreSQL".

Actually, that's *MySQL*.

> Yes, UNICODE results in poorer performance than SQL_ASCII. However, this
> is not a problem at all because advanced users will use "-E" when they
> only need SQL_ASCII. On the contrary, novices who actually need
> UNICODE but get SQL_ASCII after PostgreSQL installation usually walk
> away and embrace MyZql which appears to be able to always help them
> set up their first web site with a few mouse clicks and with all the
> default values prompted by MyZql-install.exe.

The default for MySQL is latin1 with swedish sorting.

> As in my unhappy experience, the webmaster must have used initdb without
> "-E" option to initialize his database cluster. He also used a cPanel
> which does not provide "-E" option for createdb. I posted a request to
> that site asking for providing "-E" option for createdb by his cPanel.
> That webmaster said that he can not program cPanel. Another user replied
> by asking: "Why don't you simply use MyZql?". The net result is
> that I left his site and reduced the total number of PostgreSQL users
> from his site.

You've reduced the total number of users, period. That's as it should
be. It's really quite simple: if they cannot give you the level of
service you require then go somewhere else. It's silly to suggest that
the Postgres developers should alter the default behaviour simply so
that you can continue paying money for service that is clearly
inadequate to your needs.

brian
On 09/30/07 10:31, brian wrote:
[snip]
>
> The default for MySQL is latin1 with swedish sorting.

Yorn desh born, der ritt de gitt der gue
Orn desh, dee born desh, de umn børk! børk! børk!

--
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
Hi,

Get a VPS - Virtual Private Server. Mine is $29 and it is fine, with
480MB RAM and enough disk space. I am a full admin on my server, so I
install and configure PostgreSQL without problems.

YES! I agree that the default encoding must be UTF-8. I started using
PostgreSQL because I had problems with encodings in MySQL! :)

p.s. I had problems indeed. On my VPS I am unable to change the language
for non-unicode programs, so it is left at English (United States), which
bothers PostgreSQL with my database encoding WIN, but that is another
story.

Ron Johnson wrote:
> On 09/30/07 10:31, brian wrote:
> [snip]
>
>> The default for MySQL is latin1 with swedish sorting.
>>
>
> Yorn desh born, der ritt de gitt der gue
> Orn desh, dee born desh, de umn børk! børk! børk!
>
>
On Sun, Sep 30, 2007 at 11:55:00AM +0800, CN wrote:
> Various people have various perceptions. I don't feel that my suggestion
> only serves to make PostgreSQL become a software product fitting only
> *myself*. On the contrary, I believe PostgreSQL will become suitable for
> more novice users if initdb will use UNICODE as the default characterset
> when it is not given option "-E" and when it can not correctly derive a
> characterset from locale.

At the end of the day the problem is incompatibility between locale and
encoding. If the webmaster can't get the encoding right then they sure
as hell aren't going to get the locale right. And mismatched
encoding/locale will give you even worse problems.

It is a problem with postgres that it relies on the OS for these things.
Patches have been floated over the years and rejected for all sorts of
reasons. Maybe someday it will get fixed.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.