Discussion: New database: SQL_ASCII vs UTF-8 trade-offs
"PostgreSQL 8.1.0 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.3 20051111 (prerelease) (Debian 4.0.2-4)"

Hi,

I am having some doubts about whether a new db should use SQL_ASCII or UTF-8 encoding. We expect ALL of our data to be ASCII. At the same time, I guess it's possible that some user may decide to get creative and enter, for example, his own name with non-ASCII chars. So it seems that UTF-8 would be a better choice even if we plan to store only ASCII data (a lot of ASCII data, though).

Are there any negative effects related to the selection of UTF-8 over SQL_ASCII (e.g. size of the database, sort/like/group issues, etc.)?

Thanks in advance
On Tuesday, 7 March 2006 15:08, ow wrote:
> Are there any negative effects related to the selection of UTF-8 over
> SQL_ASCII (e.g. size of the database, sort/like/group issues, etc)?

If you're only planning to store ASCII data, choosing UTF-8 will not cause any additional problems. But obviously you're more future-proof that way.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
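On the size part of the question: UTF-8 encodes every ASCII character as the same single byte, so pure-ASCII text occupies the same number of bytes either way. A minimal Python sketch that checks this and shows the growth once a non-ASCII name appears (the sample names and the helper `encoded_sizes` are made up for illustration; this says nothing about PostgreSQL's own storage overhead):

```python
def encoded_sizes(text):
    """Return (character count, UTF-8 byte count) for text."""
    return len(text), len(text.encode("utf-8"))

ascii_name = "Peter Smith"                 # hypothetical sample value
accented_name = "P\u00e9ter \u0160mith"    # hypothetical non-ASCII variant

# Pure ASCII: the UTF-8 bytes are identical to the ASCII bytes.
assert ascii_name.encode("utf-8") == ascii_name.encode("ascii")

print(encoded_sizes(ascii_name))     # (11, 11)
print(encoded_sizes(accented_name))  # (11, 13): two characters need 2 bytes each
```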
ow <oneway_111@yahoo.com> writes:
> Are there any negative effects related to the selection of UTF-8 over SQL_ASCII

There will be a speed penalty; whether it's significant in your application is something you can only determine by experiment.

regards, tom lane
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ow <oneway_111@yahoo.com> writes:
> > Are there any negative effects related to the selection of UTF-8 over
> > SQL_ASCII
>
> There will be a speed penalty; whether it's significant in your
> application is something you can only determine by experiment.

I see... If *ALL* data is in ASCII, is it possible to just update "pg_database.encoding" to UTF-8 or will I need to recreate the db?

Thanks
ow <oneway_111@yahoo.com> writes:
> I see... If *ALL* data is in ASCII, is it possible to just update
> "pg_database.encoding" to UTF-8 or will I need to recreate the db?

It seems risky, but you could probably get away with that as long as the database locale (LC_COLLATE/LC_CTYPE) is "C" ... which is really the only one that's safe with SQL_ASCII anyway ... note that already-started backends will probably fail to notice such a change.

regards, tom lane
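The premise "ALL data is in ASCII" can be verified before attempting anything like this, for example against a plain-text dump of the database. A minimal Python sketch, assuming the dump is available as bytes; the function names and the sample data are hypothetical, and the check covers only the bytes, not the locale condition mentioned above:

```python
def find_non_ascii(data, limit=10):
    """Return up to `limit` (line_number, byte_offset_in_line) pairs for bytes >= 0x80."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for offset, byte in enumerate(line):
            if byte >= 0x80:  # anything outside 7-bit ASCII
                hits.append((line_no, offset))
                if len(hits) >= limit:
                    return hits
    return hits

def is_pure_ascii(data):
    """True if no byte in data is outside the 7-bit ASCII range."""
    return not find_non_ascii(data, limit=1)

# Made-up sample: the third line contains a UTF-8 encoded "é" (0xC3 0xA9).
sample = b"id\tname\n1\tow\n2\tJos\xc3\xa9\n"
print(find_non_ascii(sample))  # [(3, 5), (3, 6)]
print(is_pure_ascii(sample))   # False
```

For a real dump, the bytes would come from something like `open(path, "rb").read()`, where `path` is whatever file the dump was written to.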
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It seems risky, but you could probably get away with that as long
> as the database locale (LC_COLLATE/LC_CTYPE) is "C" ... which is really
> the only one that's safe with SQL_ASCII anyway ...

I actually created the cluster with:

test1:~# /usr/lib/postgresql/8.1/bin/initdb --pwprompt -D /var/lib/postgresql/8.1/main/ --lc-collate=POSIX
test1:~# locale
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Not sure if it's going to make a difference.

Thanks
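"POSIX" is the standard's other name for the "C" locale, so the two categories named in the quoted condition (LC_COLLATE and LC_CTYPE) can be checked against either spelling. A minimal Python sketch that parses `locale` output like the above; the function names are hypothetical, and it inspects only the shell environment text it is given, not what the cluster itself recorded at initdb time:

```python
def parse_locale_output(text):
    """Parse `locale` output (NAME=value or NAME="value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            settings[name.strip()] = value.strip().strip('"')
    return settings

def is_c_locale(settings, categories=("LC_COLLATE", "LC_CTYPE")):
    """True if every listed category is set to "C" or its alias "POSIX"."""
    return all(settings.get(cat) in ("C", "POSIX") for cat in categories)

# Abbreviated copy of the output shown in the message.
sample = 'LANG=\nLC_CTYPE="POSIX"\nLC_COLLATE="POSIX"\nLC_ALL=\n'
print(is_c_locale(parse_locale_output(sample)))  # True
```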