Re: Impact of UNICODE encoding on performance

From Reshat Sabiq
Subject Re: Impact of UNICODE encoding on performance
Date
Msg-id 40591712.3040905@purdue.edu
In reply to Re: Impact of UNICODE encoding on performance  (Aarni Ruuhimäki <aarni.ruuhimaki@kymi.com>)
Responses Re: Impact of UNICODE encoding on performance  (Harry Mantheakis <harry@mantheakis.freeserve.co.uk>)
List pgsql-novice
I'm not very knowledgeable on this, but I think you should try UTF-8 from the start, given your expectations. I am able to save UTF-8 strings into a LATIN-1 db and retrieve them using JDBC, but viewing them in pgAdmin III is not a pretty sight (understandably). I haven't used this setup extensively, though, and I think that queries (comparisons) might be affected by it (i.e., a string of 2 characters whose bytes correspond to 1 UTF-8 character would compare equal to its UTF-8 counterpart, which is clearly not intended). On the other hand, it is also conceivable that queries won't be affected, if no meaningless overlaps like that can occur.
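To make the overlap concrete, here is a minimal Python sketch (not JDBC or PostgreSQL itself) of what happens when the UTF-8 bytes of one character are read back as LATIN-1. The variable names are only illustrative, and the assumption is that the database stores the bytes unchanged, as it appears to in the setup described above.

```python
# Sketch: UTF-8 bytes stored in, and read back from, a LATIN-1 context.
original = "\u00e4"                        # one character: a with diaeresis
utf8_bytes = original.encode("utf-8")      # two bytes: 0xC3 0xA4
as_latin1 = utf8_bytes.decode("latin-1")   # each byte becomes one LATIN-1 character

print(len(original), len(as_latin1))       # character counts differ: 1 versus 2

# The stored form is byte-identical to a genuine two-character LATIN-1 string,
# so a byte-level comparison cannot tell the two apart.
two_char_string = "\u00c3\u00a4"
print(as_latin1 == two_char_string)

# A client that knows the bytes are really UTF-8 can still recover the text.
recovered = as_latin1.encode("latin-1").decode("utf-8")
print(recovered == original)
```

This would also explain the ugly display in a tool that takes the declared encoding at face value: it shows two LATIN-1 characters where one was intended.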
In general, I have read that Unicode is somewhat slower (understandably), but I don't think the difference is significant. One just needs a sensible character comparison method that does a bitwise comparison first, so I don't think the overhead is big. There are probably studies on the web.
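One part of the overhead that is easy to check is storage size. The Python sketch below compares character counts with UTF-8 byte counts for a few sample strings; `utf8_size` is a hypothetical helper written for this illustration, and the sketch says nothing about comparison or sorting speed inside the database.

```python
def utf8_size(text: str) -> int:
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))

samples = {
    "ascii": "Harry",
    "latin": "Ruuhim\u00e4ki",
    "cyrillic": "\u0445\u043e\u0440\u043e\u0448\u043e",
}

for label, text in samples.items():
    # Print the label, the character count, and the UTF-8 byte count.
    print(label, len(text), utf8_size(text))
```

Plain ASCII text takes the same space either way, accented LATIN-1 characters take two bytes each, and Cyrillic text takes two bytes per character, so the cost depends on the data.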

-- 
Sincerely,
Reshat.

---
If you see my certificate with this message, you should be able to send me encrypted e-mail. 
Please consult your e-mail client for details if you would like to do that.


Aarni Ruuhimäki wrote:
Hi Harry,

Dunno about the performance penalty, but so far I am happy with LATIN1 dbase 
system (RH and Trustix). Even with cyrillic characters. Then again, I work 
with browser interfaces and it's not really up to me what encoding the client 
has or has not installed. <if western, charset=iso-8859-1, if fellow 
russki harasoo charset=windows-1251> is, I guess, a good bet. It's a windows 
world, so far.

Soviet KOI-X X, KOI8-r, KOI8-RU, Mac Cyrillic (Standard), CyrWin Cyrillic and 
the rest of the soup ...

Some experience and my half a pea.

BR,

Aarni


On Tuesday 16 March 2004 12:43, you wrote: 
Hello

I am just setting out on a new project, having recently switched to
PostgreSQL.

My immediate requirements would be satisfied with ISO-8859-1 (LATIN-1)
encoding, but it is conceivable that, if things go really well, somewhere
in the future my character encoding requirements will broaden.

So I am tempted to specify UNICODE from the outset, and be done with it.

But I cannot help wondering how much of a performance penalty this entails.

If the performance hit is not significant, I shall be happy to stick with
UNICODE.

But if anyone has any strong views (or experience) on this issue I shall be
very grateful for some feedback.

Many thanks.

Harry Mantheakis
London, UK

