Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)

From Tatsuo Ishii
Subject Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)
Msg-id 199906111514.AAA00712@ext16.sra.co.jp
In reply to Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)
List pgsql-hackers
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Currently the mb support allows several internal
> > encodings including Unicode and mule-internal-code.
> > (yes, you can do regexp/like to Unicode data if mb support is
> > enabled).
> 
> One of the things that bothers me about makeIndexable() is that it
> doesn't seem to be multibyte-aware; does it really work in MB case?

Yes. This is because I carefully chose multibyte encodings for
the backend that have the following characteristics:

o if the 8th bit of a byte is off then it is an ASCII character
o otherwise it is part of a non-ASCII multibyte character

With these assumptions, makeIndexable() works very well with multibyte
chars.
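
To illustrate the two conditions above, here is a minimal Python sketch
(not backend code). It assumes Python's "utf-8" and "euc_jp" codecs as
stand-ins for encodings of the kind the backend accepts; the helper
names are hypothetical. It checks that every byte of a non-ASCII
character has its 8th bit on, so a byte-wise scan for an ASCII
character such as '%' only finds real occurrences.

```python
def ascii_byte_offsets(data, ascii_char):
    """Return the offsets where a plain byte-wise scan sees ascii_char."""
    target = ord(ascii_char)
    return [i for i, b in enumerate(data) if b == target]


def high_bit_rule_holds(text, encoding):
    """True if every byte of every non-ASCII character has the 8th bit on."""
    for ch in text:
        if ord(ch) >= 0x80 and any(b < 0x80 for b in ch.encode(encoding)):
            return False
    return True


# Hypothetical sample: three Japanese characters followed by ASCII text.
sample = "日本語%abc"
for enc in ("utf-8", "euc_jp"):
    data = sample.encode(enc)
    print(enc, high_bit_rule_holds(sample, enc), ascii_byte_offsets(data, "%"))
```

In both encodings the only '%' byte reported is the one that was
really typed, which is the property a byte-oriented routine relies on.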

Not all multibyte encodings satisfy the above conditions. For example,
SJIS (an encoding for Japanese) and Big5 (for traditional Chinese)
do not satisfy these requirements. In these encodings the first
byte of a double-byte character always has the 8th bit on. However,
the second byte sometimes has the 8th bit off: this means we cannot
distinguish it from ASCII, since it may accidentally match the bit
pattern of an ASCII char. This is why I do not allow SJIS and Big5 as
server encodings.  Users can use SJIS and Big5 for the client encoding,
however.
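
A small Python sketch of the problem, assuming Python's "shift_jis"
and "euc_jp" codecs behave like the encodings discussed here (the
function name is hypothetical). It lists every byte with the 8th bit
off that appears inside the encoding of a non-ASCII character. The
katakana "ソ" is a well-known case: in SJIS its second byte is 0x5C,
the same bit pattern as a backslash.

```python
def stray_ascii_bytes(text, encoding):
    """List (character, byte-as-ASCII) pairs where the encoding of a
    non-ASCII character contains a byte whose 8th bit is off."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # genuine ASCII characters are not a problem
        for b in ch.encode(encoding):
            if b < 0x80:
                hits.append((ch, chr(b)))
    return hits


print(stray_ascii_bytes("ソ", "shift_jis"))  # second byte looks like ASCII
print(stray_ascii_bytes("ソ", "euc_jp"))     # no byte looks like ASCII
```

A byte-wise scanner would treat that second byte as an escape
character, which is the kind of accidental match described above.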

You might ask why I don't make makeIndexable() multibyte-aware.  It is
definitely possible. But you should know there are many places that
would need to be multibyte-aware in this sense. The parser is one
good example. Making everything in the backend multibyte-aware is not
worth doing, in my opinion.
---
Tatsuo Ishii

