Re: Unicode upper() bug still present

From Hannu Krosing
Subject Re: Unicode upper() bug still present
Msg-id 1066653765.4888.6.camel@fuji.krosing.net
In reply to Re: Unicode upper() bug still present  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: Unicode upper() bug still present  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tatsuo Ishii wrote on Mon, 20.10.2003 at 15:37:
> > Tom Lane wrote on Mon, 20.10.2003 at 03:35:
> > > Oliver Elphick <olly@lfix.co.uk> writes:
> > > > There is a bug in Unicode upper() which has been present since 7.2:
> > > 
> > > We don't support upper/lower in multibyte character sets, and can't as
> > > long as the functionality is dependent on <ctype.h>'s toupper()/tolower().
> > > It's been suggested that we could use <wctype.h> where available.
> > > However there are a bunch of issues that would have to be solved to make
> > > that happen.  (How do we convert between the database character encoding 
> > > and the wctype representation?  
> > 
> > How do we do it for sorting ?
> > 
> > > How do we even find out what
> > > representation the current locale setting expects to use?)
> > 
> > Why not use the same locale settings as for sorting (i.e. database
> > encoding) until we have proper multi-locale support in the backend?
> 
> There's absolutely no relationship between database encoding and
> locale. 

How does the system then use locale for sorting and not for upper/lower?

I would rather have expected the opposite, as lower/upper rules are a little
more locale-independent than collation.

> IMO depending on the system locale is a completely wrong
> design decision, and we should move toward having our own collation
> data.

I agree completely. We could probably lift something from IBM's ICU.
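
To make the <ctype.h> versus <wctype.h> point from Tom's mail concrete, here
is a rough sketch in C. It is not how the backend does (or would do) it; the
function names upper_bytewise() and upper_widechar() are made up for
illustration. It assumes the input string is already in the encoding of the
current LC_CTYPE locale, which is exactly the open question about database
encoding raised above, and it relies on the POSIX behaviour of mbstowcs() and
wcstombs() returning the required length when the destination is NULL.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Byte-wise upper-casing with <ctype.h>: every byte is treated as one
 * character, so the bytes of a multibyte sequence are never mapped as a
 * unit. */
static void upper_bytewise(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/* Upper-casing via <wctype.h>: decode to wchar_t using the encoding of the
 * current LC_CTYPE locale, map each character with towupper(), and encode
 * back. Returns 0 on success, -1 if the input is not valid in the current
 * locale, memory runs out, or the result does not fit in out. */
static int upper_widechar(const char *in, char *out, size_t outsize)
{
    assert(in != NULL && out != NULL);

    /* POSIX: with a NULL destination, returns the number of wide chars. */
    size_t nwide = mbstowcs(NULL, in, 0);
    if (nwide == (size_t) -1)
        return -1;

    wchar_t *wbuf = malloc((nwide + 1) * sizeof(wchar_t));
    if (wbuf == NULL)
        return -1;

    mbstowcs(wbuf, in, nwide + 1);
    for (size_t i = 0; i < nwide; i++)
        wbuf[i] = (wchar_t) towupper((wint_t) wbuf[i]);

    /* Bytes needed for the result, not counting the terminator. */
    size_t nbytes = wcstombs(NULL, wbuf, 0);
    int rc = -1;
    if (nbytes != (size_t) -1 && nbytes < outsize) {
        wcstombs(out, wbuf, outsize);
        rc = 0;
    }
    free(wbuf);
    return rc;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first; without it the
program stays in the "C" locale and only ASCII letters are mapped. Whether
non-ASCII characters are upper-cased correctly then depends on the platform's
locale data, which is the dependency on the system locale questioned above.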

-----------------
Hannu


