Re: Unicode combining characters

From: Bruce Momjian
Subject: Re: Unicode combining characters
Date:
Msg-id: 200110011755.f91HtAd10383@candle.pha.pa.us
In response to: Re: Unicode combining characters  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses: Re: Unicode combining characters  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List: pgsql-hackers

Can someone give me TODO items for this discussion?

> > So, this shows two problems :
> > 
> > - length() on the server side doesn't handle Unicode correctly [I get
> >   the same result with char_length()], and returns the number of chars
> >   (as it is, however, advertised to do) rather than the length of the
> >   string.
> 
> This is a known limitation.
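
To make the limitation concrete, here is a minimal sketch in Python (not PostgreSQL code) of the three lengths in play: bytes, code points, and characters as a reader would count them. The helper names are hypothetical, and skipping every code point in a Mark category is only an approximation of proper grapheme-cluster segmentation.

```python
import unicodedata

def code_point_length(s: str) -> int:
    """Number of Unicode code points: what a per-character length() counts."""
    return len(s)

def base_character_length(s: str) -> int:
    """Approximate user-visible length: skip code points in the Mark
    categories (Mn, Mc, Me), which attach to the preceding base character."""
    return sum(1 for ch in s if not unicodedata.category(ch).startswith("M"))

# "eleve" with an acute and a grave accent written as combining characters.
decomposed = "e\u0301le\u0300ve"

print(len(decomposed.encode("utf-8")))    # bytes in UTF-8
print(code_point_length(decomposed))      # code points
print(base_character_length(decomposed))  # base characters
```

For this string the three counts differ, which is the discrepancy reported above.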
> 
> > - the psql frontend makes the same mistake.
> >
> > I am using version 7.1.3 (debian sid), so it may have been corrected
> > in the meantime (in this case, I apologise, but I have only recently
> > started again to use PostgreSQL and I haven't followed -hackers long
> > enough).
> > 
> > 
> > => I think fixing psql shouldn't be too complicated, as glibc
> > should provide the locale and return the right values (is this
> > the case? And what happens for combined Latin + Chinese characters,
> > for example? I'll have to try that later). If it's not fixed already,
> > do you want me to look at this? [It will take some time, as I haven't
> > set up any development environment for postgres yet, and I'm away for
> > one week from Thursday.]
> 
> Sounds great.
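
As a rough illustration of what the psql side needs for column alignment, the sketch below computes a display width in terminal columns, in the spirit of libc's wcwidth(). It is an assumption-laden simplification in Python, not psql code: combining marks count as zero columns, East Asian wide and fullwidth characters as two, everything else as one, and control characters are not treated specially. The names display_width and pad_cell are hypothetical.

```python
import unicodedata

def display_width(s: str) -> int:
    """Approximate number of terminal columns needed to print s."""
    width = 0
    for ch in s:
        # Nonspacing and enclosing marks are drawn over the previous character.
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue
        # Wide (W) and fullwidth (F) characters take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

def pad_cell(s: str, columns: int) -> str:
    """Right-pad s with spaces so that it fills `columns` terminal columns."""
    return s + " " * max(0, columns - display_width(s))

for value in ("abc", "e\u0301", "\u4e2d\u6587"):
    print("|" + pad_cell(value, 6) + "|")
```

Padding by display width rather than by code-point count is what keeps the table borders aligned when Latin text with combining accents and Chinese text appear in the same column.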
> 
> > I was wondering if some people have already thought about this, or
> > already done something, or if some of you are interested in this. If
> > nobody does anything, I'll do something eventually, probably before
> > Christmas (I don't have much time for this, and I don't need the
> > functionality right now), but if there is an interest, I could team
> > with others and develop it faster :)
> 
> I'm very interested in your point. I will start studying [1][2] after
> the beta freeze.
> 
> > Anyway, I'm open to suggestions :
> > 
> > - implement it in C, in the core,
> > 
> > - implement it in C, as contributed custom functions,
> 
> This may be a good starting point.
> 
> > I can't really accept a solution which would rely on the underlying
> > libc, as it may not provide the necessary locales (or maybe, then,
> 
> I totally agree here.
> 
> > The main functions I foresee are :
> > 
> > - provide a normalisation function to all 4 forms,
> > 
> > - provide a collation_key(text, language) function: since calculating
> >   the key may be expensive, some may want to index on the result (I
> >   would :) ),
> > 
> > - provide a collation algorithm, using the two previous facilities,
> >   which can do primary to tertiary collation (cf TR#10 for a detailed
> >   explanation).
> > 
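
To illustrate the first two proposed functions, here is a small Python sketch using the standard unicodedata module. normalise_all returns a string in the four normalisation forms (NFC, NFD, NFKC, NFKD). primary_key is a hypothetical and deliberately crude stand-in for a collation key: it decomposes, drops combining marks and case-folds, so it only mimics a primary-level comparison. It is not the TR#10 algorithm and ignores the language argument of the proposed collation_key(text, language).

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalise_all(text: str) -> dict:
    """Return text in each of the four Unicode normalisation forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

def primary_key(text: str) -> str:
    """Crude primary-level key: ignore accents and case (no language tailoring)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    return stripped.casefold()

# U+FB01 (the "fi" ligature) only changes under the compatibility forms.
for form, value in normalise_all("\ufb01").items():
    print(form, [hex(ord(ch)) for ch in value])

words = ["zebra", "\u00c9lan", "eclair", "\u00e9lan"]
print(sorted(words, key=primary_key))
```

Because the key is an ordinary string computed once per value, it could be stored or indexed, which is the motivation given for exposing the key function separately from the comparison.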
> > I haven't looked at the PostgreSQL code yet (shame!), so I may be
> > completely off-track, in which case I'll withdraw the proposal and
> > won't bother you again (on that subject, that is ;) )...
> > 
> > Comments ?
> --
> Tatsuo Ishii
> 

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

