> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bruce Momjian
> Sent: Sunday, April 10, 2005 8:18 AM
> To: Christopher Kings-Lynne
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Unicode problems on IRC
>
> Christopher Kings-Lynne wrote:
> > Hey guys,
> >
> > The 'Unicode characters above 0x10000' issue keeps rearing its ugly
> > head in the IRC channel. I propose that it be fixed, even backported...
> >
> > This is John Hansen's most recent patch to fix it:
> >
> > http://archives.postgresql.org/pgsql-patches/2004-11/msg00259.php
> >
> > And from what I can tell it was committed, then reverted because it
> > wasn't a "bug". It was going to go in for 8.1.
> >
> > We on the channel are starting to think that it is in fact a bug.
> > There are people with legitimately UTF-8 encoded XML documents
> > that they cannot store in PostgreSQL. Apparently in the distant past,
> > Unicode was limited to 0x10000, but then was extended.
> >
> > Perhaps we can reopen this case...
>
> Uh, I thought we fixed this another way, by not using
> Unicode-aware functions for upper/lower/initcap when the
> locale is "C" or "POSIX".
> That is backpatched to 8.0.X. Does that not fix the problem reported?
No. As Andrew said, what this patch does is allow values > 0xffff while
also validating the input to make sure it is valid UTF-8.
... John
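
To illustrate the two behaviours described above (accepting code points
above 0xffff and rejecting malformed input), here is a minimal Python
sketch. It is not the patch itself and does not reflect PostgreSQL's C
code; the function names utf8_sequence_length and is_valid_utf8 are
hypothetical, and the rules follow the usual UTF-8 definition (no overlong
forms, no surrogates, nothing above U+10FFFF).

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the expected sequence length for a lead byte, or 0 if invalid."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4  # 4-byte sequences carry code points above 0xFFFF
    return 0  # continuation byte, overlong lead (0xC0/0xC1), or lead > 0xF4


def is_valid_utf8(data: bytes) -> bool:
    """Check that data is well-formed UTF-8, including 4-byte sequences."""
    # Smallest code point allowed for each sequence length (rejects overlongs).
    min_code_point = (0, 0, 0x80, 0x800, 0x10000)
    i = 0
    n = len(data)
    while i < n:
        length = utf8_sequence_length(data[i])
        if length == 0 or i + length > n:
            return False  # bad lead byte or truncated sequence
        if length == 1:
            i += 1
            continue
        # Payload bits of the lead byte: 5, 4, or 3 bits for lengths 2, 3, 4.
        code_point = data[i] & (0xFF >> (length + 1))
        for byte in data[i + 1:i + length]:
            if byte & 0xC0 != 0x80:
                return False  # not a continuation byte
            code_point = (code_point << 6) | (byte & 0x3F)
        if code_point < min_code_point[length]:
            return False  # overlong encoding
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return False  # outside Unicode range, or a surrogate
        i += length
    return True
```

With this sketch, a character such as U+1D11E (bytes F0 9D 84 9E) is
accepted, while an overlong sequence such as C0 80 is rejected.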
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman@candle.pha.pa.us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square,
> Pennsylvania 19073