- - wrote:
> Dear PG hackers,
>
> I have two questions regarding Unicode support in PG:
>
> 1) If I set my database and connection encoding to UTF-8, does PG (and
> future versions of it) guarantee that Unicode code points are stored
> unmodified? Or could it be that PG does some Unicode
> normalization/manipulation on them before storing a string, or when
> retrieving a string?
>
> The reason I'm asking is that I've built a little program that reads
> in and stores text and explicitly analyzes the text at a later point
> in time, including things like whether the text is in NFC, NFD or
> neither. Since I want to store the text in the database, it is very
> important that PG not fiddle with the normalization unless my
> program explicitly tells PG to do that.
>
> 2) How far along is normalization support in PG? When I checked a long
> time ago, there was no such support. Now that the SQL standard mandates a
> NORMALIZE function, that may have changed. Any updates?
>
We don't do any normalization. If the client gives us UTF8 then we store
exactly what it gives us, and return exactly that.
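
To illustrate what "exactly what it gives us" means for the NFC/NFD check described in the question, here is a minimal client-side sketch in Python. The names classify and round_trip are hypothetical helpers, not part of any PostgreSQL API, and round_trip only encodes to UTF-8 and decodes again as a stand-in for a store and fetch that does not alter the bytes; it does not talk to a database.

```python
import unicodedata

def classify(text):
    """Return 'NFC', 'NFD', 'both', or 'neither' for the given string."""
    is_nfc = unicodedata.normalize("NFC", text) == text
    is_nfd = unicodedata.normalize("NFD", text) == text
    if is_nfc and is_nfd:
        return "both"      # e.g. plain ASCII is unchanged by either form
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"

def round_trip(text):
    """Hypothetical stand-in for store + fetch with no normalization:
    the UTF-8 bytes go in and the same bytes come back."""
    stored_bytes = text.encode("utf-8")
    return stored_bytes.decode("utf-8")

composed = "\u00e9"              # e-acute as a single code point
decomposed = "e\u0301"           # 'e' followed by a combining acute accent
mixed = composed + decomposed    # deliberately inconsistent

for sample in (composed, decomposed, mixed, "abc"):
    fetched = round_trip(sample)
    # The classification is the same before and after the round trip,
    # because the code points were never modified.
    print(sample.encode("utf-8"), classify(sample), classify(fetched))
```

The composed and decomposed strings render the same but are different byte sequences, so if the stored bytes are returned unmodified, the analysis gives the same answer after retrieval as before storage.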

(This question is not really a -hackers question. The correct forum is
pgsql-general. Please make sure you use the correct forum in future.)

cheers

andrew