psql's display of Unicode combining characters appears to have
changed in 8.2. For example, I'd expect <U+006E LATIN SMALL LETTER N,
U+0303 COMBINING TILDE> to display the same as the precomposed
<U+00F1 LATIN SMALL LETTER N WITH TILDE>. With 8.1's psql they do,
but with 8.2's psql this sequence displays as:
SELECT E'n\314\203';  -- \314\203 = UTF-8 encoding of U+0303
 ?column?
----------
 n\u0303
(1 row)
(I'm testing with both server and client using UTF-8.)
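For anyone who wants to double-check the octal escapes in the query above, here is a minimal sketch of the two-byte UTF-8 encoding. The function name utf8_encode_2byte is made up for this example and is not part of psql or libpq.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the PostgreSQL source): encode a code
 * point in the two-byte UTF-8 range, U+0080..U+07FF.
 */
void
utf8_encode_2byte(unsigned int cp, unsigned char out[2])
{
    assert(cp >= 0x80 && cp <= 0x7FF);

    out[0] = (unsigned char) (0xC0 | (cp >> 6));    /* 110xxxxx: upper five bits */
    out[1] = (unsigned char) (0x80 | (cp & 0x3F));  /* 10xxxxxx: lower six bits */
}
```

For U+0303 this gives the bytes 0xCC 0x83, which are \314 \203 in octal.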
This excerpt from pg_wcsformat() in mbprint.c looks responsible:
    else if (w <= 0)    /* Non-ascii control char */
    {
        if (encoding == PG_UTF8)
            sprintf((char *) ptr, "\\u%04X", utf2ucs(pwcs));
This might be the relevant commit:
http://archives.postgresql.org/pgsql-committers/2006-02/msg00089.php
Should the code distinguish between combining characters and
zero-width control characters so the former display correctly?
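To illustrate the kind of distinction I mean, here is a rough sketch. It is not a patch against mbprint.c: the names classify_ucs and format_escape are hypothetical, and the combining test only covers the Combining Diacritical Marks block (U+0300..U+036F). A real fix would presumably need a fuller table of zero-width combining characters.

```c
#include <assert.h>
#include <stdio.h>

typedef enum
{
    CH_PRINTABLE,
    CH_COMBINING,
    CH_CONTROL
} char_class;

/*
 * Hypothetical classifier. Assumption: only U+0300..U+036F is treated
 * as combining, and only the C0/C1 ranges plus DEL as control.
 */
char_class
classify_ucs(unsigned int ucs)
{
    if (ucs >= 0x0300 && ucs <= 0x036F)
        return CH_COMBINING;
    if (ucs < 0x20 || (ucs >= 0x7F && ucs <= 0x9F))
        return CH_CONTROL;
    return CH_PRINTABLE;
}

/*
 * Write a \uXXXX escape into buf only for control characters.
 * Returns the number of characters written, or 0 if the character
 * should be passed through to the terminal unchanged.
 */
int
format_escape(unsigned int ucs, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen >= 7);

    if (classify_ucs(ucs) != CH_CONTROL)
        return 0;
    return snprintf(buf, buflen, "\\u%04X", ucs);
}
```

With this split, U+0303 would be passed through so the terminal can combine it with the preceding "n", while something like U+0085 would still be shown as an escape.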
--
Michael Fuhr