Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I don't see how UCS-16 could always use only 2 bytes.
Simple: it fails to handle Unicode code points above 0xFFFF. (We only
recently fixed a similar limitation in our UTF8 support, by the by, but
it *is* fixed and I doubt we want to backpedal.)
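
To make the two-byte limit concrete, here is a minimal sketch of UTF-16
encoding. The helper name utf16_encode is hypothetical, not anything in
the backend. A code point that fits in 16 bits takes one unit, which is
all a fixed two-byte encoding can offer; anything from 0x10000 up needs
a surrogate pair, i.e. four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: encode one Unicode code
 * point as UTF-16.  Returns the number of 16-bit units written (1 or 2),
 * or 0 if the value is not an encodable code point.
 */
size_t
utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Out of Unicode range, or a lone surrogate value: reject. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Fits in one 16-bit unit: the only case a 2-byte encoding covers. */
    if (cp < 0x10000)
    {
        out[0] = (uint16_t) cp;
        return 1;
    }

    /* Otherwise split the remaining 20 bits across a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t) (0xD800 | (cp >> 10));      /* high surrogate */
    out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF));    /* low surrogate */
    return 2;
}
```

For example, U+0041 comes back as the single unit 0x0041, while U+1F600
comes back as the pair 0xD83D 0xDE00.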
The problem with embedded null bytes is quite serious though, and I
doubt that we'll ever see the backend natively handling encodings that
require them. It's just not worth the effort. Certainly the idea of
not having to store a length word for CHAR(1) fields is not going to
inspire anyone to invest the effort involved ;-)
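
A rough illustration of the null-byte problem, assuming code that treats
text as ordinary NUL-terminated C strings. The array and function names
here are made up for the example. In UTF-16 every ASCII character
carries a zero byte, so anything built on strlen() and friends stops
after the first character; UTF-8 only produces a zero byte for U+0000
itself.

```c
#include <stddef.h>
#include <string.h>

/* "AB" in UTF-16LE: each ASCII character is followed by a zero byte. */
const char ab_utf16le[] = { 0x41, 0x00, 0x42, 0x00, 0x00, 0x00 };

/* "AB" in UTF-8: no zero byte until the terminator. */
const char ab_utf8[] = { 0x41, 0x42, 0x00 };

/*
 * Hypothetical stand-in for any routine that relies on NUL termination
 * to find the end of a string.
 */
size_t
c_string_length(const char *s)
{
    return strlen(s);
}
```

With these definitions, c_string_length(ab_utf8) sees both bytes of the
string, but c_string_length(ab_utf16le) stops at the zero byte right
after 'A' and reports a length of 1.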
Keep in mind also that any such change would involve putting slower and
more complicated logic into some routines that are hotspots already;
so even if you did all the work involved, you might find the patch
rejected on the grounds that it's a net performance loss. Most of the
developers have plenty of tasks to do with a larger and more certain
reward than this.
regards, tom lane