On Thu, Dec 03, 2009 at 08:33:38AM +0100, Kern Sibbald wrote:
> Bacula gets the "raw" filename from the OS and stores it on the Volume
> then puts it in the database. We treat the filename as if it is UTF-8
> for display purposes, but in all other cases, what we want is for the
> filename to go into the database and come back out unchanged.
How about also storing the encoding of the path/filename? This
would allow the restore to do the right thing for display purposes and
also when going to a system that uses a different encoding. Obviously
you wouldn't know this for Unix derivatives, but for most other systems
this would seem to help.
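A minimal sketch of that idea in Python, assuming the raw bytes are kept
untouched and the encoding is only an optional label next to them. The
names StoredName, display_name and convert_for_target are hypothetical and
are not anything Bacula provides:

```python
from typing import NamedTuple, Optional


class StoredName(NamedTuple):
    raw: bytes               # filename bytes exactly as the OS returned them
    encoding: Optional[str]  # e.g. "latin-1"; None when unknown (typical on Unix)


def display_name(name: StoredName) -> str:
    # For display only: fall back to UTF-8 and replace undecodable bytes.
    return name.raw.decode(name.encoding or "utf-8", errors="replace")


def convert_for_target(name: StoredName, target_encoding: str) -> bytes:
    # Without a known source encoding the only safe choice is to pass
    # the bytes through unchanged.
    if name.encoding is None:
        return name.raw
    return name.raw.decode(name.encoding).encode(target_encoding)
```

With the encoding recorded, StoredName(b"caf\xe9", "latin-1") can be shown
as "café" and rewritten as UTF-8 on restore; without it, the bytes come
back exactly as they went in.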
> On MySQL we use BLOBS. On PostgreSQL, we use TEXT and set the encoding to
> SQL_ASCII so that PostgreSQL will not attempt to do any translation.
> This works well, and I hope that PostgreSQL will continue to support
> letting Bacula insert text characters in the database with no
> character encoding checks in the future.
As others have said, BYTEA is probably the best datatype for you to
use. The encoding of BYTEA literals is a bit of a fiddle and may need
some changes, but it's going to be much more faithful to your needs of
treating the filename as an opaque blob of data.
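To illustrate the "opaque blob" point, here is a rough Python sketch of a
round trip through the BYTEA hex text form (\x followed by two hex digits
per byte). This assumes PostgreSQL 9.0 or later, where that format is
available; older servers only have the escape format. The helper names are
hypothetical, and in practice a driver with parameter binding would
normally do this encoding for you:

```python
def to_bytea_hex(raw: bytes) -> str:
    # BYTEA hex text form: a backslash, an "x", then two hex digits per byte.
    return "\\x" + raw.hex()


def from_bytea_hex(text: str) -> bytes:
    # Inverse of to_bytea_hex; rejects anything not in hex form.
    if not text.startswith("\\x"):
        raise ValueError("not in bytea hex format")
    return bytes.fromhex(text[2:])


# A Latin-1 filename that is not valid UTF-8.
raw_name = b"r\xe9sum\xe9.txt"
encoded = to_bytea_hex(raw_name)
decoded = from_bytea_hex(encoded)
```

No character-set interpretation happens at any point, so the bytes that
come out are the bytes that went in, whatever encoding they were in.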
--
Sam http://samason.me.uk/