> So let me cut to the chase: I'm thinking that rather than store the
> actual character sequence of each field (or some subset of a field)
> in an index why not translate the characters into their collation
> sequence values and store _those_ in the index?
This is not an obvious win, since:
1. some collation rules require multiple passes over the data
2. POSIX strxfrm() will convert strings of characters to a form that can be compared by strcmp() [i.e. single pass]
but tends to greatly increase memory requirements
I've only data for one implementation of strxfrm(), but the memory usage startled me. In my application it was
faster to use strcoll() directly for collation than to pre-expand the data with strxfrm().
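
For anyone who wants to check the trade-off on their own platform, here is a rough sketch of the two approaches. The helper names (xfrm_len, xfrm_key, cmp_via_keys, sign) are made up for illustration; only strxfrm(), strcoll() and strcmp() are the standard calls. How much the key grows depends entirely on the implementation and the locale in effect, so treat this as a way to measure, not as a result.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Bytes needed for the transformed form of s, excluding the NUL.
 * Calling strxfrm() with a size of 0 only reports the length. */
static size_t xfrm_len(const char *s)
{
    return strxfrm(NULL, s, 0);
}

/* Allocate and return the transformed key for s under the current
 * locale. The caller frees it. Returns NULL if malloc() fails. */
static char *xfrm_key(const char *s)
{
    size_t n = xfrm_len(s) + 1;
    char *key = malloc(n);
    if (key != NULL)
        strxfrm(key, s, n);
    return key;
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign(int x)
{
    return (x > 0) - (x < 0);
}

/* Compare a and b by building both keys and using strcmp() on them.
 * Returns -1, 0 or 1, or -2 if a key could not be allocated.
 * An index would store the keys once instead of rebuilding them. */
static int cmp_via_keys(const char *a, const char *b)
{
    char *ka = xfrm_key(a);
    char *kb = xfrm_key(b);
    int r = -2;
    if (ka != NULL && kb != NULL)
        r = sign(strcmp(ka, kb));
    free(ka);
    free(kb);
    return r;
}

/* Print how much larger the key is than the original string. */
static void report_expansion(const char *s)
{
    printf("%s: %zu bytes raw, %zu bytes transformed\n",
           s, strlen(s), xfrm_len(s));
}
```

To try it, call setlocale(LC_COLLATE, "") first so the environment's locale applies, then run report_expansion() over a sample of your real field values. cmp_via_keys() should always agree in sign with strcoll() on the same pair; the open question is whether the stored keys are worth their size.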
Regards,
Giles