Matthew Gabeler-Lee <mgabelerlee@zycos.com> writes:
> foo LIKE 'ab%' can be safely optimized in en_US since it only has
> chars that sort simply. Finding which chars sort simply should be
> straight forward and only require testing 2^8 or maybe 2^16 values.
We have several times thought we had a solution to LIKE optimization
in non-C locales, only to discover the hard way that our solution
didn't cover all cases. After that bitter experience, I am *deeply*
distrustful of any proposal that involves black-box testing of locale
behavior. A black-box test will only find the behaviors that you
already know about.
The only thing that would make me comfortable is a white-box approach:
go in and read the source code to find out what collation behaviors are
possible (I'm assuming here that commercial locale libraries don't have
anything that's not present in an open-source one, eg glibc's locale
code). Once we have that information for sure, we can think about how
to detect and deal with the different sorting oddities. But right now
I have no confidence that we know what we need to deal with.
regards, tom lane