iihero <iihero@gmail.com> writes:
> But I found new issues now (with the latest code from CVS).
> 1. file: contrib\fuzzystrmatch\dmetaphone.c,
> lines 1040 and 464 both look like this:
> case '?:
> There is no matching single quote, and the content is repeated. This
> causes the build to always fail for fuzzystrmatch.
Huh, interesting. Looking at these lines in a strict-C-locale editor,
I see
case '\307': case '\321':
(Emacs is rendering single-byte characters as backslash sequences.)
It appears to me that the code author was using Latin-1 and that these
characters are meant to be C-with-cedilla and N-with-tilde respectively.
It's not entirely surprising that a C compiler thinking the source file
was in UTF-8 would spit up.
We could trivially change the code to be more portable by spelling out
the characters as backslash escapes (i.e., make it as I wrote above rather
than what's really there). But that's just ignoring the real problem,
which is that this code is completely broken in any database encoding
other than Latin-1. Not sure what to do about that. It doesn't look
like it'd be easy to adapt the code for multibyte operation ... and
personally I don't care enough about metaphone to put much work into it.
Anyone want to have a stab at it?
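
For anyone picking this up, here is a minimal sketch of both points: the
escape spelling that any compiler will accept, and why a byte-wise test
still fails outside Latin-1. The helper names is_latin1_special and
count_latin1_specials are hypothetical, made up for illustration; they
are not functions in dmetaphone.c, and the real code is structured
differently.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from dmetaphone.c): test one byte against the
 * two Latin-1 characters, spelled as octal escapes so the source file
 * itself stays pure ASCII regardless of the compiler's input charset.
 */
static int
is_latin1_special(unsigned char c)
{
    switch (c)
    {
        case (unsigned char) '\307':    /* 0xC7, C-with-cedilla in Latin-1 */
        case (unsigned char) '\321':    /* 0xD1, N-with-tilde in Latin-1 */
            return 1;
        default:
            return 0;
    }
}

/*
 * Hypothetical helper: count bytes in a NUL-terminated string that match.
 * This walks the string one byte at a time, which is only meaningful for
 * single-byte encodings.
 */
static int
count_latin1_specials(const char *s)
{
    int         n = 0;
    size_t      i;

    for (i = 0; s[i] != '\0'; i++)
        n += is_latin1_special((unsigned char) s[i]);
    return n;
}
```

With Latin-1 input such as "\307a\321" this counts 2. The same two
characters encoded in UTF-8 are the byte pairs "\303\207" and "\303\221",
and none of those bytes equals 0xC7 or 0xD1, so the count is 0. The
escapes fix the compile failure but not the encoding assumption.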
regards, tom lane