Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Gavin Flower wrote:
>> Not saying there is any problem, but you might like to check how the
>> EUR currency symbol is handled (it is in LATIN2, but not in LATIN1):
> Latin1 doesn't have euro, which is why Latin9 (iso-8859-15) was invented
> IIUC.

Yeah, I doubt there's much to be learned from the euro-sign case.
The Snowball stemmers certainly don't care about euro --- they
only work with alphabetic characters.

Actually, an interesting point is that we could probably use one of the
single-byte-encoding LATIN1 stemmers when the database encoding is LATIN9,
and thereby save a translation to UTF8 and back, since the stemmer logic
isn't going to care about euro signs. Likewise for LATIN2 vs LATIN10.
Not sure it's worth the trouble though.
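
One way to check that idea is to list the byte values that the two encodings decode differently. The sketch below does this with Python's standard codecs; the helper name differing_bytes is made up for illustration and has nothing to do with the PostgreSQL or Snowball sources. Under that assumption it shows that LATIN9 changes eight positions relative to LATIN1, and that seven of them become letters rather than the euro sign, so whether reusing a LATIN1 stemmer is safe would depend on whether the stemmer for the language in question looks at those letters.

```python
import unicodedata


def differing_bytes(enc_a, enc_b):
    """Return (byte, char_a, char_b) for every byte value that the two
    single-byte encodings decode to different characters."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        ca = raw.decode(enc_a)
        cb = raw.decode(enc_b)
        if ca != cb:
            diffs.append((b, ca, cb))
    return diffs


# LATIN1 is ISO-8859-1, LATIN9 is ISO-8859-15.
latin1_vs_latin9 = differing_bytes("iso8859-1", "iso8859-15")

for b, c1, c9 in latin1_vs_latin9:
    # Print character names instead of the characters themselves so the
    # output does not depend on the terminal's encoding.
    print(
        f"0x{b:02X}: "
        f"LATIN1 {unicodedata.name(c1, '?')} (letter={c1.isalpha()}), "
        f"LATIN9 {unicodedata.name(c9, '?')} (letter={c9.isalpha()})"
    )
```

The same function can be pointed at "iso8859-2" and "iso8859-16" to look at the LATIN2 vs LATIN10 case.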

			regards, tom lane