Re: how to ignore accents?
От | Michael Fuhr |
---|---|
Тема | Re: how to ignore accents? |
Дата | |
Msg-id | 20050331004236.GA37961@winnie.fuhr.org обсуждение исходный текст |
Ответ на | Re: how to ignore accents? (Ennio-Sr <nasr.laili@tin.it>) |
Ответы |
Re: how to ignore accents?
("Celia McInnis" <celia@drmath.ca>)
Re: how to ignore accents? (Ennio-Sr <nasr.laili@tin.it>) |
Список | pgsql-novice |
On Thu, Mar 31, 2005 at 02:01:45AM +0200, Ennio-Sr wrote: > * Celia McInnis <celia@drmath.ca> [300305, 12:29]: > > I have French text with accents in it and would rather (for search purposes) > > be able to search for things with the accents removed. Is there any builtin > > postgres function which will do this? > > > > Here is the format of some accented (French) words in my database: > > > > fran\347ais caf\351 t\364\t br\373le r\352ver \342nes b\350gues > > > > which I want to be able to search for as: > > > > francais cafe tot brule rever anes begues > > I'm not an expert, but I'm afraid it's not possible to do that (and I'd > be glad if somebody came out saying I'm wrong as I've the same problem). You could write an unaccent() function that calls translate(). http://www.postgresql.org/docs/8.0/interactive/functions-string.html Here's an (incomplete) example that works in PostgreSQL 8.0.2beta1: CREATE FUNCTION unaccent(text) RETURNS text AS $$ BEGIN RETURN translate($1, '\342\347\350\351\352\364\373', 'aceeeou'); END; $$ LANGUAGE plpgsql IMMUTABLE STRICT; SELECT unaccent('fran\347ais'); unaccent ---------- francais (1 row) You could use unaccent() in a functional index: CREATE TABLE words ( id serial PRIMARY KEY, word text NOT NULL ); CREATE INDEX words_word_idx ON words (unaccent(word)); INSERT INTO words (word) VALUES ('fran\347ais'); INSERT INTO words (word) VALUES ('caf\351'); INSERT INTO words (word) VALUES ('t\364\t'); SELECT * FROM words WHERE unaccent(word) = 'francais'; id | word ----+---------- 1 | français (1 row) EXPLAIN shows that the index is being used: EXPLAIN SELECT * FROM words WHERE unaccent(word) = 'francais'; QUERY PLAN ------------------------------------------------------------------------------ Index Scan using words_word_idx on words (cost=0.00..11.31 rows=6 width=36) Index Cond: (unaccent(word) = 'francais'::text) (2 rows) Hope this helps. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
В списке pgsql-novice по дате отправления: