Re: how to ignore accents?

Поиск
Список
Период
Сортировка
От Celia McInnis
Тема Re: how to ignore accents?
Дата
Msg-id 20050331133155.M49612@drmath.ca
обсуждение исходный текст
Ответ на Re: how to ignore accents?  (Michael Fuhr <mike@fuhr.org>)
Ответы Re: how to ignore accents?  (Michael Fuhr <mike@fuhr.org>)
Список pgsql-novice
Thanks, Michael - I just tried your suggestion in postgresql 8.0.1 and
unfortunately it doesn't work. How long will it be before 8.0.2 moves out of
beta mode?

Here's the error which I get in 8.0.1:

ERROR: invalid byte sequence for encoding "UNICODE": 0xe76169

Is there something which I can do in 8.0.1 to have the unaccenting work?

Celia McInnis

On Wed, 30 Mar 2005 17:42:36 -0700, Michael Fuhr wrote
> You could write an unaccent() function that calls translate().
>
>   http://www.postgresql.org/docs/8.0/interactive/functions-string.html
>
> Here's an (incomplete) example that works in PostgreSQL 8.0.2beta1:
>
>   CREATE FUNCTION unaccent(text) RETURNS text AS $$
>   BEGIN
>       RETURN translate($1, '\342\347\350\351\352\364\373', 'aceeeou')
> ;  END;  $$ LANGUAGE plpgsql IMMUTABLE STRICT;
>
>   SELECT unaccent('fran\347ais');
>    unaccent
>   ----------
>    francais
>   (1 row)
>
> You could use unaccent() in a functional index:
>
>   CREATE TABLE words (
>       id    serial PRIMARY KEY,
>       word  text NOT NULL
>   );
>
>   CREATE INDEX words_word_idx ON words (unaccent(word));
>
>   INSERT INTO words (word) VALUES ('fran\347ais');
>   INSERT INTO words (word) VALUES ('caf\351');
>   INSERT INTO words (word) VALUES ('t\364\t');
>
>   SELECT * FROM words WHERE unaccent(word) = 'francais';
>    id |   word
>   ----+----------
>     1 | français
>   (1 row)
>
> EXPLAIN shows that the index is being used:
>
>   EXPLAIN SELECT * FROM words WHERE unaccent(word) = 'francais';
>                                     QUERY PLAN
>
>   --------------------------------------------------------------------------
----
>    Index Scan using words_word_idx on words  (cost=0.00..11.31
> rows=6 width=36)     Index Cond: (unaccent(word) = 'francais'::text)
>   (2 rows)
>
> Hope this helps.
>
> --
> Michael Fuhr
> http://www.fuhr.org/~mfuhr/
>



> On Thu, Mar 31, 2005 at 02:01:45AM +0200, Ennio-Sr wrote:
> > * Celia McInnis <celia@drmath.ca> [300305, 12:29]:
> > > I have French text with accents in it and would rather (for search
purposes)
> > > be able to search for things with the accents removed. Is there any
builtin
> > > postgres function which will do this?
> > >
> > > Here is the format of some accented (French) words in my database:
> > >
> > > fran\347ais caf\351 t\364\t br\373le r\352ver \342nes b\350gues
> > >
> > > which I want to be able to search for as:
> > >
> > > francais cafe tot brule rever anes begues
> >

> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


--
Open WebMail Project (http://openwebmail.org)


В списке pgsql-novice по дате отправления:

Предыдущее
От: Kim Kohen
Дата:
Сообщение: Boolean search
Следующее
От: Michael Fuhr
Дата:
Сообщение: Re: how to ignore accents?