Обсуждение: Translate problems

Поиск
Список
Период
Сортировка

Translate problems

От
Leandro Fanzone
Дата:
I have a table with a text field:

CREATE TABLE mytable(myfield text);

 I want to perform over it a search of a kind I call "foreign characters
insensitive". My native language is Spanish, and we have accented
characters (acute accent over vowels), for example. As not everyone who
will use the application is completely aware of the accentuation rules,
data can be entered with inaccuracies regarding the accents, and also
those who search data can fail to reproduce the accentuation of the
target text being searched. So the solution would be to transform the
data to a neutral field where each vowel that is found with accent would
be transformed to its corresponding vowel without the accent, both in
the selected field and in the text to be compared to. So far, so good.
There is a builtin function called "translate", and this selection works OK:

SELECT myfield FROM mytable
WHERE TRANSLATE(LOWER(myfield), '[accented vowels collection]',
'[respetive non-accented vowels]')
LIKE TRANSLATE(LOWER('something%'), '[accented vowels collection]',
'[respetive non-accented vowels]');

where "[accented vowels collection]" and "[respetive non-accented
vowels]" are the real characters, naturally. Now, as I want to optimize
this search, I would like to create an index to it. Before there was this:

CREATE INDEX myfield_index ON mytable(LOWER(myfield));

to perform indexed searches on queries like

SELECT myfield FROM mytable
WHERE myfield LIKE 'something%';

But when I tried to create an index using TRANSLATE:

CREATE INDEX myfield_index ON mytable(TRANSLATE(LOWER(myfield),
'[accented vowels collection]', '[respetive non-accented vowels]'));

I had this error:

ERROR: parser: parse error at or near "("

So I created a function to do the work:

CREATE FUNCTION  plain_text(text) RETURNS text AS '
BEGIN
    RETURN TRANSLATE(LOWER($1), ''[accented vowels collection]'',
''[respetive non-accented vowels]'');
END;
' LANGUAGE 'plpgsql' WITH(iscachable);

and built an index with it, and everything worked perfect. I wonder why
plain TRANSLATE doesn't work within the index creation, just curiousity.
I suppose using directly "translate" should be faster than calling plpgsql.
By the way, the creation of an index using a function is a great feature
for things like this!

Leandro Fanzone.


Re: Translate problems

От
Manuel Sugawara
Дата:
Leandro Fanzone <leandro@hasar.com> writes:

> I have a table with a text field:
>
[...]
>
> SELECT myfield FROM mytable
> WHERE TRANSLATE(LOWER(myfield), '[accented vowels collection]', '[respetive
> non-accented vowels]')

I faced the same problem and, equivalente to your approach, I used a
combination of to_ascii and upper/lower:

test=# select to_ascii('áéíóúñ');
 to_ascii
----------
 aeioun
(1 row)

create or replace function to_uascii( text ) as '
select upper(to_ascii($1)) as result
' language 'sql';

I'm wondering whatever this approach works for multibyte encodings
(UTF-8 and such).

Regards,
Manuel.

Re: Translate problems

От
Leandro Fanzone
Дата:
Well, it requires multibyte: when I tried to run it in my database, I had

ERROR: Multi-byte support is not enabled.

Manuel Sugawara wrote:

>Leandro Fanzone <leandro@hasar.com> writes:
>
>
>
>>I have a table with a text field:
>>
>>
>>
>[...]
>
>
>>SELECT myfield FROM mytable
>>WHERE TRANSLATE(LOWER(myfield), '[accented vowels collection]', '[respetive
>>non-accented vowels]')
>>
>>
>
>I faced the same problem and, equivalente to your approach, I used a
>combination of to_ascii and upper/lower:
>
>test=# select to_ascii('áéíóúñ');
> to_ascii
>----------
> aeioun
>(1 row)
>
>create or replace function to_uascii( text ) as '
>select upper(to_ascii($1)) as result
>' language 'sql';
>
>I'm wondering whatever this approach works for multibyte encodings
>(UTF-8 and such).
>
>Regards,
>Manuel.
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>


Re: Translate problems

От
Manuel Sugawara
Дата:
Leandro Fanzone <leandro@hasar.com> writes:
[...]
> CREATE INDEX myfield_index ON mytable(LOWER(myfield));
>
> to perform indexed searches on queries like
>
> SELECT myfield FROM mytable
> WHERE myfield LIKE 'something%';
>

By the way: the above query will use the index only if you are using
the C locale :-(

Regards,
Manuel.