Re: Queries with Regular Expressions

Поиск
Список
Период
Сортировка
От John D. Burger
Тема Re: Queries with Regular Expressions
Дата
Msg-id eba60a0256eed477b6be464374a7594c@mitre.org
обсуждение исходный текст
Ответ на Queries with Regular Expressions  ("Silas Justiniano" <silasju@gmail.com>)
Ответы Re: Queries with Regular Expressions  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Queries with Regular Expressions  ("Silas Justiniano" <silasju@gmail.com>)
Список pgsql-general
> But I just can't make it work correctly using brackets:
> SELECT field FROM table WHERE field ~* 'ch[aã]o';
>
> It just returns tuples that have 'chao', but not 'chão'.
>
> My queries are utf-8 an the database is SQL_ASCII.

I suspect the bracketed expression is turning into [aXY], where XY is
the two-byte sequence corresponding to ã in UTF8.  So the regular
expression is only going to match strings of the form chao, chXo and
chYo.  To make sure that this is what's happening, try this:

   select length('ã');

I bet you get back 2, not 1.  I don't know if a UTF8 database will
handle this correctly or not.  The safest thing to do may be to use
queries like this:

   SELECT field FROM table WHERE field ~* 'ch(a|ã)o';

- John D. Burger
   MITRE

В списке pgsql-general по дате отправления:

Предыдущее
От: David Gama Rodrí­guez
Дата:
Сообщение: %Re: % tsearch gendict
Следующее
От: Scott Ribe
Дата:
Сообщение: Re: "Upcalls" (sort of) from the database