Email parsing in Text Search

From Martin Dubé
Subject Email parsing in Text Search
Date
Msg-id CAGny-cMH0s4Q-Ob=Ebn+-yDchLMVEm8bZ9PBP88vEvppsh5BPw@mail.gmail.com
Replies Re: Email parsing in Text Search
List pgsql-bugs
Hi,

I'm seeing odd behavior from the email parser and wonder whether it is a bug or a feature.

When I use the default regconfig to parse an email address whose first part (before the @) contains only digits, it is not parsed as an email address.

db=# select * from ts_debug('pg_catalog.english', '000000001@asdf.com');
 alias |   description    |   token   | dictionaries | dictionary |   lexemes   
-------+------------------+-----------+--------------+------------+-------------
 uint  | Unsigned integer | 000000001 | {simple}     | simple     | {000000001}
 blank | Space symbols    | @         | {}           |            | 
 host  | Host             | asdf.com  | {simple}     | simple     | {asdf.com}
(3 rows)


However, if I add a letter to the first part, it is parsed as an email address.

db=# select * from ts_debug('pg_catalog.english', '000000001a@asdf.com');
 alias |  description  |        token        | dictionaries | dictionary |        lexemes        
-------+---------------+---------------------+--------------+------------+-----------------------
 email | Email address | 000000001a@asdf.com | {simple}     | simple     | {000000001a@asdf.com}
(1 row)
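
The same comparison can be run at the parser level, leaving the dictionaries out of the picture. This is only a minimal sketch: it assumes the built-in parser is registered under the name `default` (the one used by `pg_catalog.english`) and relies on `LATERAL`, which needs PostgreSQL 9.3 or later. The column aliases `input`, `p`, and `tt` are just names chosen for illustration.

```sql
-- Tokenize both addresses with the built-in parser and show the token type
-- assigned to each piece, so the two inputs can be compared side by side.
SELECT v.input,
       tt.alias,
       p.token
FROM (VALUES ('000000001@asdf.com'),
             ('000000001a@asdf.com')) AS v(input)
     CROSS JOIN LATERAL ts_parse('default', v.input) AS p   -- (tokid, token)
     JOIN ts_token_type('default') AS tt                    -- (tokid, alias, description)
       ON tt.tokid = p.tokid;
```

If the token types differ here in the same way as in the `ts_debug` output, the difference comes from the parser itself rather than from the text search configuration or its dictionaries.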

According to the RFC and several forums, an email address whose first part contains only digits is valid.

Is this the expected behavior?

I ran the test on OpenBSD 5.9 with PostgreSQL 9.4.6.

Thanks,


--
Mart
