Re: unexpected result from to_tsvector
From | Dmitrii Golub |
---|---|
Subject | Re: unexpected result from to_tsvector |
Date | |
Msg-id | CAN1orqkB4ozwC_uer-FDrr3qecVkZfULj-F1PdW5Qx=F_SBRjQ@mail.gmail.com |
In reply to | Re: unexpected result from to_tsvector ("Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de>) |
Responses | Re: unexpected result from to_tsvector ("Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de>) |
List | pgsql-hackers |
2016-03-14 16:22 GMT+03:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
On Mon, Mar 7, 2016 at 10:46 PM, Artur Zakirov <a.zakirov@postgrespro.ru> wrote:
Hello,
On 07.03.2016 23:55, Dmitrii Golub wrote:
Hello,
Should we add tests for this case?
I think we should. I have added tests for teodor@123-stack.net and 123@stack.net emails.
123_reg.ro <http://123_reg.ro> is not a valid domain name, because of
the "_" symbol; see
https://tools.ietf.org/html/rfc1035 page 8.
Dmitrii Golub
Thank you for the information. Fixed.

Hm... now that doesn't look all that consistent to me (after applying the patch):

=# select ts_debug('simple', 'aaa@123-yyy.zzz');
ts_debug
---------------------------------------------------------------------------
(email,"Email address",aaa@123-yyy.zzz,{simple},simple,{aaa@123-yyy.zzz})
(1 row)

But:

=# select ts_debug('simple', 'aaa@123_yyy.zzz');
ts_debug
---------------------------------------------------------
(asciiword,"Word, all ASCII",aaa,{simple},simple,{aaa})
(blank,"Space symbols",@,{},,)
(uint,"Unsigned integer",123,{simple},simple,{123})
(blank,"Space symbols",_,{},,)
(host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
(5 rows)

One can also see that if we only keep the domain name, the result is similar:

=# select ts_debug('simple', '123-yyy.zzz');
ts_debug
-------------------------------------------------------
(host,Host,123-yyy.zzz,{simple},simple,{123-yyy.zzz})
(1 row)

=# select ts_debug('simple', '123_yyy.zzz');
ts_debug
-----------------------------------------------------
(uint,"Unsigned integer",123,{simple},simple,{123})
(blank,"Space symbols",_,{},,)
(host,Host,yyy.zzz,{simple},simple,{yyy.zzz})
(3 rows)

But, this only has to do with 123 being recognized as a number, not with the underscore:

=# select ts_debug('simple', 'abc_yyy.zzz');
ts_debug
-------------------------------------------------------
(host,Host,abc_yyy.zzz,{simple},simple,{abc_yyy.zzz})
(1 row)

=# select ts_debug('simple', '1abc_yyy.zzz');
ts_debug
-------------------------------------------------------
(host,Host,1abc_yyy.zzz,{simple},simple,{1abc_yyy.zzz})
(1 row)

In fact, the 123-yyy.zzz domain is not valid either according to the RFC (subdomain can't start with a digit), but since we already allow it, should we not allow 123_yyy.zzz to be recognized as a Host? Then why not recognize aaa@123_yyy.zzz as an email address?

Another option is to prohibit underscore in recognized host names, but this has more breakage potential IMO.

--
Alex
Alex, actually a subdomain can start with a digit; try it.
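
To make the digit question concrete, here is a minimal sketch of the two label grammars involved. It is only an illustration and has nothing to do with how the PostgreSQL text search parser is implemented; the names RFC1035_LABEL, RFC1123_LABEL and is_valid_hostname are made up for this example. It assumes the "preferred name syntax" of RFC 1035 (a label starts with a letter) and the relaxation in RFC 1123, section 2.1 (a label may start with a letter or a digit). Neither grammar permits an underscore.

```python
import re

# RFC 1035 "preferred name syntax": a label starts with a letter,
# ends with a letter or digit, and has letters, digits or hyphens inside.
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# RFC 1123, section 2.1 relaxes the first character to a letter or a digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name, label_re):
    """Return True if every dot-separated label of name matches label_re."""
    if len(name) > 253:
        return False
    return all(label_re.fullmatch(label) is not None
               for label in name.split("."))


if __name__ == "__main__":
    for host in ("123-yyy.zzz", "123_yyy.zzz", "abc_yyy.zzz"):
        print(host,
              "RFC 1035:", is_valid_hostname(host, RFC1035_LABEL),
              "RFC 1123:", is_valid_hostname(host, RFC1123_LABEL))
```

Under these assumptions 123-yyy.zzz is rejected by the stricter RFC 1035 grammar but accepted by the RFC 1123 one, while both underscore names are rejected by either grammar.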