Re: Text search parser's treatment of URLs and emails

From: Thom Brown
Subject: Re: Text search parser's treatment of URLs and emails
Date:
Msg-id: AANLkTinTqNEZG2jwb_R2xB5GSL56Q=VkFOV6U6+qQh1U@mail.gmail.com
In reply to: Text search parser's treatment of URLs and emails (Thom Brown <thom@linux.com>)
List: pgsql-general

On 8 September 2010 21:48, Thom Brown <thom@linux.com> wrote:
> Hi,
>
> I noticed that if I run this:
>
> SELECT alias, description, token FROM
> ts_debug('http://www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary');
>
> I get:
>
>   alias   |  description  |                                    token
> ----------+---------------+------------------------------------------------------------------------------
>  protocol | Protocol head | http://
>  url      | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
>  host     | Host          | www.postgresql.org:2345
>  url_path | URL path      | /directory/page.html?version=9.1&build=alpha1#summary
> (4 rows)
>
>
> It could be me being picky, but I don't regard parameters or page
> fragments as part of the URL path.  Ideally, I'd sort of expect:
>
>     alias     |  description  |                                    token
> --------------+---------------+------------------------------------------------------------------------------
>  protocol     | Protocol head | http://
>  url          | URL           | www.postgresql.org:2345/directory/page.html?version=9.1&build=alpha1#summary
>  host         | Host          | www.postgresql.org
>  port         | Port          | 2345
>  url_path     | URL path      | /directory/page.html
>  query_string | Query string  | version=9.1&build=alpha1
>  fragment     | Page fragment | summary
> (7 rows)
>
> ... of course that's if there was support for query strings and page
> fragments, which there isn't.  But if changes were made to support my
> definition of a URL path, they'd have to be considered breaking
> changes.
>
> But my main gripe is with the name "url_path".
>
> Also:
>
> SELECT alias, description, token FROM ts_debug('myname+priority@gmail.com');
>
> Yields:
>
>   alias   |   description   |       token
> -----------+-----------------+--------------------
>  asciiword | Word, all ASCII | myname
>  blank     | Space symbols   | +
>  email     | Email address   | priority@gmail.com
> (3 rows)
>
> The entire string I entered is a valid email address, and this style
> isn't uncommon.  Shouldn't such email addresses be taken into
> account?  The example above misidentifies the email address, since
> the real destination address would most likely be myname@gmail.com.
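
To make the two expectations in the quoted message concrete, here is a minimal sketch in Python. It is not PostgreSQL's parser and not a proposed patch; it only reproduces the token breakdown from the "ideal" table using the standard library's urllib.parse.urlsplit, and splits the plus-style address on its first "+". The function names split_url and split_plus_address and the dictionary keys "mailbox" and "tag" are made up for illustration. Treating the part before the "+" as the real mailbox is an assumption about the mail provider, in line with the "most likely" above.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into the seven tokens listed in the 'ideal' table.

    Hypothetical helper for illustration; the keys mirror the aliases
    used in the quoted message, not any existing parser token types
    beyond protocol, url, host and url_path.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme + "://",
        # Everything after the protocol head, as ts_debug reports for "url".
        "url": url[len(parts.scheme) + len("://"):],
        "host": parts.hostname,
        "port": None if parts.port is None else str(parts.port),
        "url_path": parts.path,
        "query_string": parts.query,
        "fragment": parts.fragment,
    }


def split_plus_address(address):
    """Split a plus-style address into full address, base mailbox and tag.

    Assumes (provider-dependent) that mail for local+tag@domain is
    delivered to local@domain.
    """
    local, _, domain = address.rpartition("@")
    base, _, tag = local.partition("+")
    return {"email": address, "mailbox": base + "@" + domain, "tag": tag}


if __name__ == "__main__":
    url = ("http://www.postgresql.org:2345/directory/page.html"
           "?version=9.1&build=alpha1#summary")
    for alias, token in split_url(url).items():
        print(alias, "|", token)
    print(split_plus_address("myname+priority@gmail.com"))
```

Run against the URL above, split_url yields www.postgresql.org as the host, 2345 as the port, /directory/page.html as the path, version=9.1&build=alpha1 as the query string and summary as the fragment. split_plus_address keeps the whole string as the email and reports myname@gmail.com as the likely mailbox.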

No opinions?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
