Обсуждение: Regular Expression in SQL

Поиск
Список
Период
Сортировка

Regular Expression in SQL

От
"Aaron Bono"
Дата:
I recall not long ago a discussion about regular expressions in a query that hit on this exact topic but don't think it
wasever resolved so I am giving it a go again...<br /><br />Here is my query (keep in mind that I am just experimenting
nowso don't worry about the fact that I am using nested substring function calls): <br /><br />SELECT<br />   
referrer,<br/>    substring(referrer FROM '^([^\\/]*\\/\\/[^\\/]*)(\\/)?'),<br />    substring(referrer FROM
'^[^\\/]*\\/\\/www\\.google\\.[^\\/]*\\/[^\\?]*\\?(.*)$'),<br/>    substring(substring(referrer FROM
'^[^\\/]*\\/\\/www\\.google\\.[^\\/]*\\/[^\\?]*\\?(.*)$')FROM '((%&q=)|(q=))#"[^&]*#"((&%)|())' FOR '#')
<br/>FROM one_hour_air.web_page_view<br />WHERE referrer ~ '^[^\\/]*\\/\\/(www.google\\.[^\\/]*)\\/'<br /><br />What I
getis:<br /><br /><div style="margin-left: 40px;">referrer <br />substring<br />substring_1<br />substring_2<br /><br
/><ahref="http://www.google.ca/search?q=one+hour+heating&hl=en">
http://www.google.ca/search?q=one+hour+heating&hl=en</a><br/><a
href="http://www.google.ca">http://www.google.ca</a><br/>q=one+hour+heating&hl=en<br />q=<br /><br /><a
href="http://www.google.com/search?hl=en&q=One+hour+heating+and+Air">
http://www.google.com/search?hl=en&q=One+hour+heating+and+Air</a><br/><a
href="http://www.google.com">http://www.google.com</a><br/>hl=en&q=One+hour+heating+and+Air<br />hl=en&q=<br
/></div><br/>What I expected for substring_2 was (respectively): <br /><br /><div style="margin-left:
40px;">one+hour+heating<br/></div><div style="margin-left: 40px;"><br /></div><div style="margin-left:
40px;">One+hour+heating+and+Air<br/></div><br />I thought by using the FOR '#' I could specify exactly what part of the
expressionI would get but it still grabs the first (...) of the pattern.  At least that is what the documentation in
seciton9.7.2 at <a
href="http://www.postgresql.org/docs/8.1/static/functions-matching.html">http://www.postgresql.org/docs/8.1/static/functions-matching.html</a>
ledme to believe.  How can I get the part of the string I am really after without using one nested substring after
another?<br /><br />Thanks,<br />Aaron Bono<br /><br
/>==================================================================<br/>   Aaron Bono<br />   Aranya Software
Technologies,Inc.<br />   <a href="http://www.aranya.com">http://www.aranya.com </a><br
/>================================================================== 

Re: Regular Expression in SQL

От
Tom Lane
Дата:
"Aaron Bono" <postgresql@aranya.com> writes:
> I thought by using the FOR '#' I could specify exactly what part of the
> expression I would get but it still grabs the first (...) of the pattern.

Hmm ... I think that this is a bug in similar_escape(): it ought to
transform parentheses in a SIMILAR pattern into non-capturing parentheses.

Until this is fixed, your best bet is to use the POSIX-regexp form of
substring().  You can't sneak non-capturing parens through
similar_escape, because it'll try to escape the ? ...
        regards, tom lane