Обсуждение: Help with a not match
Use case:
I have a table I will call “tableA”. It has two columns. One column is an “id” that is an integer primary key. The second field is a text field called “content”.
In the “content” field there may or may not be this substring “/files/{id}/” . Note that {id} represents the field id for that record. I want to find any rows where content contains “/files/###/” where the ### does not match the id for the row. Also note that the id could be 1 to five digits. Also there could be good and bad substrings in the content field.
Example where a row is fine:
Id=123
Content=”abc”
Id=345
Content=”abc files/345/ abc files/345/”
Example where a rows are not good:
Id=123
Content=”abc files/456/”
Id=123
Content=”abc files/123/ abc files/456/”
Could you please help me with the proper way to write this SQL statement?
Select id from tableA where content like …
Thanks,
Lance
On Fri, Nov 9, 2018 at 9:46 AM Campbell, Lance <lance@illinois.edu> wrote: > Use case: > I have a table I will call “tableA”. It has two columns. One column is an “id” that is an integer primary key. The secondfield is a text field called “content”. > In the “content” field there may or may not be this substring “/files/{id}/” . Note that {id} represents the field idfor that record. I want to find any rows where content contains “/files/###/” where the ### does not match the id forthe row. Also note that the id could be 1 to five digits. Also there could be good and bad substrings in the contentfield. > Example where a row is fine: > Id=123 > Content=”abc” > Id=345 > Content=”abc files/345/ abc files/345/” > Example where a rows are not good: > Id=123 > Content=”abc files/456/” > Id=123 > Content=”abc files/123/ abc files/456/” > Could you please help me with the proper way to write this SQL statement? > Select id from tableA where content like … Personally I'd use a regular expression: ... WHERE content ~ '/' || id || '/'; It not obvious why your fourth example is not good - the pattern above will see files/123/ and be happy. David J.
I am trying to find IDs where the field content contain improper substrings. I want something like this: SELECT id from tableA WHERE content LIKE "%files/" || {some number that does not match the ID} || "/%" I don't know how to write this. The SQL you provided is the opposite of what I was looking for. Thanks though. Lance On 11/9/18, 10:54 AM, "David G. Johnston" <david.g.johnston@gmail.com> wrote: On Fri, Nov 9, 2018 at 9:46 AM Campbell, Lance <lance@illinois.edu> wrote: > Use case: > I have a table I will call “tableA”. It has two columns. One column is an “id” that is an integer primary key. Thesecond field is a text field called “content”. > In the “content” field there may or may not be this substring “/files/{id}/” . Note that {id} represents the fieldid for that record. I want to find any rows where content contains “/files/###/” where the ### does not match the idfor the row. Also note that the id could be 1 to five digits. Also there could be good and bad substrings in the contentfield. > Example where a row is fine: > Id=123 > Content=”abc” > Id=345 > Content=”abc files/345/ abc files/345/” > Example where a rows are not good: > Id=123 > Content=”abc files/456/” > Id=123 > Content=”abc files/123/ abc files/456/” > Could you please help me with the proper way to write this SQL statement? > Select id from tableA where content like … Personally I'd use a regular expression: ... WHERE content ~ '/' || id || '/'; It not obvious why your fourth example is not good - the pattern above will see files/123/ and be happy. David J.
The convention on these lists is to bottom-post and trim unnecessary context. On Fri, Nov 9, 2018 at 10:08 AM Campbell, Lance <lance@illinois.edu> wrote: > The SQL you provided is the opposite of what I was looking for. Thanks though. See if these help you: SELECT * FROM regexp_matches('/123/ /456/', '/(?!123)(\d+)/', 'g'); SELECT * FROM regexp_matches('/123/', '/(?!123)(\d+)/', 'g'); SELECT * FROM regexp_matches('/456/', '/(?!123)(\d+)/', 'g'); Your main problem is the scenario where the desired string does exist but then others that are unexpected exist as well. Without that its as simple as locating ones with the correct string and then negating the result. You may find you want to use multiple tests and combine them with AND/OR. David J.
Very helpful. I am almost there. I created this SQL: SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea I get no matches. My guess is I am close but slightly off on the syntax. Lance On 11/9/18, 11:22 AM, "David G. Johnston" <david.g.johnston@gmail.com> wrote: The convention on these lists is to bottom-post and trim unnecessary context. On Fri, Nov 9, 2018 at 10:08 AM Campbell, Lance <lance@illinois.edu> wrote: > The SQL you provided is the opposite of what I was looking for. Thanks though. See if these help you: SELECT * FROM regexp_matches('/123/ /456/', '/(?!123)(\d+)/', 'g'); SELECT * FROM regexp_matches('/123/', '/(?!123)(\d+)/', 'g'); SELECT * FROM regexp_matches('/456/', '/(?!123)(\d+)/', 'g'); Your main problem is the scenario where the desired string does exist but then others that are unexpected exist as well. Without that its as simple as locating ones with the correct string and then negating the result. You may find you want to use multiple tests and combine them with AND/OR. David J.
On Fri, Nov 9, 2018 at 10:59 AM Campbell, Lance <lance@illinois.edu> wrote: > SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea > I get no matches. My guess is I am close but slightly off on the syntax. You have two forward slashes in a row at the end of the pattern: ( # capture files/\d+/ #pattern, digits followed by slash, lead by files/ ) # close capture / # another bracket - unlike Perl the // are not syntax but pattern David J.
>>>>> "Campbell" == Campbell, Lance <lance@illinois.edu> writes: Campbell> Very helpful. I am almost there. Campbell> I created this SQL: Campbell> SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea Campbell> I get no matches. My guess is I am close but slightly off on Campbell> the syntax. Simplest regexp solution is to do this: SELECT ... WHERE content ~ ('files/(?!' ||id|| '/)\d+/') i.e. we're generating a regexp like 'files/(?!123/)\d+/' for each row. No need for regexp_matches in this case because all we're looking for is whether a match exists. Another, possibly faster because it doesn't need a regexp compile for each row, but possibly slower due to subplan overhead, would be: SELECT ... WHERE id::text <> ANY (SELECT (regexp_matches(content, 'files/(\d+)/', 'g'))[1]) The idea of the second method is to extract all the "NNN" values from files/NNN/ substrings, and then test whether any NNN value is different from the expected one. (This is a VERY RARE use of "<> ANY"; normally one uses "<> ALL" as the negation of "= ANY", but the logic here requires the negation of "= ALL" instead.) -- Andrew (irc:RhodiumToad)