Обсуждение: Help with a not match

Поиск
Список
Период
Сортировка

Help with a not match

От
"Campbell, Lance"
Дата:

Use case:

I have a table I will call “tableA”.  It has two columns.  One column is an “id” that is an integer primary key.  The second field is a text field called “content”.

 

In the “content” field there may or may not be this substring “/files/{id}/” .  Note that {id} represents the field id for that record.  I want to find any rows where content contains “/files/###/” where the ### does not match the id for the row.  Also note that the id could be 1 to five digits.  Also there could be good and bad substrings in the content field.

 

Example where a row is fine:

Id=123

Content=”abc”

 

Id=345

Content=”abc files/345/ abc files/345/”

 

Example where a rows are not good:

Id=123

Content=”abc files/456/”

 

Id=123

Content=”abc files/123/ abc files/456/”

 

 

Could you please help me with the proper way to write this SQL statement?

 

Select id from tableA where content like …

 

 

Thanks,

 

Lance

Re: Help with a not match

От
"David G. Johnston"
Дата:
On Fri, Nov 9, 2018 at 9:46 AM Campbell, Lance <lance@illinois.edu> wrote:
> Use case:
> I have a table I will call “tableA”.  It has two columns.  One column is an “id” that is an integer primary key.  The
secondfield is a text field called “content”. 
> In the “content” field there may or may not be this substring “/files/{id}/” .  Note that {id} represents the field
idfor that record.  I want to find any rows where content contains “/files/###/” where the ### does not match the id
forthe row.  Also note that the id could be 1 to five digits.  Also there could be good and bad substrings in the
contentfield. 
> Example where a row is fine:
> Id=123
> Content=”abc”
> Id=345
> Content=”abc files/345/ abc files/345/”
> Example where a rows are not good:
> Id=123
> Content=”abc files/456/”
> Id=123
> Content=”abc files/123/ abc files/456/”
> Could you please help me with the proper way to write this SQL statement?
> Select id from tableA where content like …

Personally I'd use a regular expression:

... WHERE content ~ '/' || id || '/';

It not obvious why your fourth example is not good - the pattern above
will see files/123/ and be happy.

David J.


Re: Help with a not match

От
"Campbell, Lance"
Дата:
I am trying to find IDs where the field content contain improper substrings.  

I want something like this:
SELECT id from tableA WHERE content LIKE "%files/" || {some number that does not match the ID} || "/%" 

I don't know how to write this.

The SQL you provided is the opposite of what I was looking for.  Thanks though.

Lance 

On 11/9/18, 10:54 AM, "David G. Johnston" <david.g.johnston@gmail.com> wrote:

    On Fri, Nov 9, 2018 at 9:46 AM Campbell, Lance <lance@illinois.edu> wrote:
    > Use case:
    > I have a table I will call “tableA”.  It has two columns.  One column is an “id” that is an integer primary key.
Thesecond field is a text field called “content”.
 
    > In the “content” field there may or may not be this substring “/files/{id}/” .  Note that {id} represents the
fieldid for that record.  I want to find any rows where content contains “/files/###/” where the ### does not match the
idfor the row.  Also note that the id could be 1 to five digits.  Also there could be good and bad substrings in the
contentfield.
 
    > Example where a row is fine:
    > Id=123
    > Content=”abc”
    > Id=345
    > Content=”abc files/345/ abc files/345/”
    > Example where a rows are not good:
    > Id=123
    > Content=”abc files/456/”
    > Id=123
    > Content=”abc files/123/ abc files/456/”
    > Could you please help me with the proper way to write this SQL statement?
    > Select id from tableA where content like …
    
    Personally I'd use a regular expression:
    
    ... WHERE content ~ '/' || id || '/';
    
    It not obvious why your fourth example is not good - the pattern above
    will see files/123/ and be happy.
    
    David J.
    


Re: Help with a not match

От
"David G. Johnston"
Дата:
The convention on these lists is to bottom-post and trim unnecessary context.

On Fri, Nov 9, 2018 at 10:08 AM Campbell, Lance <lance@illinois.edu> wrote:
> The SQL you provided is the opposite of what I was looking for.  Thanks though.

See if these help you:

SELECT * FROM regexp_matches('/123/ /456/', '/(?!123)(\d+)/', 'g');
SELECT * FROM regexp_matches('/123/', '/(?!123)(\d+)/', 'g');
SELECT * FROM regexp_matches('/456/', '/(?!123)(\d+)/', 'g');

Your main problem is the scenario where the desired string does exist
but then others that are unexpected exist as well.  Without that its
as simple as locating ones with the correct string and then negating
the result.  You may find you want to use multiple tests and combine
them with AND/OR.

David J.


Re: Help with a not match

От
"Campbell, Lance"
Дата:
Very helpful.  I am almost there.  

I created this SQL:

SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea

I get no matches.  My guess is I am close but slightly off on the syntax.

Lance

On 11/9/18, 11:22 AM, "David G. Johnston" <david.g.johnston@gmail.com> wrote:

    The convention on these lists is to bottom-post and trim unnecessary context.
    
    On Fri, Nov 9, 2018 at 10:08 AM Campbell, Lance <lance@illinois.edu> wrote:
    > The SQL you provided is the opposite of what I was looking for.  Thanks though.
    
    See if these help you:
    
    SELECT * FROM regexp_matches('/123/ /456/', '/(?!123)(\d+)/', 'g');
    SELECT * FROM regexp_matches('/123/', '/(?!123)(\d+)/', 'g');
    SELECT * FROM regexp_matches('/456/', '/(?!123)(\d+)/', 'g');
    
    Your main problem is the scenario where the desired string does exist
    but then others that are unexpected exist as well.  Without that its
    as simple as locating ones with the correct string and then negating
    the result.  You may find you want to use multiple tests and combine
    them with AND/OR.
    
    David J.
    


Re: Help with a not match

От
"David G. Johnston"
Дата:
On Fri, Nov 9, 2018 at 10:59 AM Campbell, Lance <lance@illinois.edu> wrote:

> SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea
> I get no matches.  My guess is I am close but slightly off on the syntax.

You have two forward slashes in a row at the end of the pattern:

( # capture
files/\d+/ #pattern, digits followed by slash, lead by files/
) # close capture
/ # another bracket - unlike Perl the // are not syntax but pattern

David J.


Re: Help with a not match

От
Andrew Gierth
Дата:
>>>>> "Campbell" == Campbell, Lance <lance@illinois.edu> writes:

 Campbell> Very helpful.  I am almost there.  
 Campbell> I created this SQL:

 Campbell> SELECT regexp_matches(content, '/(?!files/'||id||'/)(files/\d+/)/', 'g') FROM tablea

 Campbell> I get no matches. My guess is I am close but slightly off on
 Campbell> the syntax.

Simplest regexp solution is to do this:

SELECT ... WHERE content ~ ('files/(?!' ||id|| '/)\d+/')

i.e. we're generating a regexp like 'files/(?!123/)\d+/' for each row.
No need for regexp_matches in this case because all we're looking for is
whether a match exists.

Another, possibly faster because it doesn't need a regexp compile for
each row, but possibly slower due to subplan overhead, would be:

SELECT ...
 WHERE id::text <> ANY (SELECT (regexp_matches(content, 'files/(\d+)/', 'g'))[1])

The idea of the second method is to extract all the "NNN" values from
files/NNN/ substrings, and then test whether any NNN value is different
from the expected one. (This is a VERY RARE use of "<> ANY"; normally
one uses "<> ALL" as the negation of "= ANY", but the logic here
requires the negation of "= ALL" instead.)

-- 
Andrew (irc:RhodiumToad)