Обсуждение: pg_trgm module: no convertion into Trigrams on one side when comparing
Hi,
I'm using the pg_trgm module,
Is there a possibility not to convert one side of the string when comparing?
I need a kind of reference string containing only certain trigrams.
It's not possible to build every kind of string using the function.
example:
Code: Alles auswählen
SELECT show_trgm('abc');
show_trgm
-------------------------
{" a"," ab",abc,"bc "}
(1 row)
But I like to have only {'abc'} for example.
I would use this function: "similarity(text, text)"
Could it somehow work with explicit data-type definitions or maybe with putting the string in brackets or quoting?
Or do I have to change the source code?
Thanks for your help
Janek Sendrowski
On Thu, Nov 14, 2013 at 2:37 AM, Janek Sendrowski <janek12@web.de> wrote:
> Hi,
> I'm using the pg_trgm module,
> Is there a possibility not to convert one side of the string when comparing?
> I need a kind of reference string containing only certain trigrams.
> It's not possible to build every kind of string using the function.
>
> example:
> Code: Alles auswählen
> SELECT show_trgm('abc');
> show_trgm
> -------------------------
> {" a"," ab",abc,"bc "}
> (1 row)
>
> But I like to have only {'abc'} for example.
>
> I would use this function: "similarity(text, text)"
> Could it somehow work with explicit data-type definitions or maybe with putting the string in brackets or quoting?
> Or do I have to change the source code?
>
I guess you're looking for simple pattern matching; something like:
column LIKE '%abc%'
postgres=# create table foo(a text);
CREATE TABLE
postgres=# insert into foo values ('ab'), ('abc'), ('gabcd'), ('xabf');
INSERT 0 4
-- similarity() threshold
postgres=# select show_limit();
show_limit
------------
0.3
(1 row)
-- '%' is similarity operator which returns true if column value is
"sufficiently similar" to key (in this case 'abc'). This is determined
by the number of tri-grams two strings share.
postgres=# select *,similarity(a, 'abc') from foo where a % 'abc';
a | similarity
-----+------------
ab | 0.4
abc | 1
(2 rows)
-- And finally, probably what you're looking for. Simple pattern matching.
postgres=# select * from foo where a LIKE '%abc%';
a
-------
abc
gabcd
(2 rows)
-- You could go ahead and add more trigrams that you'd want result to contain.
postgres=# insert into foo values ('gabcddfg');
INSERT 0 4
postgres=# select * from foo where a LIKE '%abc%dfg%';
a
----------
gabcddfg
(1 row)
Is this what you want?
--
Amit
Re: pg_trgm module: no convertion into Trigrams on one side when comparing
От
"Janek Sendrowski"
Дата:
Hi Amit,
Thanks for your answer.
My issue is that I still need a ranking like the similarity when comparing trigrams.
I'm working on a similarity search, which determindes similiar sentences from my table.
The usual similarity search with Trigrams and a gist bzw. gin index is to slow with my set of rows.
So I've built an Fixed Query Array. I took some sentences as pivots and calculated the distance between them and all my
sentences.
Now I can search for similiar sentences by calculating their distance to my pivots and comparing them with the
distancesI already got.
I noticed that it's a good way to use trigrams as metric distance function.
So the issue is that I have to take certain trigrams as pivots and of cource I need to store an exact ranking when
comparing.
Your method detect the rows which includes all of the trigrams you are searching for, but I still need a ranking...
something like SELECT similarity('How are your', "{" ho","how","are","re "," hi","wha","hat","at "," a"}"::trgm);
I hope you have an idea
Janek Sendrowski
On Thu, Nov 14, 2013 at 9:05 PM, Janek Sendrowski <janek12@web.de> wrote:
> Hi Amit,
>
> Thanks for your answer.
>
> My issue is that I still need a ranking like the similarity when comparing trigrams.
> I'm working on a similarity search, which determindes similiar sentences from my table.
>
> The usual similarity search with Trigrams and a gist bzw. gin index is to slow with my set of rows.
>
> So I've built an Fixed Query Array. I took some sentences as pivots and calculated the distance between them and all
mysentences.
> Now I can search for similiar sentences by calculating their distance to my pivots and comparing them with the
distancesI already got.
> I noticed that it's a good way to use trigrams as metric distance function.
>
> So the issue is that I have to take certain trigrams as pivots and of cource I need to store an exact ranking when
comparing.
> Your method detect the rows which includes all of the trigrams you are searching for, but I still need a ranking...
>
> something like SELECT similarity('How are your', "{" ho","how","are","re "," hi","wha","hat","at "," a"}"::trgm);
>
> I hope you have an idea
If I understand you correctly, you want to find a similarity value
based NOT on set of tri-grams internally produced by pg_trgm but a
custom set of tri-grams you have chosen. So, you are looking for a way
to provide pg_trgm a custom set of tri-grams to be used for similarity
calculation.
If that is the case, I am not sure if there is any way/interface
currently to do that.
--
Amit
Re: pg_trgm module: no convertion into Trigrams on one side when comparing
От
"Janek Sendrowski"
Дата:
Hi Amit, Do you think it would by difficult to edit the source and wirte a new function, wich does it? Janek Sendrowski