Re: [pg_trgm] Making similarity(?, ?) < ? use an index

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Дата
Msg-id CAMkU=1wnEUvUG-6bZe+y8XomEM-i75LYnrsvn9=uzEbg3a7SOA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [pg_trgm] Making similarity(?, ?) < ? use an index  (Greg Navis <contact@gregnavis.com>)
Ответы Re: [pg_trgm] Making similarity(?, ?) < ? use an index  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-general
On Thu, Jun 9, 2016 at 1:57 AM, Greg Navis <contact@gregnavis.com> wrote:
> Artur, no worries, I'm not writing any code ;-)
>
> I did the following:
>
> CREATE TYPE trgm_match AS (match TEXT, threshold NUMERIC);

I would probably use REAL, not NUMERIC.  But maybe there is good
reason to use NUMERIC.

> CREATE OR REPLACE FUNCTION trgm_check_match (string TEXT, match trgm_match)
>   RETURNS bool
>   AS 'SELECT match.match <-> string <= 1 - match.threshold'
>   LANGUAGE SQL;
> CREATE OPERATOR %(leftarg = text, rightarg = trgm_match,
> procedure=trgm_check_match);
>
> This allows me to write:
>
> SELECT ('Warsaw' % row('Warsw', 0.3)::trgm_match);
>
> I'm not sure how to make this operator use an index. It seems I need to
> create an operator class but I'm not sure how. This is how pg_trgm creates
> its operator class:

I think you should pick a new operator name, not try to reuse %.
Based on Tom's previous comment that forking is probably not a good
idea, you probably want the new operator to co-exist with the existing
one, so it needs a different name.  For example, I picked %% without
giving it a lot of thought for this example below.


>
> -- create the operator class for gist
> CREATE OPERATOR CLASS gist_trgm_ops
> FOR TYPE text USING gist
> AS
>         OPERATOR        1       % (text, text),
>         FUNCTION        1       gtrgm_consistent (internal, text, smallint,
> oid, internal),
>         FUNCTION        2       gtrgm_union (internal, internal),
>         FUNCTION        3       gtrgm_compress (internal),
>         FUNCTION        4       gtrgm_decompress (internal),
>         FUNCTION        5       gtrgm_penalty (internal, internal,
> internal),
>         FUNCTION        6       gtrgm_picksplit (internal, internal),
>         FUNCTION        7       gtrgm_same (gtrgm, gtrgm, internal),
>         STORAGE         gtrgm;
>
> Should my operator class mimic the one above?

Look down a few more stanzas in "contrib/pg_trgm/pg_trgm--1.3.sql",
where it keeps adding new operators and functions.  You will want to
add your own in that method.  All of those could be consolidated into
one CREATE OPERATOR CLASS statement, but you will eventually have to
implement both an upgrade script and an install-from-scratch script,
so that is why they are broken out this way, to make that easier.

For testing, I'd just add "OPERATOR 9 %% (text, trgm_match)" to the
above, then drop and recreate the extension.

Although I would start with gin rather than gist, both because I find
it more useful, and I am more familiar with it.  YMMV of course.

Once you do that, you will probably start getting errors from the
gtrgm_consistent C function (if not others in the list of functions
first) because it is being asked to evaluate a strategy it doesn't
understand.  So then the next step is to teach the C code how to deal
with it.

Cheers,

Jeff


В списке pgsql-general по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: [HACKERS] Online DW
Следующее
От: Rakesh Kumar
Дата:
Сообщение: What is the general opinion on use of tablespaces