Re: similarity and operator '%'

From: David G. Johnston
Subject: Re: similarity and operator '%'
Msg-id: CAKFQuwaEUYS75qJVGQZJ7FGDZtM+kMrzTQzdPNLFhxVk+vQkDg@mail.gmail.com
In reply to: similarity and operator '%' (Volker Boehm <volker@vboehm.de>)
List: pgsql-performance
On Mon, May 30, 2016 at 1:53 PM, Volker Boehm <volker@vboehm.de> wrote:

The reason for using the similarity function in place of the '%'-operator is that I want to use different similarity values in one query:

    select name, street, zip, city
    from addresses
    where name % $1
        and street % $2
        and (zip % $3 or city % $4)
        or similarity(name, $1) > 0.8

which means: take all addresses where name, street, zip and city have little similarity _plus_ all addresses where the name matches very well.


The only way I found was to create a temporary table from the first query, change the similarity value with set_limit(), and then select the second query UNION the temporary table.

Is there a more elegant and straightforward way to achieve this result?

Not that I can envision.

You are forced into using an operator due to our index implementation.
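For context, a minimal sketch of what that dependence on the operator looks like, assuming the addresses table from the question and a trigram GIN index on name. The index name and the search string are made up for illustration, and the plans you actually get depend on table size and statistics:

```sql
-- Assumes the pg_trgm extension is available in this database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the operator class pg_trgm provides.
CREATE INDEX addresses_name_trgm_idx
    ON addresses USING gin (name gin_trgm_ops);

-- Operator form: the planner can consider the trigram index here.
EXPLAIN
SELECT name FROM addresses WHERE name % 'Volker Boehm';

-- Function form: there is no indexable operator in the WHERE clause,
-- so the trigram index is not a candidate for this predicate.
EXPLAIN
SELECT name FROM addresses WHERE similarity(name, 'Volker Boehm') > 0.8;
```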

You are thus forced into using a GUC to control the parameter that the index scanning function uses to compute true/false.

A GUC can only take on a single value within a given query - well, not quite true[1] but the exception doesn't seem like it will help here.
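To illustrate how the operator's answer follows the GUC rather than anything in the query text, here is a small sketch. It assumes pg_trgm on PostgreSQL 9.6 or later, where the setting is named pg_trgm.similarity_threshold; on older releases set_limit() plays the same role:

```sql
-- Loose threshold: compare the raw similarity with what % reports.
SET pg_trgm.similarity_threshold = 0.3;
SELECT similarity('word', 'words') AS sim,
       'word' % 'words'            AS matches;

-- Strict threshold: same strings, same similarity, different answer from %.
SET pg_trgm.similarity_threshold = 0.8;
SELECT similarity('word', 'words') AS sim,
       'word' % 'words'            AS matches;
```

The similarity of these two strings lies between 0.3 and 0.8, so the sim column stays the same while matches flips between the two statements. Every % in a single statement sees whichever threshold is in effect for that statement.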

Thus you are consigned to using two queries.
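A rough sketch of that two-query approach, along the lines of what the original poster described. The temporary table name, the threshold of 0.3 for the loose match, and the literal search values are all placeholders for illustration, not values from the thread:

```sql
BEGIN;

-- Query 1: loose threshold for the multi-column match.
SELECT set_limit(0.3);
CREATE TEMP TABLE loose_matches ON COMMIT DROP AS
    SELECT name, street, zip, city
    FROM addresses
    WHERE name % 'Volker Boehm'
      AND street % 'Hauptstrasse 1'
      AND (zip % '12345' OR city % 'Berlin');

-- Query 2: strict threshold for the name-only match,
-- combined with the rows collected by query 1.
SELECT set_limit(0.8);
SELECT name, street, zip, city
FROM addresses
WHERE name % 'Volker Boehm'
UNION
SELECT name, street, zip, city
FROM loose_matches;

COMMIT;
```

Both statements keep the % operator, so each can use a trigram index at its own threshold. UNION (rather than UNION ALL) removes rows that satisfy both conditions.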

*A functional index doesn't work since the second argument is query-specific.

[1] When defining a function you can attach a "SET" clause to it; commonly used for search_path but should work with any GUC.  If you could wrap the operator comparison into a custom function you could use this capability.  It also would require a function that would take the threshold as a value - the extension only provides variations that use the GUC.

I don't think this will use the index even if it compiles (not tested):

-- Requires pg_trgm; the GUC is pg_trgm.similarity_threshold (PostgreSQL 9.6+).
CREATE FUNCTION similarity_80(col text, val text)
RETURNS boolean
SET pg_trgm.similarity_threshold = 0.80
LANGUAGE sql
AS $$
SELECT col % val;
$$;
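One way to check the suspicion about the index, assuming the function above was created successfully and a trigram index exists on addresses.name (the search string is a placeholder):

```sql
-- Inspect the plan; look for whether any trigram index appears in it.
EXPLAIN
SELECT name, street, zip, city
FROM addresses
WHERE similarity_80(name, 'Volker Boehm');
```

SQL functions that carry a SET clause are not inlined by the planner, so the % inside the function body stays hidden from index matching. The likely outcome is a sequential scan with the function applied as a filter.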

David J.
