Thread: pgsql: Allow GiST distance function to return merely a lower-bound.

pgsql: Allow GiST distance function to return merely a lower-bound.

From
Heikki Linnakangas
Date:
Allow GiST distance function to return merely a lower-bound.

The distance function can now set *recheck = true, like index quals. The
executor will then re-check the ORDER BY expressions, and use a queue to
reorder the results on the fly.
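
The reordering described above can be sketched as follows. This is a minimal illustration in Python, not the executor code from nodeIndexscan.c: the names `reorder_knn`, `index_stream` and `exact_distance` are hypothetical, and the placement of the "wrong order" check is an assumption about how a violated lower bound would be detected. The index hands back items in ascending order of a lower-bound distance; the sketch recomputes the exact distance, parks each item in a priority queue, and releases an item only once no later index tuple could sort before it.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def reorder_knn(
    index_stream: Iterable[Tuple[float, T]],
    exact_distance: Callable[[T], float],
) -> Iterator[Tuple[float, T]]:
    """Yield (exact distance, item) pairs in ascending exact-distance order.

    index_stream must yield (lower_bound, item) pairs in ascending
    lower_bound order, where lower_bound <= exact_distance(item).
    """
    # Min-heap of (exact distance, sequence number, item). The sequence
    # number breaks ties so that items themselves are never compared.
    queue: List[Tuple[float, int, T]] = []
    seq = 0
    for lower_bound, item in index_stream:
        # Every later item has an exact distance >= lower_bound, so any
        # queued item at or below that bound can be emitted now.
        while queue and queue[0][0] <= lower_bound:
            dist, _, queued = heapq.heappop(queue)
            yield dist, queued
        exact = exact_distance(item)
        # Assumed sanity check: the index promised a lower bound, so a
        # smaller rechecked distance means the ordering cannot be trusted.
        if exact < lower_bound:
            raise ValueError("index returned tuples in wrong order")
        heapq.heappush(queue, (exact, seq, item))
        seq += 1
    # The index is exhausted: drain the queue in exact-distance order.
    while queue:
        dist, _, queued = heapq.heappop(queue)
        yield dist, queued
```

With a LIMIT on top, the consumer simply stops pulling from the generator, so only as many index tuples are rechecked as are needed to prove the first few results are final.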

This makes it possible to do kNN-searches on polygons and circles, which
don't store the exact value in the index, but just a bounding box.
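
To see why a bounding box yields only a lower bound, consider a circle. The sketch below is plain geometry in Python and not code from geo_ops.c or gistproc.c; the helper names and the box layout `(xmin, ymin, xmax, ymax)` are assumptions made for illustration. Because the circle lies entirely inside its bounding box, the distance from a point to the box can never exceed the distance from that point to the circle.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def circle_bounding_box(center: Point, radius: float) -> Box:
    """Smallest axis-aligned box containing the circle."""
    cx, cy = center
    return (cx - radius, cy - radius, cx + radius, cy + radius)


def box_distance(p: Point, box: Box) -> float:
    """Distance from p to the box; 0.0 if p lies inside it."""
    xmin, ymin, xmax, ymax = box
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)


def circle_distance(p: Point, center: Point, radius: float) -> float:
    """Distance from p to the circle's edge; 0.0 if p lies inside it."""
    to_center = math.hypot(p[0] - center[0], p[1] - center[1])
    return max(to_center - radius, 0.0)


if __name__ == "__main__":
    center, radius = (0.0, 0.0), 1.0
    box = circle_bounding_box(center, radius)
    for p in [(3.0, 3.0), (3.0, 0.0), (0.5, 0.5)]:
        lower = box_distance(p, box)
        exact = circle_distance(p, center, radius)
        print(p, "box:", lower, "circle:", exact)
```

For a point on a diagonal from the circle, the box corner is closer than the circle's edge, so the two distances differ and the exact value has to be recomputed from the heap tuple.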

Alexander Korotkov and me

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/35fcb1b3d038a501f3f4c87c05630095abaaadab

Modified Files
--------------
doc/src/sgml/gist.sgml                     |   35 ++-
src/backend/access/gist/gistget.c          |   30 ++-
src/backend/access/gist/gistproc.c         |   37 +++
src/backend/access/gist/gistscan.c         |    5 +
src/backend/executor/nodeIndexscan.c       |  379 +++++++++++++++++++++++++++-
src/backend/optimizer/plan/createplan.c    |   73 ++++--
src/backend/utils/adt/geo_ops.c            |   27 ++
src/include/access/genam.h                 |    3 +
src/include/access/relscan.h               |    9 +
src/include/catalog/catversion.h           |    2 +-
src/include/catalog/pg_amop.h              |    2 +
src/include/catalog/pg_amproc.h            |    2 +
src/include/catalog/pg_operator.h          |    8 +-
src/include/catalog/pg_proc.h              |    4 +
src/include/nodes/execnodes.h              |   20 ++
src/include/nodes/plannodes.h              |   10 +-
src/include/utils/geo_decls.h              |    3 +
src/test/regress/expected/create_index.out |   78 ++++++
src/test/regress/sql/create_index.sql      |   12 +
19 files changed, 699 insertions(+), 40 deletions(-)


Re: pgsql: Allow GiST distance function to return merely a lower-bound.

From
Fujii Masao
Date:
On Fri, May 15, 2015 at 8:27 PM, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:
> Allow GiST distance function to return merely a lower-bound.
>
> The distance function can now set *recheck = true, like index quals. The
> executor will then re-check the ORDER BY expressions, and use a queue to
> reorder the results on the fly.
>
> This makes it possible to do kNN-searches on polygons and circles, which
> don't store the exact value in the index, but just a bounding box.
>
> Alexander Korotkov and me
>
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/35fcb1b3d038a501f3f4c87c05630095abaaadab
>
> Modified Files
> --------------
> doc/src/sgml/gist.sgml                     |   35 ++-
> src/backend/access/gist/gistget.c          |   30 ++-
> src/backend/access/gist/gistproc.c         |   37 +++
> src/backend/access/gist/gistscan.c         |    5 +
> src/backend/executor/nodeIndexscan.c       |  379 +++++++++++++++++++++++++++-
> src/backend/optimizer/plan/createplan.c    |   73 ++++--
> src/backend/utils/adt/geo_ops.c            |   27 ++
> src/include/access/genam.h                 |    3 +
> src/include/access/relscan.h               |    9 +
> src/include/catalog/catversion.h           |    2 +-
> src/include/catalog/pg_amop.h              |    2 +
> src/include/catalog/pg_amproc.h            |    2 +
> src/include/catalog/pg_operator.h          |    8 +-
> src/include/catalog/pg_proc.h              |    4 +
> src/include/nodes/execnodes.h              |   20 ++
> src/include/nodes/plannodes.h              |   10 +-
> src/include/utils/geo_decls.h              |    3 +
> src/test/regress/expected/create_index.out |   78 ++++++
> src/test/regress/sql/create_index.sql      |   12 +
> 19 files changed, 699 insertions(+), 40 deletions(-)

It seems this patch causes the pg_trgm regression test to fail.
The regression diff that I got is:

*** /home/postgres/pgsql/head/contrib/pg_trgm/expected/pg_trgm.out 2013-07-23 16:46:22.212488785 +0900
--- /home/postgres/pgsql/head/contrib/pg_trgm/results/pg_trgm.out 2015-05-15 20:59:16.574926732 +0900
***************
*** 2332,2343 ****
  (3 rows)

  select t <-> 'q0987wertyu0988', t from test_trgm order by t <-> 'q0987wertyu0988' limit 2;
!  ?column? |      t
! ----------+-------------
!  0.411765 | qwertyu0988
!       0.5 | qwertyu0987
! (2 rows)
!
  drop index trgm_idx;
  create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
  set enable_seqscan=off;
--- 2332,2338 ----
  (3 rows)

  select t <-> 'q0987wertyu0988', t from test_trgm order by t <-> 'q0987wertyu0988' limit 2;
! ERROR:  index returned tuples in wrong order
  drop index trgm_idx;
  create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
  set enable_seqscan=off;

Regards,

--
Fujii Masao


Re: pgsql: Allow GiST distance function to return merely a lower-bound.

From
Heikki Linnakangas
Date:
On 05/15/2015 03:05 PM, Fujii Masao wrote:
> It seems this patch causes the pg_trgm regression test to fail.
> The regression diff that I got is:
>
> *** /home/postgres/pgsql/head/contrib/pg_trgm/expected/pg_trgm.out 2013-07-23 16:46:22.212488785 +0900
> --- /home/postgres/pgsql/head/contrib/pg_trgm/results/pg_trgm.out 2015-05-15 20:59:16.574926732 +0900
> ***************
> *** 2332,2343 ****
>    (3 rows)
>
>    select t <-> 'q0987wertyu0988', t from test_trgm order by t <-> 'q0987wertyu0988' limit 2;
> !  ?column? |      t
> ! ----------+-------------
> !  0.411765 | qwertyu0988
> !       0.5 | qwertyu0987
> ! (2 rows)
> !
>    drop index trgm_idx;
>    create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>    set enable_seqscan=off;
> --- 2332,2338 ----
>    (3 rows)
>
>    select t <-> 'q0987wertyu0988', t from test_trgm order by t <-> 'q0987wertyu0988' limit 2;
> ! ERROR:  index returned tuples in wrong order
>    drop index trgm_idx;
>    create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>    set enable_seqscan=off;

Hmm, OK. pg_trgm works for me, but I'll take a look. (rover_firefly also
went red, due to rounding differences in the regression test.)

- Heikki