KNN-GiST with recheck

Поиск

Список

Период

Сортировка

От	Alexander Korotkov
Тема	KNN-GiST with recheck
Дата	13 января 2014 г. 20:17:38
Msg-id	CAPpHfdu_qBLNRnv-r=_tofZYYa6-r=Z5_MGF4_phAOkWcYxfRg@mail.gmail.com обсуждение исходный текст
Ответы	Re: KNN-GiST with recheck (Heikki Linnakangas <hlinnakangas@vmware.com>) Re: KNN-GiST with recheck (Peter Geoghegan <pg@heroku.com>)
Список	pgsql-hackers

Дерево обсуждения

Hackers!

This patch was split from thread:

http://www.postgresql.org/message-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com

I've split it to separate thead, because it's related to partial sort only conceptually not technically. Also I renamed it to "knn-gist-recheck" from "partial-knn" as more appropriate name. In the attached version docs are updated. Possible weak point of this patch design is that it fetches heap tuple from GiST scan. However, I didn't receive any notes about its design, so, I'm going to put it to commitfest.

Here goes a desription of this patch same as in original thread.

KNN-GiST provides ability to get ordered results from index, but this order is based only on index information. For instance, GiST index contains bounding rectangles for polygons, and we can't get exact distance to polygon from index (similar situation is in PostGIS). In attached patch, GiST distance method can set recheck flag (similar to consistent method). This flag means that distance method returned lower bound of distance and we should recheck it from heap.

See an example.

create table test as (select id, polygon(3+(random()*10)::int, circle(point(random(), random()), 0.0003 + random()*0.001)) as p from generate_series(1,1000000) id);

create index test_idx on test using gist (p);

We can get results ordered by distance from polygon to point.

postgres=# select id, p <-> point(0.5,0.5) from test order by p <-> point(0.5,0.5) limit 10;

id | ?column?

--------+----------------------

755611 | 0.000405855808916853

807562 | 0.000464123777564343

437778 | 0.000738524708741959

947860 | 0.00076250998760724

389843 | 0.000886362723569568

17586 | 0.000981960100555216

411329 | 0.00145338112316853

894191 | 0.00149399559703506

391907 | 0.0016647896049741

235381 | 0.00167554614889509

(10 rows)

It's fast using just index scan.

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230 rows=10 loops=1)

-> Index Scan using test_idx on test (cost=0.29..157672.29 rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)

Order By: (p <-> '(0.5,0.5)'::point)

Total runtime: 0.305 ms

(4 rows)

------
With best regards,
Alexander Korotkov.

Вложения

knn-gist-recheck-1.patch.gz

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Alexander Korotkov
Дата: 13 января 2014 г., 20:07:50
Сообщение: Re: GIN improvements part 1: additional information

Следующее

От: Andrew Dunstan
Дата: 13 января 2014 г., 20:30:34
Сообщение: Re: nested hstore patch

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

KNN-GiST with recheck

Вложения

Предыдущее

Следующее