Re: [PATCH] kNN for btree

Поиск
Список
Период
Сортировка
От Nikita Glukhov
Тема Re: [PATCH] kNN for btree
Дата
Msg-id 146f4c47-cc52-a06d-343f-d152a20524e3@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [PATCH] kNN for btree  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Ответы Re: [PATCH] kNN for btree  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers

Attached 5th version of the patches.

On 11.01.2019 2:21, Alexander Korotkov wrote:

Hi!

I've couple more notes regarding this patch.
1) There are two loops over scan key determining scan strategy:
existing loop in _bt_first(), and in new function
_bt_select_knn_search_strategy().  It's kind of redundant that we've
to process scan keys twice for knn searches.  I think scan keys
processing should be merged into one loop.
This redundant loop was removed and code from _bt_select_knn_search_strategy()
was moved into the new function _bt_emit_scan_key() extracted from
_bt_preprocess_keys().
2) We're not supporting knn ordering only using the first key.
Supporting multiple knn keys would require significant reword of
B-tree traversal algorithm making it closer to GiST and SP-GiST.  So,
I think supporting only one knn key is reasonable restriction, at
least for now.  But we could support single-key knn ordering in more
cases.  For instance, knn search in "SELECT * FROM tbl WHERE a =
const1 ORDER BY b <-> const2" could benefit from (a, b) B-tree index.
So, we can support knn search on n-th key if there are equality scan
keys for [1, n-1] index keys.
I will try to implement this in the next version of the patch.

I also note that you've removed implementation of distance functions
from btree_gist.  But during pg_upgrade extensions are moved "as is".
Not just CREATE EXTENSION command is dumped, but the whole extension
content.  pg_upgrade'd instances would have old version of extension
metadata with new .so until ALTER EXTENSION UPDATE. So, user would get
errors about missed function in .so until updates the extension.

We're typically evade this by inclusion of old functions into new .so.
Then user can work normally before extension update.  In this
particular case, we can leave the distance functions in the .so, but
make them just wrappers over core functions.

Wrappers over core functions were left in btree_gist.

I've run regression tests with patch applied and opr_sanity showed some errors:

1) date_dist_timestamptz(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be
stable, not immutable.  These functions use timezone during
conversion.

Fixed.
2) date_dist_timestamp(), date_dist_timestamptz(),
timestamp_dist_date(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be not
leafproof.  These functions perform conversion, which might fail in
corner case.  So, this error should be considered as a leak.
All new distance functions except oiddist() are not leakproof,
so I had to relax condition in opr_sanity.sql test:

- pp.proleakproof != po.proleakproof
+ (NOT pp.proleakproof AND po.proleakproof))


--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: speeding up planning with partitions
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: port of INSTALL file generation to XSLT