Re: [HACKERS] [PATCH] kNN for SP-GiST

Поиск
Список
Период
Сортировка
От Nikita Glukhov
Тема Re: [HACKERS] [PATCH] kNN for SP-GiST
Дата
Msg-id c6957e2c-3052-4fce-55ac-2cf7bcfd0dd9@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] [PATCH] kNN for SP-GiST  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Ответы Re: [HACKERS] [PATCH] kNN for SP-GiST  (Andrey Borodin <x4mmm@yandex-team.ru>)
Список pgsql-hackers

Attached 5th version of the patches, where minor refactoring of distance
handling was done (see below).

On 02.07.2018 19:06, Alexander Korotkov wrote:

Hi!

On Fri, Jun 29, 2018 at 5:37 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
On 06.03.2018 17:30, David Steele wrote:

I agree with Andres.  Pushing this patch to the next CF.
Attached 4th version of the patches rebased onto the current master.
Nothing interesting has changed from the previous version.
I took a look to this patchset.  In general, it looks good for me, but
I've following notes about it.

* We're going to replace scan stack with pairing heap not only for
KNN-search, but for regular search as well.  Did you verify that it
doesn't cause regression for regular search case, because insertion
into pairing heap might be slower than insertion into stack?  One may
say that it didn't caused regression in GiST, and that's why it
shouldn't cause regression in SP-GiST.  However, SP-GiST trees might
be much higher and these performance aspects might be different.
I decided to bring back the List-based scan stack in the previous version 
of the patches, and here is how spgAddSearchItemToQueue() looks like now:

static void
spgAddSearchItemToQueue(SpGistScanOpaque so, SpGistSearchItem *item)
{   if (so->queue)       pairingheap_add(so->queue, &item->phNode);   else       so->scanStack = lcons(item, so->scanStack);
}

so->queue is initialized in spgrescan() only for ordered searches.

* I think null handling requires better explanation. Nulls are
specially handled in pairingheap_SpGistSearchItem_cmp().  In the same
time you explicitly set infinity distances for leaf nulls.  You
probably have reasons to implement it this way, but I think this
should be explained.  Also isnull property of SpGistSearchItem doesn't
have comment.
Distances for NULL items are expected to be NULL (it would not be true for
non-strict the distance operators, so we do not seem to support them here),
and so NULL items are considered to be greater than any non-NULL items in
pairingheap_SpGistSearchItem_cmp().  Distances are copied into SpGistSearchItem 
only if it is non-NULL item.  Also in the last version of the patch I have
introduced spgAllocSearchItem() which conditionally allocates distance-array in
SpGistSearchItem.  Now inifinity distances are used only in one place, but
if we require that innerConsistent() should always return distances, then
we can completely get rid of so->infDistances field.   
* I think KNN support should be briefly described in
src/backed/access/spgist/README.
A minimal description written by the original author Vlad Sterzhanov
already present in README.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Non-reserved replication slots and slot advancing
Следующее
От: Srinivas Karthik V
Дата:
Сообщение: Re: Bulk Insert into PostgreSQL