Thread: GiST: interpretation of NaN from penalty function

GiST: interpretation of NaN from penalty function

From: Andrew Borodin
Date:
Hi hackers!

Currently GiST treats a NaN penalty as a zero penalty; in terms of a
generalized tree this means "perfect fit". I think that this situation
should be considered "worst fit" instead.
Here is a patch to highlight the place in the code.
I could not construct a test that generates a bad tree which this patch
would fix. There are not many cases in which you get NaN, and none of
them can result from ordinary additions and multiplications of real
values.
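
For illustration, a minimal C sketch of where a NaN can come from in a
penalty of the form "size after the union minus size before". The names
interval_size and growth_penalty are hypothetical and are not taken from
the PostgreSQL sources; the sketch only assumes IEEE 754 doubles.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: length of the 1-D interval [lo, hi]. */
double
interval_size(double lo, double hi)
{
    return hi - lo;
}

/*
 * Hypothetical penalty: how much [lo, hi] grows when it is extended to
 * cover the point p.  The caller must pass a well-formed interval.
 */
double
growth_penalty(double lo, double hi, double p)
{
    double      new_lo;
    double      new_hi;

    assert(lo <= hi);

    new_lo = (p < lo) ? p : lo;
    new_hi = (p > hi) ? p : hi;

    /* inf - inf is NaN, so an unbounded interval poisons the result. */
    return interval_size(new_lo, new_hi) - interval_size(lo, hi);
}
```

With ordinary finite bounds the result is finite, e.g.
growth_penalty(0.0, 1.0, 3.0) is 2.0. Once a bound is infinite, both
sizes are infinite and their difference is NaN.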

Am I missing something? Is there any case where NaN should be considered a good fit?

Greg Stark was talking about this in
BANLkTi=d+bPpS1cM4YC8KuKHj63Hwj4LMA@mail.gmail.com but that topic
didn't go far (due to triangles). I'm currently messing with floats in
penalties, very close to NaNs, and I think this question can be
settled.

Regards, Andrey Borodin.

Re: GiST: interpretation of NaN from penalty function

From: Tom Lane
Date:
Andrew Borodin <borodin@octonica.com> writes:
> Currently GiST treats a NaN penalty as a zero penalty; in terms of a
> generalized tree this means "perfect fit". I think that this situation
> should be considered "worst fit" instead.

On what basis?  It seems hard to me to make any principled argument here.
Certainly, "NaN means infinity", as your patch proposes, has no more basis
to it than "NaN means zero".  If the penalty function doesn't like that
interpretation, it shouldn't return NaN.
        regards, tom lane



Re: GiST: interpretation of NaN from penalty function

From: Andrew Borodin
Date:
> Certainly, "NaN means infinity", as your patch proposes, has no more basis to it than "NaN means zero".
You are absolutely right. Now I see that the best interpretation is "NaN
means NaN". It seems we only need to drop the check. Nodes with NaN
penalties will then be considered even worse than those with an infinite
penalty.
Penalty calculation is CPU performance critical: it is called for
every tuple on a page along the insertion path. Omitting this check will
speed it up... a tiny bit.
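
To make the three interpretations concrete, here is a small C sketch of
a "pick the smallest penalty" loop. It is not the gistchoose() code; the
nan_policy enum and the functions apply_policy and choose_best are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical policies for a NaN returned by the penalty function. */
typedef enum
{
    NAN_AS_ZERO,                /* "perfect fit" */
    NAN_AS_INFINITY,            /* "worst fit" */
    NAN_AS_IS                   /* no check at all */
} nan_policy;

/* Map a raw penalty according to the chosen policy. */
float
apply_policy(float penalty, nan_policy policy)
{
    if (isnan(penalty))
    {
        if (policy == NAN_AS_ZERO)
            return 0.0f;
        if (policy == NAN_AS_INFINITY)
            return INFINITY;
    }
    return penalty;
}

/*
 * Return the index of the candidate with the smallest penalty.  The
 * first candidate is the initial best; later ones win only on a strict
 * "<", which is always false when either side is NaN.
 */
size_t
choose_best(const float *penalties, size_t n, nan_policy policy)
{
    size_t      best = 0;
    float       best_penalty;

    assert(n > 0);
    best_penalty = apply_policy(penalties[0], policy);

    for (size_t i = 1; i < n; i++)
    {
        float       p = apply_policy(penalties[i], policy);

        if (p < best_penalty)
        {
            best = i;
            best_penalty = p;
        }
    }
    return best;
}
```

For the penalties {3, NaN, 1} the sketch picks index 1 under
NAN_AS_ZERO and index 2 under the other two policies. Note that in this
particular loop an unchecked NaN never beats an existing best, but a NaN
in the first position is never beaten either, so the outcome without the
check depends on the order of the candidates.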
> If the penalty function doesn't like that interpretation, it shouldn't return NaN.
It may return NaN accidentally. If a NaN passes through the union()
function, the index will be poisoned.
That's not a good contract for an extension developer to have to remember.
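
A rough C sketch of what "poisoned" could look like, assuming a toy 1-D
interval key rather than any real opclass. The type interval and the
functions interval_union, interval_penalty and interval_is_sane are all
hypothetical; the last one is the kind of guard an extension would have
to remember to call.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 1-D key: a closed interval. */
typedef struct
{
    double      lo;
    double      hi;
} interval;

/* Extend *key in place so that it also covers add. */
void
interval_union(interval *key, interval add)
{
    assert(key != NULL);

    /* Comparisons with NaN are false, so a NaN bound is never replaced. */
    if (add.lo < key->lo)
        key->lo = add.lo;
    if (add.hi > key->hi)
        key->hi = add.hi;
}

/* Growth of the key's length if add were merged into it. */
double
interval_penalty(interval key, interval add)
{
    interval    merged = key;

    interval_union(&merged, add);
    return (merged.hi - merged.lo) - (key.hi - key.lo);
}

/* Guard an extension could run before storing or returning a key. */
bool
interval_is_sane(interval v)
{
    return !isnan(v.lo) && !isnan(v.hi) && v.lo <= v.hi;
}
```

In this sketch, once a key holds a NaN bound, every later union keeps
that bound and every penalty computed against the key is NaN as well.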


Regards, Andrey Borodin.