Re: CUBE_MAX_DIM

From Darafei "Komяpa" Praliaskouski
Subject Re: CUBE_MAX_DIM
Date
Msg-id CAC8Q8tLA8jO5nj20CqUXe+m4HqSrBLoRS2aFsuQKdXpmhJh5OQ@mail.gmail.com
In reply to CUBE_MAX_DIM  (Devrim Gündüz <devrim@gunduz.org>)
List pgsql-hackers
Hello,

The problem with higher-dimension cubes is that, starting at a dimensionality of ~52, the "distance" metrics in a 64-bit float have less than a single bit of mantissa per dimension, making cubes indistinguishable. Developers of facial recognition software discussed this in the Russian Postgres Telegram group https://t.me/pgsql. They had 128-dimensional points and recompiled Postgres to allow them, but the distances weren't helpful, and GiST KNN degraded severely, to almost full scans. They had to reduce the number of facial features to make KNN search work.
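One way to read the mantissa argument above, as a minimal Python sketch rather than anything taken from the cube code itself: a 64-bit float carries a 53-bit significand, so dividing that budget across dimensions leaves less than one bit each beyond roughly 52 dimensions. The helpers `bits_per_dimension` and `squared_distance` and the sample points are hypothetical, chosen only to illustrate how a small per-dimension difference can vanish when squared differences are accumulated in double precision.

```python
import sys

# Significand width of a Python float: 53 bits for IEEE 754 binary64.
MANTISSA_BITS = sys.float_info.mant_dig


def bits_per_dimension(ndim):
    """Average significand bits available per dimension (hypothetical helper)."""
    return MANTISSA_BITS / ndim


def squared_distance(a, b):
    """Squared Euclidean distance, accumulated in plain double precision."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
    return total


NDIM = 128
origin = [0.0] * NDIM
p = [1.0] + [0.0] * (NDIM - 1)
# q differs from p in one coordinate; its squared contribution is 2**-54,
# which is below half an ulp of 1.0, so the running sum absorbs it.
q = [1.0, 2.0 ** -27] + [0.0] * (NDIM - 2)

for ndim in (3, 52, 100, 128, 500):
    print(ndim, round(bits_per_dimension(ndim), 3))

# Distinct points, identical computed distance from the origin.
print(p != q, squared_distance(origin, p) == squared_distance(origin, q))
```

Here `p` and `q` are different points, yet their computed squared distances from the origin compare equal, so an ordering by distance cannot tell them apart. Real feature vectors will not look like this contrived pair; the sketch only shows the mechanism.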

Floating-point overflow isn't much of a risk per se: in the worst case the result becomes Infinity or 0, which is usually acceptable in those contexts.
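A small Python sketch of that behaviour, assuming IEEE 754 doubles (which is what Python floats are on common platforms); the variable names are illustrative only. Multiplying past the representable range yields Infinity, and multiplying below it yields 0, and both values still compare sensibly against finite distances.

```python
import math

big = 1e200
tiny = 1e-200

overflowed = big * big      # exceeds the largest finite double
underflowed = tiny * tiny   # falls below the smallest subnormal double

print(math.isinf(overflowed), underflowed == 0.0)

# Both results still order correctly relative to an ordinary finite distance.
finite_distance = 42.0
print(underflowed < finite_distance < overflowed)
```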

While higher-dimension cubes are mathematically possible, they come with implementation issues. I'm OK with raising the limit if such nuances get a mention in the docs.

On Thu, Jun 25, 2020 at 1:01 PM Devrim Gündüz <devrim@gunduz.org> wrote:

Hi,

Someone contacted me about increasing CUBE_MAX_DIM
in contrib/cube/cubedata.h (in the community RPMs). The current value
is 100 with the following comment:

* This limit is pretty arbitrary, but don't make it so large that you
* risk overflow in sizing calculations.


They said they use 500, and never had a problem. I never added such patches to the RPMs, and will not -- but wanted to ask if we can safely increase it upstream?

Regards,

--
Devrim Gündüz
Open Source Solution Architect, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR


--
Darafei Praliaskouski
