Re: Bug in brin_minmax_multi_distance_numeric()
От | Peter Eisentraut |
---|---|
Тема | Re: Bug in brin_minmax_multi_distance_numeric() |
Дата | |
Msg-id | d48b2a78-7ce0-46d8-b096-2ff2331aa819@eisentraut.org обсуждение исходный текст |
Ответ на | Re: Bug in brin_minmax_multi_distance_numeric() (Tomas Vondra <tomas@vondra.me>) |
Ответы |
Re: Bug in brin_minmax_multi_distance_numeric()
|
Список | pgsql-hackers |
On 01.08.25 19:17, Tomas Vondra wrote: > > > On 7/31/25 20:35, Tom Lane wrote: >> Tomas Vondra <tomas@vondra.me> writes: >>> On 7/31/25 19:33, Tom Lane wrote: >>>> ... It is certainly broken on >>>> 32-bit machines where the Datum result of numeric_float8 will >>>> be a pointer, so that we will convert the numeric pointer value >>>> to a float and return that, yielding a totally-garbage distance >>>> value. But I think it's broken on 64-bit machines too, because >>>> we'll be interpreting the output of numeric_float8 as a uintptr_t >>>> and applying some unwanted conversion to make that a float. >> >>> Agreed it's a bug on 32-bit machines. Not sure about 64-bits. >> >> Yeah, I'm not 100% sure about that. It's certainly doing something >> unexpected, but we might accidentally end up with relatively-sane >> relative distance comparisons anyway. (I assume the outputs will >> only be compared to other outputs of the same function, right?) > > > Yes. The index accumulates values, sorts them, calculates distance > between the points, and them "merges" the closest ones. So it only > compares results of the same function. > >> I have a vague recollection that the IEEE float format was chosen with >> an eye to making comparisons cheap, ie not too much different from >> integer comparisons. So the sort order might be about the same >> even after incorrectly reinterpreting the bit-pattern as an int. >> NaNs probably mess that up, but they would anyway. >> >>> Actually, no - it should not cause "broken" indexes, as in "giving >>> incorrect results". The distance functions determine in what order we >>> merge points into ranges, and if the distances are bogus then we can >>> build a summary that is less efficient. >> >> Got it. So it might be worth reindexing such indexes after the >> fix, but it's not strictly necessary. Do we want to make a separate commit for this issue that can be backpatched and have some user-facing information attached to it?
В списке pgsql-hackers по дате отправления: