Обсуждение: Box type equality
Hi. I've faced an issue with Box type comparison that exists almost for a five years. > create table t (b box); CREATE TABLE > select count(*), b from t group by b; ERROR: could not identify an equality operator for type box As mentioned in http://www.postgresql.org/message-id/Pine.LNX.4.64.1012040051500.12632@sn.sai.msu.ru That can be fixed by b-tree equality for boxes, but we need some decisions there. We can compare floats up to certain threshold or strictly, and box equality can be defined as coordinate equality or as equality of areas. In a case of threshold-based comparison we will not lose equality due to rounding errors, for example applying commutative functions in different order will preserve equality, but we can face non-transitive equalities, like box1 == box2, box2 == box3, but box1 != box3 and GROUP BY can produce strange results. With strict comparison equality istransitive, but we can lose that equality due to rounding. Now in geo_decls.h we have: #define EPSILON 1.0E-06 #define FPeq(A,B) (fabs((A) - (B)) <= EPSILON) and equality means equal areas. But for GROUP BY it better be opposite: equality of coordinates and strict comparison. Simplest workaround in my perspective is to add another operator for box type (e.g. ==) that will perform strict comparison of coordinates and use it in b-tree ops. Any objections/suggestions? ----------------- Stas Kelvich Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Stanislav Kelvich <s.kelvich@postgrespro.ru> writes: > I've faced an issue with Box type comparison that exists almost for a five years. Try twenty-five years. The code's been like that since Berkeley. > That can be fixed by b-tree equality for boxes, but we need some > decisions there. The problem with inventing a btree opclass for boxes is much more fundamental than fuzzy comparisons, unfortunately. Btree requires a linear sort order, and there's no plausible linear ordering of boxes, unless you compare areas which won't give the equality semantics you want. We could perhaps invent an exact-equality operator and construct just a hash opclass for it, no btree. In any case I think it would be a mistake to consider only boxes; all the built-in geometric types have related issues. regards, tom lane
The Box type is the oldest non-linear type in the Postgres system. We used it as the template for extensibility in the original system about thirty years ago. We had R-trees for box indexing. If you want fuzzy box matching, that seems possible with R-trees and some creativity by say matching an R-tree with a little larger box and using containment and maybe also not contained by a smaller box. This is the idea behind strategies. That you can use existing operations to build a new operation. If you have to force this onto B-tree's I think you will have to choose one edge to index on (i.e. one of the four values) then fuzzy match that through the index and have a secondary condition to further restrict the matches. As with all the geometric types, you can use containment boxes for them and have the secondary condition checks. It's all just a few lines of code as Stonebraker used to say. Jeff Anton On 09/29/15 08:43, Tom Lane wrote: > Stanislav Kelvich <s.kelvich@postgrespro.ru> writes: >> I've faced an issue with Box type comparison that exists almost for a five years. > > Try twenty-five years. The code's been like that since Berkeley. > >> That can be fixed by b-tree equality for boxes, but we need some >> decisions there. > > The problem with inventing a btree opclass for boxes is much more > fundamental than fuzzy comparisons, unfortunately. Btree requires a > linear sort order, and there's no plausible linear ordering of boxes, > unless you compare areas which won't give the equality semantics you want. > > We could perhaps invent an exact-equality operator and construct just > a hash opclass for it, no btree. > > In any case I think it would be a mistake to consider only boxes; all > the built-in geometric types have related issues. > > regards, tom lane > >