Behavior of equality_oper and ordering_oper
От | Tom Lane |
---|---|
Тема | Behavior of equality_oper and ordering_oper |
Дата | |
Msg-id | 18472.1060960936@sss.pgh.pa.us обсуждение исходный текст |
Ответы |
Re: Behavior of equality_oper and ordering_oper
(Joe Conway <mail@joeconway.com>)
|
Список | pgsql-hackers |
There are a couple of routines in parse_oper.c that are intended to look up appropriate '=' and '<' operators given a datatype OID. These are used in quite a variety of places, for example to determine the semantics of GROUP BY. It's bothered me for some time that these routines depend on actually looking for an operator named '='. (ordering_oper used to depend on looking for the name '<', too, but at the moment it looks for a merge sort operator associated with '='.) The system in general does not assume that operator names determine operator semantics, so depending on names here is a big wart. equality_oper tries to apply some additional checks to verify that '=' really does behave like equality, but that's just a kluge. What's worse, as of 7.3 the lookup results could vary depending on your schema search path. This is bad news, especially for user-defined types. Up to now I've seen no way around this; I've wished we could have a type's pg_type entry link directly to the proper '=' operator, but there are some circularity issues there. (CREATE TYPE couldn't set the link because the operator wouldn't exist yet when you're creating the type.) Today it occurred to me that we could look in pg_opclass for a default btree opclass for the datatype. If we find one, then the Equal and Less members of the opclass are the operators we want. (If we don't find one, we could try for a default hash opclass, which would give us Equal, but not Less, for a few additional datatypes.) This seems like a much cleaner approach for two reasons: the opclass structure declares directly that the operators have the semantics we are looking for, and the search is not dependent on schema visibility. (We only allow one default opclass per datatype/AM, so the result would be unique.) This lookup would perhaps be a little slower than the name-based operator lookup, since AFAICS there's no way to use the catcaches for it. However, we already realized we need a datatype-to-comparison-operator lookup cache to avoid memory leakage in array_cmp. If we cache the results in a specialized hashtable then it certainly won't be any slower than what the code does now. There are a few built-in types for which this would change the behavior. Presently, these operators are believed by equality_oper to be '=', but do not have any corresponding btree or hash opclass: =(tid,tid) | tideq=(box,box) | box_eq=("path","path") | path_n_eq=(tinterval,tinterval) | tintervaleq=(money,money) | cash_eq=(aclitem,aclitem) | aclitemeq=(circle,circle) | circle_eq=(lseg,lseg) | lseg_eq=(line,line) | line_eq In several of these cases, equality_oper is actually wrong --- box_eq for example compares areas, which is not what one would consider the normal equality behavior for boxes. The only ones that really ought to be found are the ones for TID, MONEY, and ACLITEM. I'm not particularly concerned about losing the ability to group by any of those datatypes, but if anyone is, we could talk about forcing an initdb to add the necessary comparison operators. There are also a small number of operators that are found by ordering_oper but would not be found by a btree opclass search: <(box,box) | box_eq<(reltime,reltime) | reltimeeq<(money,money) | cash_eq<(circle,circle) | circle_eq Again, I'm not too concerned about this; only MONEY actually has comparison semantics that would support a btree index, but if it's not btree-indexable does it need to be groupable? Comments, objections? regards, tom lane
В списке pgsql-hackers по дате отправления: