Обсуждение: IN and ANY
The IN and ANY constructs works with different kinds of collections to the right. This is the current situation: SUB QUERY VALUE LIST ARRAY IN X X ANY X X The SQL standard specifies that = ANY should be equivalent to IN (including the value list case) but it is not. Take for example this: select 'a' = ANY ('{a,b}'); which according to the SQL spec should evaluate to False but in pg evaluates to True. Look for at this run in pg: dennis=# select 'a' IN ('{a,b}');?column? ----------f (1 rad) dennis=# select 'a' = ANY ('{a,b}');?column? ----------t I only have the SQL200x draft to check. And in section 8.4 it's clearly stated that "rvc IN ipv" is equivalent to "rvc = ANY ipv" and an ipv is defined to be either a subquery or a value list. *** some time later *** Hmm, the draft seems to be broken since I can only find ANY defined for subqueries in other sections, and not for value lists. Strange but not uncommon. Now I don't know what the standard says about this. Maybe someone with the sql99 spec wants to check. The question is also what makes sense. I'm not too fond of the above situation, but I'm not sure if anything can be done now. -- /Dennis Björklund
Dennis Bjorklund <db@zigo.dhs.org> writes: > Hmm, the draft seems to be broken since I can only find ANY defined for > subqueries in other sections, and not for value lists. Strange but not > uncommon. Now I don't know what the standard says about this. Maybe > someone with the sql99 spec wants to check. I think you are reading the term "equivalent" as meaning an equivalence in both directions. It looks to me that the spec's definition of <in predicate> is (mis)using the term to mean "is defined as". In SQL92 I see 1) Let IVL be an <in value list>. ( IVL ) is equivalent to the <table value constructor>: ( VALUES IVL ) ... 4) The expression RVC IN IPV is equivalent to RVC = ANY IPV These two rules together define both forms of IN in terms of the "= ANY (subquery)" construct. But surely the first rule is not meant to say that VALUES is a noise word. So this has to be a one-way implication. Accordingly I think you are in error to suggest that "= ANY (valuelist)" is supposed to work. I think ANY is only supposed to have a table subquery to the right. I don't have a strong opinion about "IN array", but am worried that allowing it would create ambiguity about which interpretation is meant. Is the left-hand side supposed to be compared against the whole array or each array member? regards, tom lane
Tom Lane wrote: > I don't have a strong opinion about "IN array", but am worried that > allowing it would create ambiguity about which interpretation is meant. > Is the left-hand side supposed to be compared against the whole array or > each array member? That was exactly the problem. IN (array) does work if the left-hand side is also an array, and the right-hand side is taken as a value list: regression=# select 1 where array[1] in (array[1]); ?column? ---------- 1 (1 row) regression=# select 1 where array[1] in (array[1,2]); ?column? ---------- (0 rows) There was discussion on this back in the May-June 2003 timeframe -- please check the archives. Joe
On Tue, 2 Mar 2004, Joe Conway wrote: > regression=# select 1 where array[1] in (array[1]); > ?column? > ---------- > 1 > (1 row) > > regression=# select 1 where array[1] in (array[1,2]); > ?column? > ---------- > (0 rows) The only reason I brought it up was that to me ANY and IN are pretty much the same kind of operator. Up until now I thought that IN really was the same as =ANY, but it isn't. I don't like that but of course I have to accept it. I would rather have had some elem function/operator for arrays like: 1 ELEM IN array[1,2] or 1 = ANY ELEM arra[1,2] But as usual I'm 6 months late with my objections :-) -- /Dennis Björklund
Dennis Bjorklund wrote: > The only reason I brought it up was that to me ANY and IN are pretty much > the same kind of operator. Up until now I thought that IN really was the > same as =ANY, but it isn't. I don't like that but of course I have to > accept it. I would rather have had some elem function/operator for arrays > like: I wrestled with the same issue last year. Offhand I don't know how hard it would be to do, but I wonder if when we see: argL IN (argR) we might be able to determine if argL and argR are of the same type or if argR is an array of argL, and act accordingly. But then we need to decide, what happens when we see: argL IN (argR1, argR2, ...) and argRn data type is an array of argL data type? Do we check all the argRn elements individually and return true if any of them equal argL? I'd guess so. > But as usual I'm 6 months late with my objections :-) An uncanny ability ;-) Joe
Joe Conway <mail@joeconway.com> writes: > But then we need to decide, what happens when we see: > argL IN (argR1, argR2, ...) > and argRn data type is an array of argL data type? Do we check all the > argRn elements individually and return true if any of them equal argL? > I'd guess so. This seems like an awfully mistake-prone "feature", even if it's theoretically not ambiguous. I think we should leave things as they are. We'd not be adding any actual functionality, only some notational consistency, and that doesn't seem worth the risk of confusion as to exactly what IN will do. [ wanders away trying to remember that quote about "foolish consistency" ] regards, tom lane
Tom Lane wrote: > [ wanders away trying to remember that quote about "foolish consistency" ] > http://www.bartleby.com/59/3/foolishconsi.html "A foolish consistency is the hobgoblin of little minds" A great person does not have to think consistently from one day to the next. This remark comes from the essay “Self-Reliance” by Ralph Waldo Emerson. Emerson does not explain the difference between foolish and wise consistency. Joe
Joe Conway <mail@joeconway.com> writes: > I wrestled with the same issue last year. > > Offhand I don't know how hard it would be to do, but I wonder if when we see: > > argL IN (argR) > > we might be able to determine if argL and argR are of the same type or if argR > is an array of argL, and act accordingly. That would be nice and simple and clean, except that postgres's type system doesn't work that way. It works backwards from every other language. Instead of every variable and function return value having a rigid type and forcing the interpretation of the surrounding expression, postgres seems to work top down. So in this case if argL or argR are functions or other expressions with unknown types it tries to figure out how to interpret them to produce the type it's looking for. In other words, what type those expressions are depends on what the expression expects. What would you do with "foo() IN (array[1,2])" if there are two functions called foo, one which returns an integer and one which returns an integer[] ? One of these days I'm going to suggest doing away with the whole "unknown" concept and make function return types not part of the polymorphic signature. That would make a lot of the type system weirdness go away. But it would mean breaking with a lot of tradition. And of course it would be completely and utterly not backwards compatible and I'll probably be laughed at. -- greg
Greg Stark <gsstark@mit.edu> writes: > That would be nice and simple and clean, except that postgres's type system > doesn't work that way. It works backwards from every other language. Instead > of every variable and function return value having a rigid type and forcing > the interpretation of the surrounding expression, postgres seems to work top > down. [blinks...] I don't think so. regards, tom lane
Greg Stark wrote: > So in this case if argL or argR are functions or other expressions with > unknown types it tries to figure out how to interpret them to produce the type > it's looking for. In other words, what type those expressions are depends on > what the expression expects. What would you do with "foo() IN (array[1,2])" if > there are two functions called foo, one which returns an integer and one which > returns an integer[] ? Well, first off, it isn't now, nor has it ever been (to my knowledge), possible to overload a function with more than one return type. It also isn't possible to declare a polymorphic return type unless at least one argument is polymorphic (meaning there must be at least one argument), in which case the return type is absolutely deterministic at time of execution. I'm not sure it's worth discussing this too much further, due to Tom's objection to the concept in general, but anyway...you should use the source. Starting with gram.y, we have this: r_expr: row IN_P select_with_parens { SubLink *n = makeNode(SubLink); n->subLinkType = ANY_SUBLINK; n->lefthand= $1; n->operName = makeList1(makeString("=")); n->subselect = $3; $$ = (Node *)n; } Here you can see that "IN" is actually transformed internally to an "= ANY" SubLink node. In parse_expr.c you'll see (around line 518 in current sources) where the ANY SubLink is processed. Note that exprType() is used on the expressions in order to determine (recursively) the argument data types, so that the appropriate operator can be picked. Hence, if we *wanted* to know if exprType(right_expr) was an array of exprType(left_expr), we could. > One of these days I'm going to suggest doing away with the whole "unknown" > concept and make function return types not part of the polymorphic signature. > That would make a lot of the type system weirdness go away. But it would mean > breaking with a lot of tradition. That doesn't make too much sense either. Polymorphic return types were not possible prior to 7.4. Joe
Uhm. oops. I'll just crawl back under this rock for a bit longer before I speak on this topic again. sorry. -- greg