Обсуждение: functions in WHERE clause
Hi All, I've got one of these: SELECT * from some_table WHERE test_for_equality_is_syntactically_ugly; What I'd like to do is encapsulate the WHERE clause in a function, but I'm having no end of trouble. The WHERE clause expects the function to return a boolean value. I can certainly return a boolean value from a function, but here it seems to me that what the function really has to do is return a set of boolean values -- the test in the WHERE clause sometimes evaluates to true and sometimes evaluates to false, and that is in turn used to constrain the query results. But you can't return a set of anything (I don't think) in a WHERE clause, because it seems to want a singular boolean value. Is it possible to do what I'm trying to do? I've written a few simple sql and pl/pgsql functions over the years, but I'm no expert. Perhaps I'm barking up the wrong tree here. It seems like what I really need is a way to have some kind of rewrite rule (e.g. when you see "foo(x)" substitute ugly string). Sort of like a Lisp macro ;) Steve -- Stephen Ramsay Assistant Professor Department of English University of Georgia email: sramsay@uga.edu web: http://cantor.english.uga.edu/ PGP Public Key ID: 0xA38D7B11
On Sun, Mar 05, 2006 at 10:16:52AM -0500, sramsay@uga.edu wrote: > I've got one of these: > > SELECT * from some_table WHERE > test_for_equality_is_syntactically_ugly; > > What I'd like to do is encapsulate the WHERE clause in a function, > but I'm having no end of trouble. Would a view work? If not then please provide a more concrete example that shows what you're trying to do. CREATE VIEW foo AS SELECT * FROM some_table WHERE test_for_equality_is_syntactically_ugly; -- Michael Fuhr
On Sun, Mar 05, 2006 at 10:26:35AM -0700, Michael Fuhr wrote: > On Sun, Mar 05, 2006 at 10:16:52AM -0500, sramsay@uga.edu wrote: > > I've got one of these: > > > > SELECT * from some_table WHERE > > test_for_equality_is_syntactically_ugly; > > > > What I'd like to do is encapsulate the WHERE clause in a function, > > but I'm having no end of trouble. > > Would a view work? If not then please provide a more concrete > example that shows what you're trying to do. > > CREATE VIEW foo AS > SELECT * FROM some_table > WHERE test_for_equality_is_syntactically_ugly; Thanks for reply! No, because test_for_equality_is_syntactically_ugly varies. Here's the specifics: I've written an XML database (of sorts) that uses the ltree contrib module to find elements that belong to particular nodes. Right now, if you wanted to find all the word tokens found within the titleStmt element of a set of XML documents, you'd do something like this: SELECT token FROM event WHERE ltree ~ '*.titleStmt.*'; (ltree contains a set of "paths" that together express the hierarchical relationships in the documents, so you can format that bit after the ~ to include any element, and you can also use other kinds of operators to get ancestors, children, etc.). Now, I realize this isn't *really* all that syntactically ugly, but it would be really nice if the user of the db could type in an XPath like so: SELECT token FROM event WHERE xpath("//titleStmt"); This will require some munging inside a function to go from the XPath to the ltree-style regex expression, but I'm finding that I'm having more basic troubles. You can't do this: CREATE FUNCTION xpath(lquery) RETURNS ltree AS $$ SELECT ltree FROM event WHERE ltree ~ $1; $$ LANGUAGE SQL; Because that only returns the first value from the SELECT (and it's not pl-pgsql anyway). But I also can't get this kind of thing to work: CREATE FUNCTION xpath(lquery) RETURNS SETOF ltree AS $$ DECLARE tree record; BEGINFOR tree IN SELECT ltree FROM event WHERE ltree ~ $1 LOOP RETURN NEXT tree;END LOOP;RETURN; END $$ LANGUAGE plpgsql; Because SETOF won't work in a WHERE context. Part of the problem is that I don't really understand how WHERE constraints work under the hood. The lvalue in the WHERE clause expects a boolean, but I think that really means "evaluate the truth or falsehood of this expression on all rows in the table specified in the FROM clause." So how do you get a function to play nicely with that concept? "The argument of WHERE must not return a set," as the psql shell keeps telling me, but is there anything it would accept that would make this work? Thanks again, Steve -- Stephen Ramsay Assistant Professor Department of English University of Georgia email: sramsay@uga.edu web: http://cantor.english.uga.edu/ PGP Public Key ID: 0xA38D7B11
sramsay@uga.edu writes: > I've got one of these: > SELECT * from some_table WHERE > test_for_equality_is_syntactically_ugly; > What I'd like to do is encapsulate the WHERE clause in a function, You mean like replacing SELECT * from some_table WHERE x = 42 AND y = 77 with create function mytest(int,int) returns bool as $$select $1 = 42 AND $2 = 77$$ language sql; SELECT * from some_table WHERE mytest(x,y); A SQL-language function like this should get inlined into the query, so that you don't lose any performance compared to writing out the full expression each time. regards, tom lane
sramsay@uga.edu writes: > You can't do this: > CREATE FUNCTION xpath(lquery) RETURNS ltree AS $$ > SELECT ltree FROM event WHERE ltree ~ $1; > $$ LANGUAGE SQL; That would work fine if you said RETURNS SETOF ltree. > But I also can't get this kind of thing to work: > CREATE FUNCTION xpath(lquery) RETURNS SETOF ltree AS $$ > DECLARE > tree record; > BEGIN > FOR tree IN SELECT ltree FROM event WHERE ltree ~ $1 LOOP > RETURN NEXT tree; > END LOOP; > RETURN; > END > $$ LANGUAGE plpgsql; That should work too, except that you are trying to return a record not an ltree value. Try "RETURN NEXT tree.ltree". > Because SETOF won't work in a WHERE context. Possibly you need to read the error messages you are getting more closely, because I'm pretty sure whatever it said had nothing to do with either SETOF or WHERE ... regards, tom lane
On Sun, Mar 05, 2006 at 01:16:40PM -0500, Tom Lane wrote: > sramsay@uga.edu writes: > That would work fine if you said RETURNS SETOF ltree. > > That should work too, except that you are trying to return a record > not an ltree value. Try "RETURN NEXT tree.ltree". > > > Because SETOF won't work in a WHERE context. > > Possibly you need to read the error messages you are getting more > closely, because I'm pretty sure whatever it said had nothing to > do with either SETOF or WHERE ... I think it does, actually. I can write functions that return ltrees, records, or sets of ltree, and they'll work in any part of the query -- except the WHERE clause. If the function returns anything other than a bool, it complains that the function must return a bool. Which makes sense to me, actually, because the "result" of something like: x = 42 and y = 77 (to quote your earlier example) should be true or false -- not a set of rows or records or types. At least I think. But let's return to your example for a moment, because it really does look like the kind of "rewrite" rule that I want here. You suggested replacing: SELECT * from some_table WHERE x = 42 AND y = 77 with create function mytest(int,int) returns bool as $$select $1 = 42 AND $2 = 77$$ language sql; So you could then do: SELECT * from some_table WHERE mytest(x,y); But imagine instead that this function is more generic. You know that you're trying to get something that's equal to x and equal to y, but you don't know (until the function is called) what those rvalues should be. In other words, it's the 42 and the 47 that you don't know until runtime -- you always know what columns your searching on. When I try do something along the lines of what you're doing, I get: CREATE OR REPLACE FUNCTION xpath(lquery) RETURNS bool AS $$SELECT ltree ~ $1; $$ LANGUAGE SQL; But I can't even load that function, because it says: ERROR: column "ltree" does not exist And round and round I go . . . Thanks for the reply, Steve -- Stephen Ramsay Assistant Professor Department of English University of Georgia email: sramsay@uga.edu web: http://cantor.english.uga.edu/ PGP Public Key ID: 0xA38D7B11
On Sun, 5 Mar 2006 sramsay@uga.edu wrote: > On Sun, Mar 05, 2006 at 01:16:40PM -0500, Tom Lane wrote: > > sramsay@uga.edu writes: > > That would work fine if you said RETURNS SETOF ltree. > > > > That should work too, except that you are trying to return a record > > not an ltree value. Try "RETURN NEXT tree.ltree". > > > > > Because SETOF won't work in a WHERE context. > > > > Possibly you need to read the error messages you are getting more > > closely, because I'm pretty sure whatever it said had nothing to > > do with either SETOF or WHERE ... > > I think it does, actually. I can write functions that return > ltrees, records, or sets of ltree, and they'll work in any part of > the query -- except the WHERE clause. If the function returns > anything other than a bool, it complains that the function must > return a bool. > > Which makes sense to me, actually, because the "result" of something > like: > > x = 42 and y = 77 > > (to quote your earlier example) should be true or false -- not a set > of rows or records or types. At least I think. > > But let's return to your example for a moment, because it really > does look like the kind of "rewrite" rule that I want here. > > You suggested replacing: > > SELECT * from some_table WHERE x = 42 AND y = 77 > > with > > create function mytest(int,int) returns bool as > $$select $1 = 42 AND $2 = 77$$ language sql; > > So you could then do: > > SELECT * from some_table WHERE mytest(x,y); > > But imagine instead that this function is more generic. You know > that you're trying to get something that's equal to x and equal to > y, but you don't know (until the function is called) what those > rvalues should be. In other words, it's the 42 and the 47 that you > don't know until runtime -- you always know what columns your > searching on. Then you need to pass those in as well as something representing the row that's being tested -- where clauses are filters on rows. The whole row representation might be better than columns for some cases. For example: create table tt1(a int, b int); create function f1(tt1, int) returns bool as 'select $1.a = $2' language 'sql'; select * from tt1 where f1(tt1, 1); --- The other option is to do this as a set returning function in the first place rather than trying to do a wierd where clause thing. create function f2(int) returns setof tt1 as 'select * from tt1 where a = $1' language 'sql'; select * from f2(1);
On Sun, Mar 05, 2006 at 11:38:46AM -0800, Stephan Szabo wrote: > > But imagine instead that this function is more generic. You know > > that you're trying to get something that's equal to x and equal to > > y, but you don't know (until the function is called) what those > > rvalues should be. In other words, it's the 42 and the 47 that you > > don't know until runtime -- you always know what columns your > > searching on. > > Then you need to pass those in as well as something representing the row > that's being tested -- where clauses are filters on rows. The whole row > representation might be better than columns for some cases. > > For example: > > create table tt1(a int, b int); > create function f1(tt1, int) returns bool as 'select $1.a = $2' language > 'sql'; > > select * from tt1 where f1(tt1, 1); Eureka! That does it: CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS $$ SELECT $1.ltree ~ $2; $$ LANGUAGE SQL; And it's fast (the event table is a view)! Interestingly, this version is very slow: CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS $$ BEGIN RETURN $1.ltree ~ $2; END $$ LANGUAGE plpgsql STABLE; EXPLAIN shows lots of sequential scans when you try to do it this way. No inlining, perhaps? I guess I am still a little confused as to why you have to pass the table in as a parameter -- why you can't just do: RETURN event.ltree ~ $2; But I'll live . . . Thanks to one and all for the very generous assistance. I have learned much. Steve > The other option is to do this as a set returning function in the first > place rather than trying to do a wierd where clause thing. Mostly for syntactic clarity on the caller's end, but I understand what you're saying. -- Stephen Ramsay Assistant Professor Department of English University of Georgia email: sramsay@uga.edu web: http://cantor.english.uga.edu/ PGP Public Key ID: 0xA38D7B11
On Sun, 5 Mar 2006 sramsay@uga.edu wrote: > On Sun, Mar 05, 2006 at 11:38:46AM -0800, Stephan Szabo wrote: > > > But imagine instead that this function is more generic. You know > > > that you're trying to get something that's equal to x and equal to > > > y, but you don't know (until the function is called) what those > > > rvalues should be. In other words, it's the 42 and the 47 that you > > > don't know until runtime -- you always know what columns your > > > searching on. > > > > Then you need to pass those in as well as something representing the row > > that's being tested -- where clauses are filters on rows. The whole row > > representation might be better than columns for some cases. > > > > For example: > > > > create table tt1(a int, b int); > > create function f1(tt1, int) returns bool as 'select $1.a = $2' language > > 'sql'; > > > > select * from tt1 where f1(tt1, 1); > > Eureka! That does it: > > CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS $$ > SELECT $1.ltree ~ $2; > $$ LANGUAGE SQL; > > And it's fast (the event table is a view)! > > Interestingly, this version is very slow: > > CREATE OR REPLACE FUNCTION xpath(event, lquery) RETURNS bool AS > $$ > BEGIN > RETURN $1.ltree ~ $2; > END > $$ LANGUAGE plpgsql STABLE; > > EXPLAIN shows lots of sequential scans when you try to do it this > way. No inlining, perhaps? Yeah, I'd assume it's running the function once for each row after the construction of the event row (so it's probably doing any joins in the view first), whereas the SQL one might inline and be optimized differently which might shrink the number of rows that any joins are being done on. > I guess I am still a little confused as to why you have to pass the > table in as a parameter -- why you can't just do: > > RETURN event.ltree ~ $2; Well, the problem here is say you have a function f(int) that says RETURN tt1.a = $1 like my example. What does "select * from tt2 where f(1)" or "select * from tt1 a, tt1 b where f(1)" mean? Functions don't really have much context sensitivity, and it's not always reasonable to expect to be able to open the box to know things like f(x) in the above depends on there being a single tt1 entry in the from clause (and trying to deal with outer references would be an utter mess).
Tom Lane <tgl@sss.pgh.pa.us> writes: > A SQL-language function like this should get inlined into the query, > so that you don't lose any performance compared to writing out the > full expression each time. I think what's going on here is that he doesn't really want a function in the programming sense. A function takes arguments and returns a result based on those arguments. It would be trivial to make a function that returns true or false for his constraints but it would require passing in the columns as well as the values he's testing against. So it would be "xpath(ltreecolumn, 42, 47)". What he's looking for is a convenience macro that hides the columns he's testing against so he doesn't have to write the columns in every query. So he can just type "xpath(42,47)" without retyping "ltreecolumn". Afaik there's no functionality for this in Postgres. functions aren't given any context information when they're executed aside from the arguments passed. If I understand what you want then I think you're much better off just typing the name of the column you're testing against explicitly every time anyways. One day you'll need more flexibility or you'll have someone else reading the code and you'll be glad you're not hiding what's going on and hard coding column names inside some macro function anyways. -- greg
Hi, Steve, sramsay@uga.edu wrote: > SELECT * from some_table WHERE > test_for_equality_is_syntactically_ugly; > The WHERE clause expects the function to return a boolean value. I > can certainly return a boolean value from a function, but here it > seems to me that what the function really has to do is return a > set of boolean values -- the test in the WHERE clause sometimes > evaluates to true and sometimes evaluates to false, and that is in > turn used to constrain the query results. But you can't return a > set of anything (I don't think) in a WHERE clause, because it seems > to want a singular boolean value. And this is as it is intended. The equality test is applied row-by-row, and for each row, it is either true or false, but not undecided. So your query should look like SELECT * FROM some_table WHERE your_function(column_a, column_b); > Is it possible to do what I'm trying to do? I've written a few > simple sql and pl/pgsql functions over the years, but I'm no expert. Yes, it is. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org