Обсуждение: Can function results be used in WHERE?
I have a function, the results of which seem to apply to ORDER BY and HAVING, but not to WHERE. Is this expected? -- Return distance in some mystery units (TODO: convert to miles or kilometers) CREATE FUNCTION calculate_distance(double precision, double precision, double precision, double precision) RETURNS double precision AS ' BEGIN RETURN (3963 * acos( sin($1/57.2958) * sin($3/57.2958) + cos($1/57.2958) * cos($3/57.2958) * cos($4/57.2958 - $2/57.2958) )); END; ' LANGUAGE plpgsql; demo=# select pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from eg_pod where 4 > 5::double precision order by 4 limit 10;pod_code | lat | lon | calculate_distance ----------+-----------+-------------+-------------------- 44 | 0 | 0 | 0 45| 0 | 0 | 0 69 | 37.789629 | -122.422082 | 0 51 | 37.788166| -122.421488 | 0.106273303754946 71 | 37.794228 | -122.421382 | 0.320393524437476 73 | 37.787878 |-122.411644 | 0.583267064983836 37 | 37.791736 | -122.411604 | 0.590977027054446 46 | 37.784929 | -122.412782| 0.603416307249032 50 | 37.780329 | -122.418482 | 0.672685350683496 30 | 37.780419 | -122.417764| 0.679355355047995 (10 rows) sdemo=# select pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from eg_pod having calculate_distance(lat,lon,37.789629,-122.422082) > 5 order by 4;pod_code | lat | lon | calculate_distance ----------+-----------+-------------+-------------------- 21 | 37.710581 | -122.468864 | 6.03655070159813 77| 37.805427 | -122.29528 | 7.01595024232628 29 | 37.802684 | -122.275976 | 8.0364304687727 12 | 37.806133| -122.273827 | 8.18282157050301 23 | 37.797327 | -122.26598 | 8.54878571904839 57 | 37.829592 |-122.266347 | 8.94791199923289 35 | 37.809327 | -122.25448 | 9.26077996779577 47 | 37.851957 | -122.270376| 9.34292370436932 demo=# select version(); version -----------------------------------------------------------------------------------------------------------PostgreSQL 7.4.12on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-20) (1 row) -- ---- Visit http://www.obviously.com/
On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
First I recommend making your function IMMUTABLE since, given the same arguments, it gives the same result - this will allow PostgreSQL to optimize the function call and cache the results. Then, don't use "4", use "calculate_distance(lat,lon, 37.789629,-122.422082)". That use is very ambiguous and subject to breaking if you change the columns in your select. It may also be the reason you have a problem though I don't use that syntax so cannot be sure.
The only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after. Since you have no GROUP BY there should be no difference in the queries. The only other difference is the "4 > 5::double precision" so that is where I would start.
-Aaron
I have a function, the results of which seem to apply to ORDER BY and
HAVING, but not to WHERE. Is this expected?
-- Return distance in some mystery units (TODO: convert to miles or
kilometers)
CREATE FUNCTION calculate_distance(double precision, double precision,
double precision, double precision) RETURNS double precision
AS '
BEGIN
RETURN (3963 * acos( sin($1/57.2958) * sin($3/57.2958) +
cos($1/57.2958) * cos($3/57.2958) * cos($4/57.2958 - $2/57.2958) ));
END;
'
LANGUAGE plpgsql;
demo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod where 4 > 5::double precision order by 4 limit 10;
pod_code | lat | lon | calculate_distance
----------+-----------+-------------+--------------------
44 | 0 | 0 | 0
45 | 0 | 0 | 0
69 | 37.789629 | -122.422082 | 0
51 | 37.788166 | -122.421488 | 0.106273303754946
71 | 37.794228 | -122.421382 | 0.320393524437476
73 | 37.787878 | -122.411644 | 0.583267064983836
37 | 37.791736 | -122.411604 | 0.590977027054446
46 | 37.784929 | -122.412782 | 0.603416307249032
50 | 37.780329 | -122.418482 | 0.672685350683496
30 | 37.780419 | -122.417764 | 0.679355355047995
(10 rows)
sdemo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod having calculate_distance(lat,lon,37.789629,-122.422082) > 5
order by 4;
pod_code | lat | lon | calculate_distance
----------+-----------+-------------+--------------------
21 | 37.710581 | -122.468864 | 6.03655070159813
77 | 37.805427 | -122.29528 | 7.01595024232628
29 | 37.802684 | -122.275976 | 8.0364304687727
12 | 37.806133 | -122.273827 | 8.18282157050301
23 | 37.797327 | -122.26598 | 8.54878571904839
57 | 37.829592 | -122.266347 | 8.94791199923289
35 | 37.809327 | - 122.25448 | 9.26077996779577
47 | 37.851957 | -122.270376 | 9.34292370436932
demo=# select version();
version
-----------------------------------------------------------------------------------------------------------
PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
20030502 (Red Hat Linux 3.2.3-20)
(1 row)
First I recommend making your function IMMUTABLE since, given the same arguments, it gives the same result - this will allow PostgreSQL to optimize the function call and cache the results. Then, don't use "4", use "calculate_distance(lat,lon, 37.789629,-122.422082)". That use is very ambiguous and subject to breaking if you change the columns in your select. It may also be the reason you have a problem though I don't use that syntax so cannot be sure.
The only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after. Since you have no GROUP BY there should be no difference in the queries. The only other difference is the "4 > 5::double precision" so that is where I would start.
-Aaron
Aaron Bono wrote: > First I recommend making your function IMMUTABLE since, given the same > arguments, it gives the same result - this will allow PostgreSQL to > optimize the function call and cache the results. Will do! > Then, don't use "4", use "calculate_distance(lat,lon, > 37.789629,-122.422082)". That use is very ambiguous and subject to > breaking if you change the columns in your select. It may also be the > reason you have a problem though I don't use that syntax so cannot be > sure. I think it is ugly also, but no other syntax seems to work: stage=# select pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as dist from eg_pod where dist < 1 order by dist desc limit 10; ERROR: column "dist" does not exist stage=# select pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as dist from eg_pod where 4 < 1 order by dist desc limit 10; pod_code | lat | lon | dist ----------+-----------+-------------+------------------ 20 | 1 | 1 | 7962.56837300854 36 | 37.39424| -122.077673 | 33.2296275931303 45 | 37.426929 | -122.161922 | 28.8542985664155 44 | 37.422813 | -122.172403| 28.8253772580912 22 | 37.444638 | -122.156875 | 27.9378660315883 34 | 37.875915 | -122.257427 | 10.7947710258918 81 | 37.903325 | -122.29963 | 10.323500058406 33 | 37.868001 | -122.261818 | 10.2977353566856 17 | 37.873002 | -122.26968 | 10.1277713471574 14 | 37.869574 | -122.267937 | 10.0742861708283 (10 rows) > The only difference between HAVING and WHERE is that WHERE occurs > before a GROUP BY and HAVING occurs after. Since you have no GROUP BY > there should be no difference in the queries. The only other > difference is the "4 > 5::double precision" so that is where I would > start. WHERE does nothing in my example. HAVING filters the results according to distance. So there's got to be more to it. -- ---- Visit http://www.obviously.com/
On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
SELECT
pod_code,
lat,
lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist
FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10;
I think it is ugly also, but no other syntax seems to work:
stage=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
dist from eg_pod where dist < 1 order by dist desc limit 10;
ERROR: column "dist" does not exist
SELECT
pod_code,
lat,
lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist
FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10;
Aaron Bono wrote: > > > On 7/10/06, *Bryce Nesbitt* <bryce1@obviously.com > <mailto:bryce1@obviously.com>> wrote: > > > I think it is ugly also, but no other syntax seems to work: > > stage=# select > pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as > dist from eg_pod where dist < 1 order by dist desc limit 10; > ERROR: column "dist" does not exist > > > SELECT > pod_code, > lat, > lon, > calculate_distance(lat,lon,37.789629,-122.422082) as dist > FROM eg_pod > WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1 > ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10; Yep, that works. I guess with IMMUTABLE it's even effecient. But I have to pass 6 arguments, not 2. Is there a way to make it look cleaner? I had expected using the column label (e.g. "dist") to work with WHERE, just as it does with ORDER BY. -Bryce -- ---- Visit http://www.obviously.com/
On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
You can also try
SELECT
SELECT
Aaron Bono wrote:
>
>
> On 7/10/06, *Bryce Nesbitt* <bryce1@obviously.com
> <mailto:bryce1@obviously.com>> wrote:
>
>
> I think it is ugly also, but no other syntax seems to work:
>
> stage=# select
> pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
> dist from eg_pod where dist < 1 order by dist desc limit 10;
> ERROR: column "dist" does not exist
>
>
> SELECT
> pod_code,
> lat,
> lon,
> calculate_distance(lat,lon,37.789629,-122.422082) as dist
> FROM eg_pod
> WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
> ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10;
Yep, that works. I guess with IMMUTABLE it's even effecient.
But I have to pass 6 arguments, not 2. Is there a way to make it look
cleaner?
I had expected using the column label (e.g. "dist") to work with WHERE,
just as it does with ORDER BY.
You can also try
SELECT
pod_code,
lat,
lon,
dist
FROM (lat,
lon,
dist
SELECT
pod_code,
lat,
lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist
lat,
lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist
FROM eg_pod
) eg_prod
WHERE dist < 1
ORDER BY dist desc limit 10;
If the 37.789629 and -122.422082 are static values you can create a view for it. Otherwise you can create a function or stored procedure that takes 2 arguments and returns the results of the subquery.
Just some options. Not sure which you would prefer. I am sure there are more ways to do it.
-Aaron Bono
) eg_prod
WHERE dist < 1
ORDER BY dist desc limit 10;
If the 37.789629 and -122.422082 are static values you can create a view for it. Otherwise you can create a function or stored procedure that takes 2 arguments and returns the results of the subquery.
Just some options. Not sure which you would prefer. I am sure there are more ways to do it.
-Aaron Bono
Bryce Nesbitt <bryce1@obviously.com> writes: > stage=# select > pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as > dist from eg_pod where 4 < 1 order by dist desc limit 10; > [ allegedly returns 10 rows ] I'm having a real hard time believing any of this: "WHERE 4 < 1" is a constant FALSE condition and cannot possibly return any rows ... unless basic integer arithmetic is broken on your platform? 4 is not less than 1. I think you're showing us a heavily edited version of your query rather than what you actually typed. But as far as the underlying misconception goes, you seem to think that "4" in the WHERE clause might somehow be taken as referring to the fourth SELECT result column (why you don't think that the "1" would likewise refer to the first result column isn't clear). This is not so. "4" means the numeric value four. There is a special case in ORDER BY and GROUP BY that an argument consisting of a simple integer literal constant will be taken as a reference to an output column. This is an ugly kluge IMHO, but it's somewhat defensible on the grounds that neither ordering nor grouping by a simple constant has any possible real-world use; so the special case doesn't break anything of interest. This would certainly not be so if we were to randomly replace integer constants in general WHERE conditions with non-constant values. regards, tom lane
On 7/10/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I agree whole heartedly with Tom, using the number in the ORDER BY is ugly and not recommended. Using column names is much easier to read and is much more maintainable by team members. I have to admit the 4 < 1 did confuse me at first.
-Aaron Bono
But as far as the underlying misconception goes, you seem to think that
"4" in the WHERE clause might somehow be taken as referring to the
fourth SELECT result column (why you don't think that the "1" would
likewise refer to the first result column isn't clear). This is not so.
"4" means the numeric value four. There is a special case in ORDER BY
and GROUP BY that an argument consisting of a simple integer literal
constant will be taken as a reference to an output column. This is an
ugly kluge IMHO, but it's somewhat defensible on the grounds that
neither ordering nor grouping by a simple constant has any possible
real-world use; so the special case doesn't break anything of interest.
This would certainly not be so if we were to randomly replace integer
constants in general WHERE conditions with non-constant values.
I agree whole heartedly with Tom, using the number in the ORDER BY is ugly and not recommended. Using column names is much easier to read and is much more maintainable by team members. I have to admit the 4 < 1 did confuse me at first.
-Aaron Bono
Tom Lane wrote: > But as far as the underlying misconception goes, you seem to think that > "4" in the WHERE clause might somehow be taken as referring to the > fourth SELECT result column (why you don't think that the "1" would > likewise refer to the first result column isn't clear). This is not so. > "4" means the numeric value four. There is a special case in ORDER BY > and GROUP BY that an argument consisting of a simple integer literal > constant will be taken as a reference to an output column. This is an > ugly kluge IMHO... Yeah, it was a longshot. I only tried it because the column label did NOT work, and I had some gut reaction to repeating the same function twice: stage=# SELECT pod_code, lat, lon, calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod WHERE dist < 1 ORDER BY dist desc limit 10; ERROR: column "dist" does not exist stage=# SELECT pod_code, lat, lon, calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1 ORDER BY dist desc limit 5;pod_code | lat | lon | dist ----------+-----------+-------------+------------------- 5 | 37.792022 | -122.404247 | 0.988808031847045 62 |37.780166 | -122.409615 | 0.944907273102541 4 | 37.798528 | -122.409582 | 0.919592583879426 86 | 37.777529 |-122.417982 | 0.866416010967029 68 | 37.789915 | -122.406926 | 0.82867104307647 (5 rows) stage=# select * from version(); version -----------------------------------------------------------------------------------------------------------PostgreSQL 7.4.12on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-20) (1 row) The asymetry between HAVING/WHILE and ORDER BY seems odd. Is there more to that story? -- ---- Visit http://www.obviously.com/
On 7/11/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
As I mentioned before: the only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after.
You cannot use an alias from the select column list in your WHERE clause because the where is the criteria done BEFORE your columns are pulled out. This is especially noticable when doing a GROUP BY since the WHERE is done before the GROUP BY and the returned column values are gathered after the GROUP BY. If you want to use an alias, do a subquery and then put your where in the outer query.
I believe the ORDER BY is done last but that may be dependent on the database implementation. It does make sense to think of ORDER BY to be done last though. For that reason it can use the alias.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
==================================================================
Tom Lane wrote:
> But as far as the underlying misconception goes, you seem to think that
> "4" in the WHERE clause might somehow be taken as referring to the
> fourth SELECT result column (why you don't think that the "1" would
> likewise refer to the first result column isn't clear). This is not so.
> "4" means the numeric value four. There is a special case in ORDER BY
> and GROUP BY that an argument consisting of a simple integer literal
> constant will be taken as a reference to an output column. This is an
> ugly kluge IMHO...
Yeah, it was a longshot. I only tried it because the column label did
NOT work, and I had some gut reaction to repeating the same function twice:
As I mentioned before: the only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after.
stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE dist < 1 ORDER BY dist desc limit 10;
ERROR: column "dist" does not exist
You cannot use an alias from the select column list in your WHERE clause because the where is the criteria done BEFORE your columns are pulled out. This is especially noticable when doing a GROUP BY since the WHERE is done before the GROUP BY and the returned column values are gathered after the GROUP BY. If you want to use an alias, do a subquery and then put your where in the outer query.
I believe the ORDER BY is done last but that may be dependent on the database implementation. It does make sense to think of ORDER BY to be done last though. For that reason it can use the alias.
stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1 ORDER BY
dist desc limit 5;
pod_code | lat | lon | dist
----------+-----------+-------------+-------------------
5 | 37.792022 | -122.404247 | 0.988808031847045
62 | 37.780166 | -122.409615 | 0.944907273102541
4 | 37.798528 | -122.409582 | 0.919592583879426
86 | 37.777529 | -122.417982 | 0.866416010967029
68 | 37.789915 | -122.406926 | 0.82867104307647
(5 rows)
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
==================================================================