Обсуждение: Can function results be used in WHERE?

Поиск
Список
Период
Сортировка

Can function results be used in WHERE?

От
Bryce Nesbitt
Дата:
I have a function, the results of which seem to apply to ORDER BY and
HAVING, but not to WHERE.  Is this expected?



-- Return distance in some mystery units (TODO: convert to miles or
kilometers)
CREATE FUNCTION calculate_distance(double precision, double precision,
double precision, double precision) RETURNS double precision   AS '
BEGIN   RETURN (3963 * acos( sin($1/57.2958) * sin($3/57.2958) +
cos($1/57.2958) * cos($3/57.2958) * cos($4/57.2958 - $2/57.2958) ));
END;
'   LANGUAGE plpgsql;



demo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod where 4 > 5::double precision order by 4 limit 10;pod_code |    lat    |     lon     | calculate_distance
----------+-----------+-------------+--------------------      44 |         0 |           0 |                  0
45|         0 |           0 |                  0      69 | 37.789629 | -122.422082 |                  0      51 |
37.788166| -122.421488 |  0.106273303754946      71 | 37.794228 | -122.421382 |  0.320393524437476      73 | 37.787878
|-122.411644 |  0.583267064983836      37 | 37.791736 | -122.411604 |  0.590977027054446      46 | 37.784929 |
-122.412782|  0.603416307249032      50 | 37.780329 | -122.418482 |  0.672685350683496      30 | 37.780419 |
-122.417764|  0.679355355047995
 
(10 rows)

sdemo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod having calculate_distance(lat,lon,37.789629,-122.422082) > 5
order by 4;pod_code |    lat    |     lon     | calculate_distance
----------+-----------+-------------+--------------------      21 | 37.710581 | -122.468864 |   6.03655070159813
77| 37.805427 |  -122.29528 |   7.01595024232628      29 | 37.802684 | -122.275976 |    8.0364304687727      12 |
37.806133| -122.273827 |   8.18282157050301      23 | 37.797327 |  -122.26598 |   8.54878571904839      57 | 37.829592
|-122.266347 |   8.94791199923289      35 | 37.809327 |  -122.25448 |   9.26077996779577      47 | 37.851957 |
-122.270376|   9.34292370436932
 



demo=# select version();                                                 version
-----------------------------------------------------------------------------------------------------------PostgreSQL
7.4.12on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
 
20030502 (Red Hat Linux 3.2.3-20)
(1 row)


-- 
----
Visit http://www.obviously.com/



Re: Can function results be used in WHERE?

От
"Aaron Bono"
Дата:
On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
I have a function, the results of which seem to apply to ORDER BY and
HAVING, but not to WHERE.  Is this expected?



-- Return distance in some mystery units (TODO: convert to miles or
kilometers)
CREATE FUNCTION calculate_distance(double precision, double precision,
double precision, double precision) RETURNS double precision
    AS '
BEGIN
    RETURN (3963 * acos( sin($1/57.2958) * sin($3/57.2958) +
cos($1/57.2958) * cos($3/57.2958) * cos($4/57.2958 - $2/57.2958) ));
END;
'
    LANGUAGE plpgsql;



demo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod where 4 > 5::double precision order by 4 limit 10;
pod_code |    lat    |     lon     | calculate_distance
----------+-----------+-------------+--------------------
       44 |         0 |           0 |                  0
       45 |         0 |           0 |                  0
       69 | 37.789629 | -122.422082 |                  0
       51 | 37.788166 | -122.421488 |  0.106273303754946
       71 | 37.794228 | -122.421382 |  0.320393524437476
       73 | 37.787878 | -122.411644 |  0.583267064983836
       37 | 37.791736 | -122.411604 |   0.590977027054446
       46 | 37.784929 | -122.412782 |  0.603416307249032
       50 | 37.780329 | -122.418482 |  0.672685350683496
       30 | 37.780419 | -122.417764 |  0.679355355047995
(10 rows)

sdemo=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) from
eg_pod having calculate_distance(lat,lon,37.789629,-122.422082) > 5
order by 4;
pod_code |    lat    |     lon     | calculate_distance
----------+-----------+-------------+--------------------
       21 | 37.710581 | -122.468864 |   6.03655070159813
       77 | 37.805427 |  -122.29528 |   7.01595024232628
       29 | 37.802684 | -122.275976 |     8.0364304687727
       12 | 37.806133 | -122.273827 |   8.18282157050301
       23 | 37.797327 |  -122.26598 |   8.54878571904839
       57 | 37.829592 | -122.266347 |   8.94791199923289
       35 | 37.809327 |  - 122.25448 |   9.26077996779577
       47 | 37.851957 | -122.270376 |   9.34292370436932



demo=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------
PostgreSQL 7.4.12 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
20030502 (Red Hat Linux 3.2.3-20)
(1 row)


First I recommend making your function IMMUTABLE since, given the same arguments, it gives the same result - this will allow PostgreSQL to optimize the function call and cache the results.  Then, don't use "4", use "calculate_distance(lat,lon, 37.789629,-122.422082)".  That use is very ambiguous and subject to breaking if you change the columns in your select.  It may also be the reason you have a problem though I don't use that syntax so cannot be sure.

The only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after.  Since you have no GROUP BY there should be no difference in the queries.  The only other difference is the "4 > 5::double precision" so that is where I would start.

-Aaron

Re: Can function results be used in WHERE?

От
Bryce Nesbitt
Дата:
Aaron Bono wrote:
> First I recommend making your function IMMUTABLE since, given the same
> arguments, it gives the same result - this will allow PostgreSQL to
> optimize the function call and cache the results. 
Will do!

> Then, don't use "4", use "calculate_distance(lat,lon,
> 37.789629,-122.422082)".  That use is very ambiguous and subject to
> breaking if you change the columns in your select.  It may also be the
> reason you have a problem though I don't use that syntax so cannot be
> sure.
I think it is ugly also, but no other syntax seems to work:

stage=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
dist from eg_pod where dist < 1 order by dist desc limit 10;
ERROR:  column "dist" does not exist

stage=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
dist from eg_pod where 4 < 1 order by dist desc limit 10;
pod_code |    lat    |     lon     |       dist
----------+-----------+-------------+------------------      20 |         1 |           1 | 7962.56837300854      36 |
37.39424| -122.077673 | 33.2296275931303      45 | 37.426929 | -122.161922 | 28.8542985664155      44 | 37.422813 |
-122.172403| 28.8253772580912      22 | 37.444638 | -122.156875 | 27.9378660315883      34 | 37.875915 | -122.257427 |
10.7947710258918     81 | 37.903325 |  -122.29963 |  10.323500058406      33 | 37.868001 | -122.261818 |
10.2977353566856     17 | 37.873002 |  -122.26968 | 10.1277713471574      14 | 37.869574 | -122.267937 |
10.0742861708283
(10 rows)


> The only difference between HAVING and WHERE is that WHERE occurs
> before a GROUP BY and HAVING occurs after.  Since you have no GROUP BY
> there should be no difference in the queries.  The only other
> difference is the "4 > 5::double precision" so that is where I would
> start.
WHERE does nothing in my example.
HAVING filters the results according to distance.
So there's got to be more to it.


-- 
----
Visit http://www.obviously.com/



Re: Can function results be used in WHERE?

От
"Aaron Bono"
Дата:


On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:

I think it is ugly also, but no other syntax seems to work:

stage=# select
pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
dist from eg_pod where dist < 1 order by dist desc limit 10;
ERROR:  column "dist" does not exist

SELECT
   pod_code,
   lat,
   lon,
   calculate_distance(lat,lon,37.789629,-122.422082) as dist
FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10; 

Re: Can function results be used in WHERE?

От
Bryce Nesbitt
Дата:
Aaron Bono wrote:
>
>
> On 7/10/06, *Bryce Nesbitt* <bryce1@obviously.com
> <mailto:bryce1@obviously.com>> wrote:
>
>
>     I think it is ugly also, but no other syntax seems to work:
>
>     stage=# select
>     pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
>     dist from eg_pod where dist < 1 order by dist desc limit 10;
>     ERROR:  column "dist" does not exist
>
>
> SELECT
>    pod_code,
>    lat,
>    lon,
>    calculate_distance(lat,lon,37.789629,-122.422082) as dist
> FROM eg_pod
> WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
> ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10;
Yep, that works.  I guess with IMMUTABLE it's even effecient.
But I have to pass 6 arguments, not 2.  Is there a way to make it look
cleaner?
I had expected using the column label (e.g. "dist") to work with WHERE,
just as it does with ORDER BY.
                              -Bryce


-- 
----
Visit http://www.obviously.com/



Re: Can function results be used in WHERE?

От
"Aaron Bono"
Дата:
On 7/10/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
Aaron Bono wrote:
>
>
> On 7/10/06, *Bryce Nesbitt* <bryce1@obviously.com
> <mailto:bryce1@obviously.com>> wrote:
>
>
>     I think it is ugly also, but no other syntax seems to work:
>
>     stage=# select
>     pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
>     dist from eg_pod where dist < 1 order by dist desc limit 10;
>     ERROR:  column "dist" does not exist
>
>
> SELECT
>    pod_code,
>    lat,
>    lon,
>    calculate_distance(lat,lon,37.789629,-122.422082) as dist
> FROM eg_pod
> WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1
> ORDER BY calculate_distance(lat,lon,37.789629,-122.422082) desc limit 10;
Yep, that works.  I guess with IMMUTABLE it's even effecient.
But I have to pass 6 arguments, not 2.  Is there a way to make it look
cleaner?
I had expected using the column label (e.g. "dist") to work with WHERE,
just as it does with ORDER BY.

You can also try

SELECT
   pod_code,
   lat,
   lon,
   dist
FROM (
   SELECT
      pod_code,
      lat,
      lon,
      calculate_distance(lat,lon,37.789629,-122.422082) as dist
   FROM eg_pod
) eg_prod
WHERE dist < 1
ORDER BY dist desc limit 10;

If the 37.789629 and -122.422082 are static values you can create a view for it.  Otherwise you can create a function or stored procedure that takes 2 arguments and returns the results of the subquery.

Just some options.  Not sure which you would prefer.  I am sure there are more ways to do it.

-Aaron Bono
 

Re: Can function results be used in WHERE?

От
Tom Lane
Дата:
Bryce Nesbitt <bryce1@obviously.com> writes:
> stage=# select
> pod_code,lat,lon,calculate_distance(lat,lon,37.789629,-122.422082) as
> dist from eg_pod where 4 < 1 order by dist desc limit 10;
> [ allegedly returns 10 rows ]

I'm having a real hard time believing any of this: "WHERE 4 < 1" is
a constant FALSE condition and cannot possibly return any rows ...
unless basic integer arithmetic is broken on your platform?  4 is
not less than 1.  I think you're showing us a heavily edited version
of your query rather than what you actually typed.

But as far as the underlying misconception goes, you seem to think that
"4" in the WHERE clause might somehow be taken as referring to the
fourth SELECT result column (why you don't think that the "1" would
likewise refer to the first result column isn't clear).  This is not so.
"4" means the numeric value four.  There is a special case in ORDER BY
and GROUP BY that an argument consisting of a simple integer literal
constant will be taken as a reference to an output column.  This is an
ugly kluge IMHO, but it's somewhat defensible on the grounds that
neither ordering nor grouping by a simple constant has any possible
real-world use; so the special case doesn't break anything of interest.
This would certainly not be so if we were to randomly replace integer
constants in general WHERE conditions with non-constant values.
        regards, tom lane


Re: Can function results be used in WHERE?

От
"Aaron Bono"
Дата:
On 7/10/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

But as far as the underlying misconception goes, you seem to think that
"4" in the WHERE clause might somehow be taken as referring to the
fourth SELECT result column (why you don't think that the "1" would
likewise refer to the first result column isn't clear).  This is not so.
"4" means the numeric value four.  There is a special case in ORDER BY
and GROUP BY that an argument consisting of a simple integer literal
constant will be taken as a reference to an output column.  This is an
ugly kluge IMHO, but it's somewhat defensible on the grounds that
neither ordering nor grouping by a simple constant has any possible
real-world use; so the special case doesn't break anything of interest.
This would certainly not be so if we were to randomly replace integer
constants in general WHERE conditions with non-constant values.

I agree whole heartedly with Tom, using the number in the ORDER BY is ugly and not recommended.  Using column names is much easier to read and is much more maintainable by team members.  I have to admit the 4 < 1 did confuse me at first.

-Aaron Bono

Re: Can function results be used in WHERE?

От
Bryce Nesbitt
Дата:
Tom Lane wrote:
> But as far as the underlying misconception goes, you seem to think that
> "4" in the WHERE clause might somehow be taken as referring to the
> fourth SELECT result column (why you don't think that the "1" would
> likewise refer to the first result column isn't clear).  This is not so.
> "4" means the numeric value four.  There is a special case in ORDER BY
> and GROUP BY that an argument consisting of a simple integer literal
> constant will be taken as a reference to an output column.  This is an
> ugly kluge IMHO...
Yeah, it was a longshot.  I only tried it because the column label did
NOT work, and I had some gut reaction to repeating the same function twice:


stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE dist < 1 ORDER BY dist desc limit 10;
ERROR:  column "dist" does not exist

stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1 ORDER BY
dist desc limit 5;pod_code |    lat    |     lon     |       dist
----------+-----------+-------------+-------------------       5 | 37.792022 | -122.404247 | 0.988808031847045      62
|37.780166 | -122.409615 | 0.944907273102541       4 | 37.798528 | -122.409582 | 0.919592583879426      86 | 37.777529
|-122.417982 | 0.866416010967029      68 | 37.789915 | -122.406926 |  0.82867104307647
 
(5 rows)
stage=# select * from
version();                                                   version
-----------------------------------------------------------------------------------------------------------PostgreSQL
7.4.12on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
 
20030502 (Red Hat Linux 3.2.3-20)
(1 row)


The asymetry between HAVING/WHILE and ORDER BY seems odd.  Is there more
to that story?


-- 
----
Visit http://www.obviously.com/



Re: Can function results be used in WHERE?

От
"Aaron Bono"
Дата:
On 7/11/06, Bryce Nesbitt <bryce1@obviously.com> wrote:
Tom Lane wrote:
> But as far as the underlying misconception goes, you seem to think that
> "4" in the WHERE clause might somehow be taken as referring to the
> fourth SELECT result column (why you don't think that the "1" would
> likewise refer to the first result column isn't clear).  This is not so.
> "4" means the numeric value four.  There is a special case in ORDER BY
> and GROUP BY that an argument consisting of a simple integer literal
> constant will be taken as a reference to an output column.  This is an
> ugly kluge IMHO...
Yeah, it was a longshot.  I only tried it because the column label did
NOT work, and I had some gut reaction to repeating the same function twice:

As I mentioned before: the only difference between HAVING and WHERE is that WHERE occurs before a GROUP BY and HAVING occurs after. 

stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE dist < 1 ORDER BY dist desc limit 10;
ERROR:  column "dist" does not exist

You cannot use an alias from the select column list in your WHERE clause because the where is the criteria done BEFORE your columns are pulled out.  This is especially noticable when doing a GROUP BY since the WHERE is done before the GROUP BY and the returned column values are gathered after the GROUP BY.  If you want to use an alias, do a subquery and then put your where in the outer query.

I believe the ORDER BY is done last but that may be dependent on the database implementation.  It does make sense to think of ORDER BY to be done last though.  For that reason it can use the alias.

stage=# SELECT pod_code, lat, lon,
calculate_distance(lat,lon,37.789629,-122.422082) as dist FROM eg_pod
WHERE calculate_distance(lat,lon, 37.789629,-122.422082) < 1 ORDER BY
dist desc limit 5;
pod_code |    lat    |     lon     |       dist
----------+-----------+-------------+-------------------
        5 | 37.792022 | -122.404247 | 0.988808031847045
       62 | 37.780166 | -122.409615 | 0.944907273102541
        4 | 37.798528 | -122.409582 | 0.919592583879426
       86 | 37.777529 | -122.417982 | 0.866416010967029
       68 | 37.789915 | -122.406926 |  0.82867104307647
(5 rows)

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================