Обсуждение: haversine formula with postgreSQL
Hi! I am looking at the PHP/MySQL Google Maps API store locator example here: http://code.google.com/apis/maps/articles/phpsqlsearch.html And I'm trying to get this to work with PostgreSQL instead of MySQL. I've (slightly) modified the haversine formula part of my PHP script but I keep getting this error: Invalid query: ERROR: column "distance" does not exist LINE 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance <... ^ I'm new to this, but it doesn't look like I need to create a column in my table for distance, or at least the directions didn't say to create a distance column. Here is my PHP with SQL: $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", pg_escape_string($center_lat), pg_escape_string($center_lng), pg_escape_string($center_lat), pg_escape_string($radius)); Does anyone have any ideas on how I can get this to work? I'm not sure what is wrong, since it doesn't seem like I need to create a distance column and when I do create one, I get this: Invalid query: ERROR: column "aaafacilities.latitude" must appear in the GROUP BY clause or be used in an aggregate function Thanks for any comments or suggestions. I appreciate it. I'm new to this. Jonathan Harahush
On Thu, Sep 17, 2009 at 1:16 PM, Jonathan <jharahush@gmail.com> wrote: > Hi! > > I am looking at the PHP/MySQL Google Maps API store locator example > here: > > http://code.google.com/apis/maps/articles/phpsqlsearch.html > > And I'm trying to get this to work with PostgreSQL instead of MySQL. > > I've (slightly) modified the haversine formula part of my PHP script > but I keep getting this error: > > Invalid query: ERROR: column "distance" does not exist LINE > 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance <... > ^ > > I'm new to this, but it doesn't look like I need to create a column in > my table for distance, or at least the directions didn't say to create > a distance column. > > Here is my PHP with SQL: > $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, > longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians > ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin > ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM > aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", > pg_escape_string($center_lat), > pg_escape_string($center_lng), > pg_escape_string($center_lat), > pg_escape_string($radius)); > > Does anyone have any ideas on how I can get this to work? I'm not > sure what is wrong, since it doesn't seem like I need to create a > distance column and when I do create one, I get this: Is that really the whole query? Why a having with no group by? Can you do me a favor and print out $query instead of the php stuff? It might help you as well to troubleshoot to see the real query.
Jonathan <jharahush@gmail.com> writes: > Here is my PHP with SQL: > $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, > longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians > ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin > ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM > aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", Sigh, you've been misled by MySQL's nonstandard behavior. You cannot refer to output columns of a query in its HAVING clause; it's disallowed per spec and not logically sensible either. The simplest way to deal with it is just to repeat the expression in HAVING. If you really really don't want to write it twice, you can use a subquery. regards, tom lane
This behavior kinda gets me sometimes too, especially in WHERE clauses.. I'm a bit curious as to why this is so bad. I could see why it would be expensive to do, since your clause wouldn't be indexed - but why is the syntax itself not allowed? Repeating the clause isn't gonna gain you any speed, and might make it even slower since the expression would have to be evaluated twice right? Perhaps I'm missing something.. Mike On Thu, Sep 17, 2009 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jonathan <jharahush@gmail.com> writes: >> Here is my PHP with SQL: >> $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, >> longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians >> ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin >> ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM >> aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", > > Sigh, you've been misled by MySQL's nonstandard behavior. You cannot > refer to output columns of a query in its HAVING clause; it's disallowed > per spec and not logically sensible either. The simplest way to deal > with it is just to repeat the expression in HAVING. If you really > really don't want to write it twice, you can use a subquery. > > regards, tom lane > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
It's the whole query as far as I can tell. The app takes input from the user --- the user enters an address and chooses a radius ("show me all facilities within 5 miles of this address") and then the latitude and longitude of the address and the radius is passed into the query so that the database can grab all locations within a certain radius. The example provided by Google is using MySQL. The query for MySQL looks like this: SELECT address, name, lat, lng, ( 3959 * acos( cos( radians('%s') ) * cos( radians( lat ) ) * cos( radians( lng ) - radians('%s') ) + sin ( radians('%s') ) * sin( radians( lat ) ) ) ) AS distance FROM markers HAVING distance < '%s' ORDER BY distance LIMIT 0 , 20 And I'm attempting to change it to work with Postgres and have done this: SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, longitude, ( 3959 * acos ( cos( radians('%s') ) * cos( radians( latitude ) ) * cos( radians ( longitude ) - radians('%s') ) + sin( radians('%s') ) * sin( radians ( latitude ) ) ) ) AS distance FROM aaafacilities HAVING distance < '%s' ORDER BY distance LIMIT 0 OFFSET 20 aaafacilities is my table name in my posgres database. I'm sorry if this isn't enough info.. like I said, I'm new to this but definitely interested in learning and figuring this out! From what I can tell, the database is supposed to calculate and then output the distance of each "match" but it seems like in the MySQL example, it can do this without having an actual distance column in the database. Thanks again!
Mike Christensen <mike@kitchenpc.com> writes: > This behavior kinda gets me sometimes too, especially in WHERE clauses.. > I'm a bit curious as to why this is so bad. I could see why it would > be expensive to do, since your clause wouldn't be indexed - but why is > the syntax itself not allowed? It's not logically sensible: per the implicit execution model defined by the spec, the output list is not computed until after WHERE/HAVING/etc are evaluated, so it makes no sense to refer to output expressions in those clauses. As an example of why the execution ordering is important, you'd be pretty unhappy if this threw a division-by-zero error: select 1/avg(x) from tab group by y having avg(x) <> 0; While we could interpret such a reference as meaning to copy the output expression into the other clause, it doesn't seem like a particularly good idea to encourage confusion about what the syntax means. Also, allowing references to output column names here actually creates an ambiguity: for instance "distance" could also be a column name available from some table in the FROM clause. So it's not exactly cost-free to allow this; it will likely result in queries being silently interpreted in some way other than what the author expected. BTW, there are two cases where Postgres *does* allow such references: 1. In ORDER BY. This is mainly because the SQL spec used to require it. It's actually logically consistent because ORDER BY is notionally executed after forming the output expressions, but it's still confusing. The spec authors thought better of this idea and removed it in SQL99, but we're still stuck supporting it for backwards compatibility reasons. 2. In GROUP BY. This is, frankly, a mistake, and one I wish we could have a do-over on. Again we're stuck with it for compatibility reasons, but we're not likely to extend the mistake to other clauses. In both these cases, to reduce the scope for ambiguity problems we only allow references to output columns as simple ORDER or GROUP list items (for instance "ORDER BY distance" but not "ORDER BY abs(distance)"). This is all right because it's still frequently useful, but you'd seldom write a WHERE or HAVING clause that consisted *only* of an output-column name. So even if we did extend the behavior it wouldn't help much, unless we were to fling the doors wide open for ambiguity problems. regards, tom lane
A bit out in left field, Writing your own haversine in Postgres seems a bit like reinventing a wooden wheel when you gan get a free pneumatic one... Any reason not to just install PostGIS & fully support geometries & projections in Postgres? You can build the geometries provided to the functions on the fly from lat/lon coordinates stored as numerics in your SQL,so your DB structures don't even have to change if you don't want them to.. http://www.postgis.org/documentation/manual-1.4/ST_Distance_Sphere.html http://www.postgis.org/documentation/manual-1.4/ST_Distance_Spheroid.html HTH Brent Wood Brent Wood DBA/GIS consultant NIWA, Wellington New Zealand >>> Scott Marlowe <scott.marlowe@gmail.com> 09/18/09 11:35 AM >>> On Thu, Sep 17, 2009 at 1:16 PM, Jonathan <jharahush@gmail.com> wrote: > Hi! > > I am looking at the PHP/MySQL Google Maps API store locator example > here: > > http://code.google.com/apis/maps/articles/phpsqlsearch.html > > And I'm trying to get this to work with PostgreSQL instead of MySQL. > > I've (slightly) modified the haversine formula part of my PHP script > but I keep getting this error: > > Invalid query: ERROR: column "distance" does not exist LINE > 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance <... > ^ > > I'm new to this, but it doesn't look like I need to create a column in > my table for distance, or at least the directions didn't say to create > a distance column. > > Here is my PHP with SQL: > $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, > longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians > ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin > ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM > aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", > pg_escape_string($center_lat), > pg_escape_string($center_lng), > pg_escape_string($center_lat), > pg_escape_string($radius)); > > Does anyone have any ideas on how I can get this to work? I'm not > sure what is wrong, since it doesn't seem like I need to create a > distance column and when I do create one, I get this: Is that really the whole query? Why a having with no group by? Can you do me a favor and print out $query instead of the php stuff? It might help you as well to troubleshoot to see the real query. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.
I do have PostGIS installed and I use it for other things (geoserver), but I'm not knowledgeable enough about it to the point where I understand how to get it to work with the Google Maps API. I'll look into it. In the meantime, I was hoping to create something based off of the GMaps/PHP/MySQL example I referenced in an earlier post since I'm still learning.
The reason why I'm using Postgres is because it's installed at work. We don't use MySQL.
Thanks for all of the help so far! I appreciate it.
On Thu, Sep 17, 2009 at 5:50 PM, Brent Wood <b.wood@niwa.co.nz> wrote:
A bit out in left field,
Writing your own haversine in Postgres seems a bit like reinventing a wooden wheel when you gan get a free pneumatic one...
Any reason not to just install PostGIS & fully support geometries & projections in Postgres?
You can build the geometries provided to the functions on the fly from lat/lon coordinates stored as numerics in your SQL, so your DB structures don't even have to change if you don't want them to..
http://www.postgis.org/documentation/manual-1.4/ST_Distance_Sphere.html
http://www.postgis.org/documentation/manual-1.4/ST_Distance_Spheroid.html
HTH
Brent Wood
Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
>>> Scott Marlowe <scott.marlowe@gmail.com> 09/18/09 11:35 AM >>>
On Thu, Sep 17, 2009 at 1:16 PM, Jonathan <jharahush@gmail.com> wrote:
> Hi!
>
> I am looking at the PHP/MySQL Google Maps API store locator example
> here:
>
> http://code.google.com/apis/maps/articles/phpsqlsearch.html
>
> And I'm trying to get this to work with PostgreSQL instead of MySQL.
>
> I've (slightly) modified the haversine formula part of my PHP script
> but I keep getting this error:
>
> Invalid query: ERROR: column "distance" does not exist LINE
> 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance <...
> ^
>
> I'm new to this, but it doesn't look like I need to create a column in
> my table for distance, or at least the directions didn't say to create
> a distance column.>> pg_escape_string($center_lat),
> Here is my PHP with SQL:
> $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude,
> longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians
> ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin
> ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM
> aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20",
> pg_escape_string($center_lng),
> pg_escape_string($center_lat),
> pg_escape_string($radius));
>
> Does anyone have any ideas on how I can get this to work? I'm not
> sure what is wrong, since it doesn't seem like I need to create a
> distance column and when I do create one, I get this:
Is that really the whole query? Why a having with no group by?
Can you do me a favor and print out $query instead of the php stuff?
It might help you as well to troubleshoot to see the real query.
--Sent via pgsql-general mailing list (pgsql-general@postgresql.org)NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, Sep 17, 2009 at 8:37 PM, Jonathan Harahush <jharahush@gmail.com> wrote: > I do have PostGIS installed and I use it for other things (geoserver), but > I'm not knowledgeable enough about it to the point where I understand how to > get it to work with the Google Maps API. I'll look into it. In the > meantime, I was hoping to create something based off of the GMaps/PHP/MySQL > example I referenced in an earlier post since I'm still learning. > The reason why I'm using Postgres is because it's installed at work. We > don't use MySQL. > > Thanks for all of the help so far! I appreciate it. The good news is that while postgresql is more persnickity about SQL standards and it feels like you're dealing with a harsh task master, the lessons learned are good ones. They'll translate to other database engines, and in the future when mysql grows more standards compliant too.
On Thu, 17 Sep 2009, Jonathan wrote: > Invalid query: ERROR: column "distance" does not exist LINE > 1: ...ude ) ) ) ) AS distance FROM aaafacilities HAVING distance <... > ^ You can't use distance in a HAVING clause if distance is computed as part of the query result. You can rewrite this to use a subquery instead: SELECT * FROM (SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM aaafacilities) HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20 And I think that might work for you. The performance is going to be miserable for large stat sets, because it's going to scan the whole aaafacilities table every time and recompute every distance, but as an example goes it's probably acceptable. Be warned that this example is like a textbook example of how to introduce a SQL injection vulnerability into your code. Once you get the syntax right, you should be using a parameterized query here rather than generting the query using sprintf before exposing this code to the outside world. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
> And I think that might work for you. The performance is going to be > miserable for large stat sets, because it's going to scan the whole > aaafacilities table every time and recompute every distance, but as an > example goes it's probably acceptable. Something I did when implementing haversine in Oracle was first calculate a bounding box for lat/lon and only calculate the distance for cities w/in the bounding box. Scott
On Fri, Sep 18, 2009 at 5:50 PM, Scott Bailey <artacus@comcast.net> wrote: >> And I think that might work for you. The performance is going to be >> miserable for large stat sets, because it's going to scan the whole >> aaafacilities table every time and recompute every distance, but as an >> example goes it's probably acceptable. > > Something I did when implementing haversine in Oracle was first calculate a > bounding box for lat/lon and only calculate the distance for cities w/in the > bounding box. This is easily done in postgresql with gist operator on box type. In some cases you can actually optimize it even further...you can make a 'inner' bounding box that captures points that do not need the expensive distance function. So you use gist to pull points that 'may' be inside the distance, box overlap the points that 'must' be inside the distance, and distance function the rest. Also I highly suggest moving the distance calculation inside an immutable function. merlin
On Thu, Sep 17, 2009 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jonathan <jharahush@gmail.com> writes:Sigh, you've been misled by MySQL's nonstandard behavior. You cannot
> Here is my PHP with SQL:
> $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude,
> longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians
> ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin
> ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM
> aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20",
refer to output columns of a query in its HAVING clause; it's disallowed
per spec and not logically sensible either. The simplest way to deal
with it is just to repeat the expression in HAVING. If you really
really don't want to write it twice, you can use a subquery.
regards, tom lane
This practice is also a bad habit for MySQL users. I regularly see queries from users that have conditions that logically belong in the WHERE clause but the user shoves it into the HAVING. This is often done without a specific GROUP BY. The MySQL optimizer does not deal with this well.
When would it make logical sense to have a HAVING clause that deals with a column that is not inside a aggregating function?
--
Rob Wultsch
wultsch@gmail.com
Rob Wultsch <wultsch@gmail.com> writes: > On Thu, Sep 17, 2009 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Jonathan <jharahush@gmail.com> writes: >>> Here is my PHP with SQL: >>> $query = sprintf("SELECT 'ID', 'FACILITY', 'ADDRESS', latitude, >>> longitude, ( 3959 * acos( cos( radians('%s') ) * cos( radians >>> ( latitude ) ) * cos( radians( longitude ) - radians('%s') ) + sin >>> ( radians('%s') ) * sin( radians( latitude ) ) ) ) AS distance FROM >>> aaafacilities HAVING distance < '%s' ORDER BY dist LIMIT 0 OFFSET 20", >> >> Sigh, you've been misled by MySQL's nonstandard behavior. You cannot >> refer to output columns of a query in its HAVING clause; it's disallowed >> per spec and not logically sensible either. The simplest way to deal >> with it is just to repeat the expression in HAVING. If you really > This practice is also a bad habit for MySQL users. I regularly see queries > from users that have conditions that logically belong in the WHERE clause > but the user shoves it into the HAVING. This is often done without a > specific GROUP BY. The MySQL optimizer does not deal with this well. Actually, that reminds me that there's an even bigger nonstandard behavior here. HAVING is *not* just some weird alternative spelling of WHERE; it implies that the query is grouped. If you don't have a GROUP BY clause then the implication is that the query should yield just one row, and what's more it can't refer to any variables at all except within aggregate functions, since there are no grouped variables. MySQL gets this flat wrong. (In fairness, we used to get it wrong too, until we read the spec more closely.) > When would it make logical sense to have a HAVING clause that deals with a > column that is not inside a aggregating function? Well, HAVING is supposed to apply to the post-GROUP BY rows. So 99% of the time you would want the HAVING condition to involve an aggregate function --- otherwise you ought to put it in WHERE and filter away the unwanted rows sooner. However I can imagine having a very-expensive-to-execute test that you'd rather execute only once per group. regards, tom lane