Обсуждение: Advice on geolocation

Поиск
Список
Период
Сортировка

Advice on geolocation

От
Bruno Wolff III
Дата:
I use a static postgresql database to serve web pages containing information
about board game players and their ratings in various games. There are
about 7000 players listed and about 300 games. I don't expect this to grow
quickly in the short run. Most of the players are located in the US.

One of the planned improvements is to publish contact information and
to let people search for other people who live nearby and have common
interests in games. I have street addresses for most of these people,
though they currenty aren't in the database. I am also in the process
of using Daniel Egnor's geocoder project (for the Google contest)
working so that I can get Latitude and Longitude for most of these
addresses. I expect to be able to do this for something like 5000
people.

I am looking for hints on how to best use this information for radius
queries. I have tried looking at Gist and PostGIS documentation.
It looks like for Gist I would need to create a new datatype (since
I don't want to use a flat earth model making point unsuitable) and
figure out an efficient indexing scheme. I am pretty sure this is beyond my
capability to do (in the short run at least). It wasn't clear to me that
PostGIS would be fast for the small amount of data I have as their data
models looked pretty general and probably had a lot of overhead. Can people
recommend using either of the above approaches or any other open source
solution?

I also am not sure if using 2d lat+long or 3d rectangular is a better way
to go. 3d is simpler as there is no wrap around. If I use a spherical
model for the earth, then for any great circle distance I can compute
a secant distance and using that as a bounding box distance for searches.
This seems a lot easier than doing things with lat+long. If I do this
there may be tricks for setting up the coordinates so that indexes work
better. Since initially I will only have US data (and even later I expect
to have mostly US data) setting up coordinates so that they are used from
most discrimating to least discrinimating seems to be a good idea. I didn't
see much on this issue at the main parts of the PostGIS site, but I haven't
tried to read any of the referenced papers so far.

Re: Advice on geolocation

От
Chris Albertson
Дата:
I've done quite a bit of this as I used to work on a GIS
product.  Given to lat,lon pairs it is not hard to find
the great circle distance between them.  This assumes a
spherical Earth but good enough for your purposes as the
error will be under a few hundred meters

What you do is assume a plane defined by three points: your two
geolocations and the Earth's center. Next assume a circle
the same radius as the Earth (look this up) with center
at the Earth's center.  Your two geolocations are on the
edge of the circle.  Find the angle between the points then
divide that angle over 360 degrees and multiply by the
circumference of the circle (Earth).

So we have converted a spherical geometry problem (hard)
into a plain geometry problem (easier high school math)

The trick to computing the angle between the two locations
is to convert lat,long, at one Earth radius to (X,Y,Z)
then you have two vectors and can use length,length,length
to define a plain triangle.

This is an easy one because you can convert it basic plain
trig and you don't have to hunt down a specialist in
spherical trigonometry


--- Bruno Wolff III <bruno@wolff.to> wrote:
> I use a static postgresql database to serve web pages containing
> information
> about board game players and their ratings in various games. There
> are
> about 7000 players listed and about 300 games. I don't expect this to
> grow
> quickly in the short run. Most of the players are located in the US.
>
> One of the planned improvements is to publish contact information and
> to let people search for other people who live nearby and have common
> interests in games. I have street addresses for most of these people,
> though they currenty aren't in the database. I am also in the process
> of using Daniel Egnor's geocoder project (for the Google contest)
> working so that I can get Latitude and Longitude for most of these
> addresses. I expect to be able to do this for something like 5000
> people.
>
> I am looking for hints on how to best use this information for radius
> queries. I have tried looking at Gist and PostGIS documentation.
> It looks like for Gist I would need to create a new datatype (since
> I don't want to use a flat earth model making point unsuitable) and
> figure out an efficient indexing scheme. I am pretty sure this is
> beyond my
> capability to do (in the short run at least). It wasn't clear to me
> that
> PostGIS would be fast for the small amount of data I have as their
> data
> models looked pretty general and probably had a lot of overhead. Can
> people
> recommend using either of the above approaches or any other open
> source
> solution?
>
> I also am not sure if using 2d lat+long or 3d rectangular is a better
> way
> to go. 3d is simpler as there is no wrap around. If I use a spherical
> model for the earth, then for any great circle distance I can compute
> a secant distance and using that as a bounding box distance for
> searches.
> This seems a lot easier than doing things with lat+long. If I do this
> there may be tricks for setting up the coordinates so that indexes
> work
> better. Since initially I will only have US data (and even later I
> expect
> to have mostly US data) setting up coordinates so that they are used
> from
> most discrimating to least discrinimating seems to be a good idea. I
> didn't
> see much on this issue at the main parts of the PostGIS site, but I
> haven't
> tried to read any of the referenced papers so far.
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html


=====
Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

Re: Advice on geolocation

От
Bruno Wolff III
Дата:
On Fri, Jul 26, 2002 at 09:33:53 -0700,
  Chris Albertson <chrisalbertson90278@yahoo.com> wrote:
>
> I've done quite a bit of this as I used to work on a GIS
> product.  Given to lat,lon pairs it is not hard to find
> the great circle distance between them.  This assumes a
> spherical Earth but good enough for your purposes as the
> error will be under a few hundred meters

My concern about this is that it will not be indexable. I suspect, but don't
know, that this won't really be an issue for the small number (~5000) points
of data (especially since I expect other constraints to be used in most
queries).

Re: Advice on geolocation

От
"Roderick A. Anderson"
Дата:
On Fri, 26 Jul 2002, Bruno Wolff III wrote:

[snip]

> My concern about this is that it will not be indexable. I suspect, but don't
> know, that this won't really be an issue for the small number (~5000) points
> of data (especially since I expect other constraints to be used in most
> queries).

Bruno,  check out PostGIS a bit more.  I believe you'll find that a lot of
the fuctionality you're wanting is there or coming soon.  I think there is
even a distance function.


Rod
--
  "Open Source Software - Sometimes you get more than you paid for..."


Re: Advice on geolocation

От
Chris Albertson
Дата:
--- Bruno Wolff III <bruno@wolff.to> wrote:
> On Fri, Jul 26, 2002 at 09:33:53 -0700,
>   Chris Albertson <chrisalbertson90278@yahoo.com> wrote:
> >
> > I've done quite a bit of this as I used to work on a GIS
> > product.  Given to lat,lon pairs it is not hard to find
> > the great circle distance between them.  This assumes a
> > spherical Earth but good enough for your purposes as the
> > error will be under a few hundred meters
>
> My concern about this is that it will not be indexable. I suspect,
> but don't
> know, that this won't really be an issue for the small number (~5000)
> points
> of data (especially since I expect other constraints to be used in
> most
> queries).

The way I handle the non-indexable problem in an astronomical
database the has tens of millions of stars in it is to fist
make a crude querry.  I SELECT all stars in a "box" then apply
the great circle method only to those.  For example if you
wanted to find evey point within 500 miles of some point first
assume a flat Earth and assume each degree equals 60 miles.
This will give you to many hits but you only need to apply
the more exact method to a small subset of the data.
For many purposes (say within a couple hundred miles.)
the "flat Earth" model is good enough if you assume degrees of
latitude = 60 miles and degrees of long. = cos(lat)*60 mile
per degree.  then simply use 60*sqrt(lat^2+long2) to
compute the distance (60 is for nautical miles can't
remember the stature mile conversion right now.)

For bigger distances like USA to Japan you really do need
the great circle method. BUt "flat Earth" works for inside the
same city


=====
Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

Re: Advice on geolocation

От
Joe Conway
Дата:
Chris Albertson wrote:
> --- Bruno Wolff III <bruno@wolff.to> wrote:
>
>>On Fri, Jul 26, 2002 at 09:33:53 -0700,
>>  Chris Albertson <chrisalbertson90278@yahoo.com> wrote:
>>
>>>I've done quite a bit of this as I used to work on a GIS
>>>product.  Given to lat,lon pairs it is not hard to find
>>>the great circle distance between them.  This assumes a
>>>spherical Earth but good enough for your purposes as the
>>>error will be under a few hundred meters
>>
>>My concern about this is that it will not be indexable. I suspect,
>>but don't
>>know, that this won't really be an issue for the small number (~5000)
>>points
>>of data (especially since I expect other constraints to be used in
>>most
>>queries).

I haven't really followed this entire thread, but FWIW, here is a
plpgsql function to get great circle distance given a pair of lat/lons
(the Haversine formula). There is also something in contrib (see
contrib/earthdistance) to do this, but I've used this on a webhost where
I couldn't install my own C libraries.

CREATE FUNCTION "geodist" (float8,float8,float8,float8 ) RETURNS float8 AS '
DECLARE
   lat1 ALIAS FOR $1;
   lon1 ALIAS FOR $2;
   lat2 ALIAS FOR $3;
   lon2 ALIAS FOR $4;
   dist float8;
BEGIN

   dist := 0.621 * 6371.2 * 2 *
     atan2( sqrt(abs(0 + pow(sin(radians(lat2)/2 - radians(lat1)/2),2) +
     cos(radians(lat1)) * cos(radians(lat2)) * pow(sin(radians(lon2)/2 -
     radians(lon1)/2),2))),sqrt(abs(1 - pow(sin(radians(lat2)/2 -
     radians(lat1)/2),2) + cos(radians(lat1)) * cos(radians(lat2)) *
     pow(sin(radians(lon2)/2 - radians(lon1)/2),2))));

   return dist;
END;
' LANGUAGE 'plpgsql';


I used this for finding US zipcodes within a certain distance of the
given zipcode. To improve performance, I also used a "box" around the
lat/lon pairs:

. . .
WHERE
  abs(z.lat - $lat1d) * 60 * 1.15078 <= $dist
  and abs(z.long - $lon1d) * 60 * 1.15078 <= $dist
  and geodist($lat1d,$lon1d,lat,long) <= $dist
  and z.zip = az.zipcode
. . .

This limits the data being considered to a square area twice the
dimension of your desired distance, and then the distance calc further
restricts down to a circle of radius $dist. Hopefully you can get the
idea from this snippet.

HTH,

Joe