Re: Simple query, 10 million records...MySQL ten times faster
От | Oleg Bartunov |
---|---|
Тема | Re: Simple query, 10 million records...MySQL ten times faster |
Дата | |
Msg-id | Pine.LNX.4.64.0704270843020.12152@sn.sai.msu.ru обсуждение исходный текст |
Ответ на | Re: Simple query, 10 million records...MySQL ten times faster (Scott Marlowe <smarlowe@g2switchworks.com>) |
Список | pgsql-performance |
Folks, we in astronomy permanently work with billiards objects with spherical atributes and have several sky-indexing schemes. See my page for links http://www.sai.msu.su/~megera/wiki/SkyPixelization We have q3c package for PostgreSQL available from q3c.sf.net, which we use in production with terabytes-sized database. Oleg On Thu, 26 Apr 2007, Scott Marlowe wrote: > On Tue, 2007-04-24 at 16:26, zardozrocks wrote: >> I have this table: >> >> CREATE TABLE test_zip_assoc ( >> id serial NOT NULL, >> f_id integer DEFAULT 0 NOT NULL, >> lat_radians numeric(6,5) DEFAULT 0.00000 NOT NULL, >> long_radians numeric(6,5) DEFAULT 0.00000 NOT NULL >> ); > > Like someone else mentioned numeric types are SLOW. See if you can use > integers, or at least floats. > > I also wonder if you might be better served with geometric types and > GiST indexes on them than using your lat / long grid. With a geometric > type, you could define the lat / long as a point and use geometric > operations with it. > > See the pgsql manual: > > http://www.postgresql.org/docs/8.1/static/datatype-geometric.html > http://www.postgresql.org/docs/8.1/static/functions-geometry.html > >> It's basically a table that associates some foreign_key (for an event, >> for instance) with a particular location using longitude and >> latitude. I'm basically doing a simple proximity search. I have >> populated the database with *10 million* records. I then test >> performance by picking 50 zip codes at random and finding the records >> within 50 miles with a query like this: > > I assume that there aren't 10 million zip codes, right? > > Are you storing the lat / long of the individual venues? Or the zip > codes? If you're storing the lat / long of the zips, then I can't > imagine there are 10 million zip codes. If you could use the lat / long > numbers to find the zip codes that are in your range, then join that to > a venue table that fks off of the zip code table, I would think it would > be much faster, as you'd have a smaller data set to trundle through. > >> SELECT id >> FROM test_zip_assoc >> WHERE >> lat_radians > 0.69014816041 >> AND lat_radians < 0.71538026567 >> AND long_radians > -1.35446228028 >> AND long_radians < -1.32923017502 >> >> >> On my development server (dual proc/dual core Opteron 2.8 Ghz with 4GB >> ram) this query averages 1.5 seconds each time it runs after a brief >> warmup period. In PostGreSQL it averages about 15 seconds. > > I wonder how well it would run if you had 10, 20, 30, 40 etc... users > running it at the same time. My guess is that you'll be very lucky to > get anything close to linear scaling in any database. That's because > this is CPU / Memory bandwidth intensive, so it's gonna kill your DB. > OTOH, if it was I/O bound you could throw more hardware at it (bb cache > RAID controller, etc) > >> Both of those times are too slow. I need the query to run in under a >> second with as many as a billion records. I don't know if this is >> possible but I'm really hoping someone can help me restructure my >> indexes (multicolumn?, multiple indexes with a 'where' clause?) so >> that I can get this running as fast as possible. > > You're trying to do a whole lot of processing in a little time. You're > either gonna have to accept a less exact answer (i.e. base it on zip > codes) or come up with some way of mining the data for the answers ahead > of time, kind of like a full text search for lat and long. > > So, have you tried what I suggested about increasing shared_buffers and > work_mem yet? > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
В списке pgsql-performance по дате отправления: