[GENERAL] Improve PostGIS performance with 62 million rows?

Поиск
Список
Период
Сортировка
От Israel Brewster
Тема [GENERAL] Improve PostGIS performance with 62 million rows?
Дата
Msg-id 7A08C908-B22F-48C8-976E-713A01FA04CC@ravnalaska.net
обсуждение исходный текст
Ответы Re: [GENERAL] Improve PostGIS performance with 62 million rows?  (Paul Ramsey <pramsey@cleverelephant.ca>)
Список pgsql-general
I have a database (PostgreSQL 9.6.1) containing 62,702,675 rows of latitude (numeric), longitude(numeric), elevation(integer) data, along with a PostGIS (2.3.0) geometry column (location), running on a CentOS 6.8 box with 64GB RAM and a RAID10 SSD data drive. I'm trying to get the maximum elevation along a path, for which purpose I've come up with the following query (for one particular path example):

SELECT elevation FROM data                                                                                                                                                                                                                                                                                                                                                                                WHERE ST_DWithin(location, ST_GeographyFromText('SRID=4326;LINESTRING(-150.008056 61.179167,-156.77 71.285833)'), 600)                                                                                                                                                                                                                                                                              ORDER BY elevation LIMIT 1;

The EXPLAIN ANALYZE output of this particular query (https://explain.depesz.com/s/heZ) shows:

                                                                                                                                                                      QUERY PLAN                                                                                                                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4.83..4.83 rows=1 width=4) (actual time=22653.840..22653.842 rows=1 loops=1)
   ->  Sort  (cost=4.83..4.83 rows=1 width=4) (actual time=22653.837..22653.837 rows=1 loops=1)
         Sort Key: elevation DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using location_gix on data  (cost=0.42..4.82 rows=1 width=4) (actual time=15.786..22652.041 rows=11081 loops=1)
               Index Cond: (location && '0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography)
               Filter: (('0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography && _st_expand(location, '600'::double precision)) AND _st_dwithin(location, '0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography, '600'::double precision, true))
               Rows Removed by Filter: 4934534
 Planning time: 0.741 ms
 Execution time: 22653.906 ms
(10 rows)

So it is using the index properly, but still takes a good 22 seconds to run, most of which appears to be in the Index Scan.

Is there any way to improve this, or is this going to be about as good as it gets with the number of rows being dealt with? I was planning to use this for a real-time display - punch in a couple of points, get some information about the route between, including maximum elevation - but with it taking 22 seconds for the longer routes at least, that doesn't make for the best user experience.

It's perhaps worth noting that the example above is most likely a worst case scenario. I would expect the vast majority of routes to be significantly shorter, and I want to say the shorter routes query much faster [testing needed]. That said, the faster the better, even for short routes :-)
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------



Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [GENERAL] Postgres 9.6.1 big slowdown by upgrading 8.4.22
Следующее
От: Paul Ramsey
Дата:
Сообщение: Re: [GENERAL] Improve PostGIS performance with 62 million rows?