Re: Performance issues

Поиск
Список
Период
Сортировка
От Andy Colson
Тема Re: Performance issues
Дата
Msg-id 4D767FE2.9000707@squeakycode.net
обсуждение исходный текст
Ответ на Re: Performance issues  (Andreas Forø Tollefsen <andreasft@gmail.com>)
Список pgsql-performance
On 3/8/2011 10:58 AM, Andreas Forø Tollefsen wrote:
> Andy. Thanks. That is a great tips. I tried it but i get the error:
> NOTICE: ptarray_simplify returned a <2 pts array.
>
> Query:
> SELECT ST_Intersection(priogrid_land.cell,
> ST_Simplify(cshapeswdate.geom,0.1)) AS geom,
> priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate,
> enddate, capname, caplong, caplat, col, row, xcoord, ycoord
> FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
> ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear <=1946 AND
> cshapeswdate.gweyear >=1946 AND cshapeswdate.startdate <= '1946/1/1';
>
>
> 2011/3/8 Andy Colson <andy@squeakycode.net <mailto:andy@squeakycode.net>>
>
>     I have seen really complex geometries cause problems.  If you have
>     thousands of points, when 10 would do, try ST_Simplify and see if it
>     doesnt speed things up.
>
>     -Andy
>
>
>
>     On 3/8/2011 2:42 AM, Andreas Forř Tollefsen wrote:
>
>         Hi. Thanks for the comments. My data is right, and the result is
>         exactly
>         what i want, but as you say i think what causes the query to be
>         slow is
>         the ST_Intersection which creates the intersection between the
>         vector
>         grid (fishnet) and the country polygons.
>         I will check with the postgis user list if they have any idea on
>         how to
>         speed up this query.
>
>         Best,
>         Andreas
>
>         2011/3/8 Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>
>         <mailto:tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>>>
>
>
>             =?ISO-8859-1?Q?Andreas_For=F8_Tollefsen?=
>         <andreasft@gmail.com <mailto:andreasft@gmail.com>
>         <mailto:andreasft@gmail.com <mailto:andreasft@gmail.com>>> writes:
>          > This is a query i am working on now. It creates an intersection
>             of two
>          > geometries. One is a grid of 0.5 x 0.5 decimal degree sized
>             cells, while the
>          > other is the country geometries of all countries in the world for
>             a certain
>          > year.
>
>             Hm, are you sure your data is right?  Because the actual
>         rowcounts imply
>             that each country intersects about half of the grid cells,
>         which doesn't
>             seem right.
>
>          > priogrid=# EXPLAIN ANALYZE SELECT priogrid_land.gid, gwcode,
>          > ST_Intersection(pri
>          > ogrid_land.cell, cshapeswdate.geom) FROM priogrid_land,
>             cshapeswdate WHERE
>          > ST_Intersects(priogrid_land.cell, cshapeswdate.geom);
>          >
>                QUERY
>          > PLAN
>
>          >
>
>           --------------------------------------------------------------------------------
>          >
>         ------------------------------------------------------------------
>          >  Nested Loop  (cost=0.00..12644.85 rows=43351 width=87704)
>         (actual
>          > time=1.815..7
>          > 074973.711 rows=130331 loops=1)
>          >    Join Filter: _st_intersects(priogrid_land.cell,
>         cshapeswdate.geom)
>          >    ->  Seq Scan on cshapeswdate  (cost=0.00..14.42 rows=242
>             width=87248)
>          > (actual
>          >  time=0.007..0.570 rows=242 loops=1)
>          >    ->  Index Scan using idx_priogrid_land_cell on priogrid_land
>          >  (cost=0.00..7.1
>          > 5 rows=1 width=456) (actual time=0.069..5.604 rows=978 loops=242)
>          >          Index Cond: (priogrid_land.cell && cshapeswdate.geom)
>          >  Total runtime: 7075188.549 ms
>          > (6 rows)
>
>             AFAICT, all of the runtime is going into calculating the
>         ST_Intersects
>             and/or ST_Intersection functions.  The two scans are only
>         accounting for
>             perhaps 5.5 seconds, and the join infrastructure isn't going
>         to be
>             terribly expensive, so it's got to be those functions.  Not
>         knowing much
>             about PostGIS, I don't know if the functions themselves can
>         be expected
>             to be really slow.  If it's not them, it could be the cost
>         of fetching
>             their arguments --- in particular, I bet the country
>         outlines are very
>             large objects and are toasted out-of-line.  There's been
>         some past
>             discussion of automatically avoiding repeated detoastings in
>         scenarios
>             like the above, but nothing's gotten to the point of
>         acceptance yet.
>             Possibly you could do something to force detoasting in a
>         subquery.
>
>                                     regards, tom lane
>
>
>
>



ew... thats not good.  Seems like it simplified it down to a single
point?  (not 100% sure that's what the error means, just a guess)

Try getting some info about it:

select
   ST_Npoints(geom) As before,
   ST_NPoints(ST_Simplify(geom,0.1)) as after
from cshapeswdate


Also try things like ST_IsSimple ST_IsValid.  I seem to recall sometimes
needing ST_Points or st_NumPoints instead of ST_Npoints.

-Andy

В списке pgsql-performance по дате отправления:

Предыдущее
От: Mindaugas Riauba
Дата:
Сообщение: Re: Linux I/O schedulers - CFQ & random seeks
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Query performance with disabled hashjoin and mergejoin