Re: Proximity query with GIST and row estimation

Поиск
Список
Период
Сортировка
От Paul Ramsey
Тема Re: Proximity query with GIST and row estimation
Дата
Msg-id 0BD9E842-2DA7-4F90-A732-AFD60C1F50C7@refractions.net
обсуждение исходный текст
Ответ на Proximity query with GIST and row estimation  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Ответы Re: Proximity query with GIST and row estimation  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Список pgsql-performance
You'll find that PostGIS does a pretty good job of selectivity
estimation.

P

On 13-Feb-07, at 9:09 AM, Guillaume Smet wrote:

> Hi all,
>
> Following the work on Mark Stosberg on this list (thanks Mark!), I
> optimized our slow proximity queries by using cube, earthdistance
> (shipped with contrib) and a gist index. The result is globally very
> interesting apart for a specific query and we'd like to be able to fix
> it too to be more consistent (it's currently faster with a basic
> distance calculation based on acos, cos and so on but it's slow
> anyway).
>
> The problem is that we have sometimes very few places near a given
> location (small city) and sometimes a lot of them (in Paris, Bruxelles
> and so on - it's the case we have here). The gist index I created
> doesn't estimate the number of rows in the area very well.
>
> Table: lieu (100k rows) with wgslat and wgslon as numeric
> Table: lieugelieu (200k rows, 1k with codegelieu = 'PKG')
> Index: "idx_lieu_earth" gist (ll_to_earth(wgslat::double precision,
> wgslon::double precision))
>
> The simplified query is:
> SELECT DISTINCT l.numlieu, l.nomlieu, ROUND
> (earth_distance(ll_to_earth(48.85957600, 2.34860800),
> ll_to_earth(l.wgslat, l.wgslon))) as dist
>     FROM lieu l, lieugelieu lgl
>     WHERE lgl.codegelieu = 'PKG' AND earth_box(ll_to_earth(48.85957600,
> 2.34860800), 1750) @ ll_to_earth(l.wgslat, l.wgslon) AND lgl.numlieu =
> l.numlieu ORDER BY dist ASC LIMIT 2;
> It's used to find the nearest car parks from a given location.
>
> The plan is attached plan_earthdistance_nestedloop.txt. It uses a
> nested loop because the row estimate is pretty bad: (cost=0.00..3.38
> rows=106 width=0) (actual time=30.229..30.229 rows=5864 loops=1).
>
> If I disable the nested loop, the plan is different and faster (see
> plan_earthdistance_hash.txt attached).
>
> Is there any way to improve this estimation? I tried to set the
> statistics of wgslat and wgslon higher but it doesn't change anything
> (I don't know if the operator is designed to use the statistics).
>
> Any other idea to optimize this query is very welcome too.
>
> --
> Guillaume
> <plan_earthdistance_nestedloop.txt>
> <plan_earthdistance_hash.txt>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: explain analyze is your friend


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Lou O'Quin"
Дата:
Сообщение: Re: How long should it take to insert 200,000 records?
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: Benchmarking PGSQL?