Re: planner picking more expensive plan

Поиск
Список
Период
Сортировка
От John A Meinel
Тема Re: planner picking more expensive plan
Дата
Msg-id 42C55EA0.1000404@arbash-meinel.com
обсуждение исходный текст
Ответ на planner picking more expensive plan  (Sam Mason <sam@samason.me.uk>)
Ответы Re: planner picking more expensive plan
Список pgsql-performance
Sam Mason wrote:

>Hi,
>
>I've just been referred here after a conversion on IRC and everybody
>seemed to think I've stumbled upon some strangeness.
>
>The planner (in PG version 8.0.2) is choosing what it thinks is a more
>expensive plan.  I've got a table of animals (about 3M rows) and their
>movements (about 16M rows), and I'm trying to execute this query:
>
>  SELECT a.birthlocnid, m.locnid
>  FROM animals a
>    LEFT JOIN movements m ON (a.animalid = m.animalid AND m.mtypeid=0)
>  LIMIT 10;
>
>
>
Why are you using LIMIT without having an ORDER BY?
What are actually trying to get out of this query? Is it just trying to
determine where the 'home' locations are?
It just seems like this query isn't very useful. As it doesn't restrict
by animal id, and it just gets 10 randomly selected animals where
m.mtypeid=0.
And why a LEFT JOIN instead of a normal join?
Anyway, the general constraints you are applying seem kind of confusing.
What happens if you change the plan to:

  SELECT a.birthlocnid, m.locnid
  FROM animals a
    LEFT JOIN movements m ON (a.animalid = m.animalid AND m.mtypeid=0)
  ORDER BY a.animalid LIMIT 10;


I would guess that this would help the planner realize it should try to
use an index, since it can realize that it wants only a few rows by
a.animalid in order.
Though I also recognize that you aren't returning a.animalid so you
don't really know which animals you are returning.

I get the feeling you are trying to ask something like "do animals stay
at their birth location", or at least "how are animals moving around". I
don't know what m.typeid = 0 means, but I'm guessing it is something
like where their home is.

Anyway, I would say you need to put a little bit more restriction in, so
the planner can figure out how to get only 10 rows.

John
=:->

>If I have "work_mem" set to something small (1000) it uses this plan:
>
>    QUERY PLAN
>
> Limit  (cost=0.00..202.52 rows=10 width=8) (actual time=0.221..0.600 rows=10 loops=1)
>   ->  Merge Left Join  (cost=0.00..66888828.30 rows=3302780 width=8) (actual time=0.211..0.576 rows=10 loops=1)
>         Merge Cond: ("outer".animalid = "inner".animalid)
>         ->  Index Scan using animals_pkey on animals a  (cost=0.00..10198983.91 rows=3302780 width=8) (actual
time=0.112..0.276rows=10 loops=1) 
>         ->  Index Scan using movement_animal on movements m  (cost=0.00..56642740.73 rows=3107737 width=8) (actual
time=0.088..0.235rows=10 loops=1) 
>               Filter: (mtypeid = 0)
> Total runtime: 0.413 ms
>
>But if I increase "work_mem" to 10000 it uses this plan:
>
>    QUERY PLAN
>
> Limit  (cost=565969.42..566141.09 rows=10 width=8) (actual time=27769.047..27769.246 rows=10 loops=1)
>   ->  Merge Right Join  (cost=565969.42..57264070.77 rows=3302780 width=8) (actual time=27769.043..27769.228 rows=10
loops=1)
>         Merge Cond: ("outer".animalid = "inner".animalid)
>         ->  Index Scan using movement_animal on movements m  (cost=0.00..56642740.73 rows=3107737 width=8) (actual
time=0.022..0.154rows=10 loops=1) 
>               Filter: (mtypeid = 0)
>         ->  Sort  (cost=565969.42..574226.37 rows=3302780 width=8) (actual time=27768.991..27769.001 rows=10 loops=1)
>               Sort Key: a.animalid
>               ->  Seq Scan on animals a  (cost=0.00..77086.80 rows=3302780 width=8) (actual time=0.039..5620.651
rows=3303418loops=1) 
> Total runtime: 27851.097 ms
>
>
>I've tried playing with the statistics as people suggested on IRC but to
>no effect.  There was some discussion about why it would be doing this,
>but nothing obvious came out of it.
>
>SHOW ALL output is at the end of this mail but it should be pretty
>standard apart from:
>
>  shared_buffers = 10000
>  work_mem = 8192
>  max_connections = 100
>  effective_cache_size = 10000
>
>Hope that's enough information to be useful.
>
>Thanks.
>
>   Sam
>


Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Sam Mason
Дата:
Сообщение: Re: planner picking more expensive plan
Следующее
От: Sam Mason
Дата:
Сообщение: Re: planner picking more expensive plan