Обсуждение: Re: Perplexing Query Performance

Поиск
Список
Период
Сортировка

Re: Perplexing Query Performance

От
mdklatt@ou.edu (Michael Klatt)
Дата:
mdklatt@ou.edu (Michael Klatt) wrote in message news:<2cb75565.0204051614.4ad76291@posting.google.com>...
> I'm trying to track down a serious performance bottleneck and have
> noticed the following problem.  I have a rainfall database with
> approximately 1.2 million records.  Different resolutions (daily,
> monthly, etc) are implemented as views, where the main rainfall table
> is aggregated as appropriate.  I've been building up a query bit by
> bit to see where it bogs down, and I've found the straw that breaks
> the camel's back.
>

Another update, another problem.

Now the query is bogging down in a new place, and even VACUUM ANALYZE
hasn't fixed it as with my previous problem.

Consider this query:

select *
from daily_rainfall as p, sites as s
where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and
  s.longitude >= -180 and terrain in ('A', 'I', 'L');

Almost any combination of 'A', 'I', and/or 'L' consistently executes
in the same time, but if it's ONLY 'A' PostgreSQL seems to go off into
Never Never Land.  There is no problem if 'I' or 'L' is used alone,
just 'A'.

Does anyone have any ideas?

Re: Perplexing Query Performance

От
Tom Lane
Дата:
mdklatt@ou.edu (Michael Klatt) writes:
> select *
> from daily_rainfall as p, sites as s
> where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and
>   s.longitude >= -180 and terrain in ('A', 'I', 'L');

> Almost any combination of 'A', 'I', and/or 'L' consistently executes
> in the same time, but if it's ONLY 'A' PostgreSQL seems to go off into
> Never Never Land.  There is no problem if 'I' or 'L' is used alone,
> just 'A'.

Sounds to me like it's switching from a good query plan to a bad one...
but since you haven't shown us EXPLAIN output nor described the
available indexes, it's hard to say much.

            regards, tom lane

Re: Perplexing Query Performance

От
mdklatt@ou.edu (Michael Klatt)
Дата:
mdklatt@ou.edu (Michael Klatt) wrote in message news:<2cb75565.0204151456.193bde9@posting.google.com>...
> mdklatt@ou.edu (Michael Klatt) wrote in message news:<2cb75565.0204051614.4ad76291@posting.google.com>...
> Consider this query:
>
> select *
> from daily_rainfall as p, sites as s
> where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and
>   s.longitude >= -180 and s.terrain in ('A', 'I', 'L');
>
> Almost any combination of 'A', 'I', and/or 'L' consistently executes
> in the same time, but if it's ONLY 'A' PostgreSQL seems to go off into
> Never Never Land.  There is no problem if 'I' or 'L' is used alone,
> just 'A'.
>

Here's some additional information to help diagnose the problem.


This is a "normal" query that executes in a reasonable amount of time.

SRDC=> explain select * from daily_rainfall as p, sites as s where
p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and
s.longitude >= -180 and s.longitude <= 180 and s.terrain in ('A', 'L',
'I');
NOTICE:  QUERY PLAN:

Nested Loop  (cost=99752.84..183759.80 rows=752 width=100)
  ->  Subquery Scan p  (cost=99752.84..104595.25 rows=38739 width=44)
        ->  Aggregate  (cost=99752.84..104595.25 rows=38739 width=44)
              ->  Group  (cost=99752.84..101689.80 rows=387392
width=44)
                    ->  Sort  (cost=99752.84..99752.84 rows=387392
width=44)
                          ->  Seq Scan on rainfall
(cost=0.00..35636.20 rows=387392 width=44)
  ->  Index Scan using sites_pkey on sites s  (cost=0.00..2.03 rows=1
width=56)

EXPLAIN


This is the query that keeps going and going and going....

SRDC=> explain select * from daily_rainfall as p, sites as s where
p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and
s.longitude >= -180 and s.longitude <= 180 and s.terrain in ('A');
NOTICE:  QUERY PLAN:

Nested Loop  (cost=99752.84..105094.23 rows=89 width=100)
  ->  Seq Scan on sites s  (cost=0.00..14.74 rows=1 width=56)
  ->  Subquery Scan p  (cost=99752.84..104595.25 rows=38739 width=44)
        ->  Aggregate  (cost=99752.84..104595.25 rows=38739 width=44)
              ->  Group  (cost=99752.84..101689.80 rows=387392
width=44)
                    ->  Sort  (cost=99752.84..99752.84 rows=387392
width=44)
                          ->  Seq Scan on rainfall
(cost=0.00..35636.20 rows=387392 width=44)

EXPLAIN


One difference I see is that the first query does an index scan on
table 'sites', whereas the second query does not.  The only index on
'sites' is the primary key index with on attribute 'ident'.  Would
performance improve if I added an index on 'terrain'?  Attributes
'latitude' and 'longitude' are always used as selection criteria so
should I add indexes to them as well?