Обсуждение: Slow query

Поиск
Список
Период
Сортировка

Slow query

От
Ivan Voras
Дата:
I have a query that I think should run faster. The machine is P2/400
with enough ram (384MB), but still, maybe the query could be tuned up.
postgresql.conf is stock with these values changed:

fsync=false
shared_buffers = 5000
sort_mem = 8192
vacuum_mem = 16384

This is a development machine, the production will be dual P3, 1GHz, 1GB
RAM, but I fear that the execution will still be slow, as the tables
will get bigger.

I've pasted information about the database, and the explain output, but
the text is horribly wrapped so there's a clean copy on the web in
http://geri.cc.fer.hr/~ivoras/query.txt

The intention is: there is a table called cl_log which records events
from various sources, some of which also have data in data_kat_id and
data_user_id fields, some of which don't (hence the outer joins). The
query is report-style, and tries to collect as much data as possible
about the events. Tables cl_source, cl_handler and cl_event_type hold
information about the type of event. They are small (currently 1-3
records in each, will grow to about 10 records).



ferweb=> explain analyze SELECT cl_log.*, cl_source.name AS source_name,
cl_source.description AS source_description,
    cl_handler.name AS handler_name, cl_handler.description AS
handler_description, cl_event_type.name AS event_type_name,
    cl_event_type.description as event_type_description, users.jime,
kategorija.knaziv
    FROM cl_log
        INNER JOIN cl_source ON source_id=cl_source.id
        INNER JOIN cl_handler ON cl_source.handler_id=cl_handler.id
        INNER JOIN cl_event_type ON event_type_id=cl_event_type.id
        LEFT OUTER JOIN kategorija ON data_kat_id=kategorija.id
        LEFT OUTER JOIN users ON data_user_id=users.id
        ORDER BY time desc LIMIT 30;

             QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=788.78..788.79 rows=2 width=500) (actual
time=23229.78..23230.44 rows=30 loops=1)
    ->  Sort  (cost=788.78..788.79 rows=3 width=500) (actual
time=23229.75..23230.10 rows=31 loops=1)
          Sort Key: cl_log."time"
          ->  Nested Loop  (cost=1.04..788.76 rows=3 width=500) (actual
time=4078.85..20185.89 rows=38999 loops=1)
                ->  Nested Loop  (cost=1.04..771.27 rows=3 width=485)
(actual time=4078.71..14673.27 rows=38999 loops=1)
                      ->  Hash Join  (cost=1.04..754.21 rows=3
width=417) (actual time=4078.54..8974.08 rows=38999 loops=1)
                            Hash Cond: ("outer".event_type_id = "inner".id)
                            ->  Nested Loop  (cost=0.00..752.16 rows=195
width=288) (actual time=4078.20..6702.17 rows=38999 loops=1)
                                  Join Filter: ("inner".handler_id =
"outer".id)
                                  ->  Seq Scan on cl_handler
(cost=0.00..1.01 rows=1 width=104) (actual time=0.02..0.04 rows=1 loops=1)
                                  ->  Materialize  (cost=748.72..748.72
rows=195 width=184) (actual time=4078.08..4751.52 rows=38999 loops=1)
                                        ->  Nested Loop
(cost=0.00..748.72 rows=195 width=184) (actual time=0.21..3197.16
rows=38999 loops=1)
                                              ->  Seq Scan on cl_source
  (cost=0.00..1.01 rows=1 width=108) (actual time=0.05..0.06 rows=1 loops=1)
                                              ->  Index Scan using
cl_log_source on cl_log  (cost=0.00..745.27 rows=195 width=76) (actual
time=0.11..1467.08 rows=38999 loops=1)
                                                    Index Cond:
(cl_log.source_id = "outer".id)
                            ->  Hash  (cost=1.03..1.03 rows=3 width=129)
(actual time=0.12..0.12 rows=0 loops=1)
                                  ->  Seq Scan on cl_event_type
(cost=0.00..1.03 rows=3 width=129) (actual time=0.04..0.08 rows=3 loops=1)
                      ->  Index Scan using kategorija_pkey on kategorija
  (cost=0.00..5.82 rows=1 width=68) (actual time=0.05..0.07 rows=1
loops=38999)
                            Index Cond: ("outer".data_kat_id =
kategorija.id)
                ->  Index Scan using users_pkey on users
(cost=0.00..5.97 rows=1 width=15) (actual time=0.05..0.07 rows=1
loops=38999)
                      Index Cond: ("outer".data_user_id = users.id)
  Total runtime: 23267.25 msec
(22 rows)

ferweb=> select count(*) from cl_log;
  count
-------
  38999
(1 row)

ferweb=> select count(*) from cl_handler;
  count
-------
      1
(1 row)

ferweb=> select count(*) from cl_source;
  count
-------
      1
(1 row)

ferweb=> select count(*) from cl_event_type;
  count
-------
      3
(1 row)

ferweb=> select count(*) from users;
  count
-------
   2636
(1 row)

ferweb=> select count(*) from kategorija;
  count
-------
   1928
(1 row)



--
Every sufficiently advanced magic is indistinguishable from technology
    - Arthur C Anticlarke


Re: Slow query

От
"scott.marlowe"
Дата:
On Tue, 24 Feb 2004, Ivan Voras wrote:

>           ->  Nested Loop  (cost=1.04..788.76 rows=3 width=500) (actual
> time=4078.85..20185.89 rows=38999 loops=1)
>                 ->  Nested Loop  (cost=1.04..771.27 rows=3 width=485)
> (actual time=4078.71..14673.27 rows=38999 loops=1)
>                             ->  Nested Loop  (cost=0.00..752.16 rows=195
> width=288) (actual time=4078.20..6702.17 rows=38999 loops=1)
>                                         ->  Nested Loop
> (cost=0.00..748.72 rows=195 width=184) (actual time=0.21..3197.16
> rows=38999 loops=1)

Note those nested loops up there.  They think that you are going to be
operating on 3,3,195, and 195 rows respectively, when they actually are
operating on 38999, 38999, 38999, and 38999 in reality.

set enable_nestloop = off

and see if that helps.  If so, see if altering the responsible columns
default stats to something higher (100 is a good start) and reanalyze to
see if you get a better plan.  As long as those estimates are that far
off, you're gonna get a poorly performing query when the planner is
allowed to use nested loops.