Help analyzing 7.2.4 EXPLAIN

Поиск
Список
Период
Сортировка
От Josh Berkus
Тема Help analyzing 7.2.4 EXPLAIN
Дата
Msg-id 200304091715.19565.josh@agliodbs.com
обсуждение исходный текст
Ответы Re: Help analyzing 7.2.4 EXPLAIN
Список pgsql-performance
Folks,

What follows is a 7.2.4 EXPLAIN ANALYZE statement for the shown query.  This
query is currently taking 570 msec, an OK amount of time until you realize
that the system only has test data currently, and the tables in full
production will have 100-1000 times as much data.

 Becuase it's 7.2.4, it's a little hard to tell exactly which part of the
query is taking up 90% of the processing time.    The line which claims to be
taking that time is:
  ->  Seq Scan on users  (cost=0.00..3595.33 rows=12 width=87) (actual
time=13.50..547.59 rows=41 loops=1)

However, since users has only 200 records, I suspect that what's actually
being represented here is the processing time for the PL/pgSQL procedure in
the correlated subselect, if_addendee_conflict().

Questions:
1. Can anyone confirm my analysis in the paragraph above?
2. Can anyone point out any obvious ways to speed up the query below?
3. In the query below, if_attendee_conflict needs to be run once for each
(attorney times events) on the same day.   Further, if_attendee_conflict
involves a database lookup in 10-25% of cases.   Given that
if_attendee_conflict needs to apply complex criteria to determine whether or
not there is a conflict, can anyone suggest possible ways to cut down on the
number of required loops?

Thanks everyone!   Query and analyze follows.


j_test=> explain analyze
SELECT users.user_id, (users.fname || COALESCE(' ' || users.minit, '') || ' '
|| users.lname) as atty_name,
     users.lname,
     (SELECT if_addendee_conflict(users.user_id, 3272, '2003-04-15 10:00', '1
days'::INTERVAL,
         events.event_id, events.event_date, events.duration, event_cats.status, '30
minutes') as cflt
        FROM events, event_types, event_cats, event_days
        WHERE events.event_id = event_days.event_id
            and events.etype_id = event_types.etype_id
             AND event_types.ecat_id = event_cats.ecat_id
             AND event_days.event_day
                 BETWEEN '2003-04-15' AND '2003-04-16 10:00'
        ORDER BY cflt LIMIT 1)  AS conflict
FROM users
WHERE EXISTS (SELECT teams_users.user_id FROM teams_users JOIN teams_tree
    ON teams_users.team_id = teams_tree.team_id WHERE teams_tree.treeno
    BETWEEN 3 and 4 AND teams_users.user_id = users.user_id)
AND users.status > 0
    AND NOT EXISTS (SELECT staff_id FROM event_staff WHERE event_id = 3272
         AND staff_id = users.user_id)
ORDER BY conflict, users.lname, atty_name;

NOTICE:  QUERY PLAN:

Sort  (cost=3595.55..3595.55 rows=12 width=87) (actual time=547.89..547.91
rows=41 loops=1)
  ->  Seq Scan on users  (cost=0.00..3595.33 rows=12 width=87) (actual
time=13.50..547.59 rows=41 loops=1)
        SubPlan
          ->  Limit  (cost=54.03..54.03 rows=1 width=46) (actual
time=13.14..13.14 rows=1 loops=41)
                ->  Sort  (cost=54.03..54.03 rows=1 width=46) (actual
time=13.13..13.13 rows=2 loops=41)
                      ->  Hash Join  (cost=52.77..54.02 rows=1 width=46)
(actual time=5.09..12.94 rows=95 loops=41)
                            ->  Seq Scan on event_cats  (cost=0.00..1.16
rows=16 width=6) (actual time=0.01..0.05 rows=16 loops=41)
                            ->  Hash  (cost=52.77..52.77 rows=1 width=40)
(actual time=4.72..4.72 rows=0 loops=41)
                                  ->  Hash Join  (cost=49.94..52.77 rows=1
width=40) (actual time=4.19..4.59 rows=95 loops=41)
                                        ->  Seq Scan on event_types
(cost=0.00..2.54 rows=54 width=8) (actual time=0.01..0.12 rows=54 loops=41)
                                        ->  Hash  (cost=49.93..49.93 rows=5
width=32) (actual time=4.10..4.10 rows=0 loops=41)
                                              ->  Nested Loop
(cost=0.00..49.93 rows=5 width=32) (actual time=0.16..3.95 rows=95 loops=41)
                                                    ->  Seq Scan on event_days
(cost=0.00..25.00 rows=5 width=4) (actual time=0.12..2.31 rows=95 loops=41)
                                                    ->  Index Scan using
events_pkey on events  (cost=0.00..4.97 rows=1 width=28) (actual
time=0.01..0.01 rows=1 loops=3895)
          ->  Nested Loop  (cost=0.00..19.47 rows=1 width=12) (actual
time=0.04..0.04 rows=0 loops=147)
                ->  Index Scan using idx_teams_tree_node on teams_tree
(cost=0.00..8.58 rows=2 width=4) (actual time=0.01..0.02 rows=2 loops=147)
                ->  Index Scan using teams_users_pk on teams_users
(cost=0.00..4.83 rows=1 width=8) (actual time=0.01..0.01 rows=0 loops=252)
          ->  Index Scan using event_staff_table_pk on event_staff
(cost=0.00..4.95 rows=1 width=4) (actual time=0.01..0.01 rows=0 loops=41)
Total runtime: 548.20 msec

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: choosing the right platform
Следующее
От: Tom Lane
Дата:
Сообщение: Caching (was Re: choosing the right platform)