Re: Very slow queries - please help

Поиск
Список
Период
Сортировка
От Bealach-na Bo
Тема Re: Very slow queries - please help
Дата
Msg-id BAY101-F35C5475BF7CCEE56EAB208AD540@phx.gbl
обсуждение исходный текст
Ответы Re: Very slow queries - please help  ("Thomas F. O'Connell" <tfo@sitening.com>)
Список pgsql-performance
OK.

The consensus seems to be that I need more indexes and I also need to
look into the NOT IN statement as a possible bottleneck. I've
introduced the indexes which has led to a DRAMATIC change in response
time. Now I have to experiment with INNER JOIN -> OUTER JOIN
variations, SET ENABLE_SEQSCAN=OFF.

Forgive me for not mentioning each person individually and by name.
You have all contributed to confirming what I had suspected (and
hoped): that *I* have a lot to learn!

I'm attaching table descriptions, the first few lines of top output
while the queries were running, index lists, sample queries and
EXPLAIN ANALYSE output BEFORE and AFTER the introduction of the
indexes. As I said, DRAMATIC :) I notice that the CPU usage does not
vary very much, it's nearly 100% anyway, but the memory usage drops
markedly, which is another very nice result of the index introduction.

Any more comments and tips would be very welcome.

Thank you all for your input.

Bealach.




blouser@10.0.0.2.dbdev=> \d job_log
                                     Table "blouser.job_log"
     Column     |            Type             |                    Modifiers
----------------+-----------------------------+--------------------------------------------------
job_log_id     | integer                     | not null default
nextval('job_log_id_seq'::text)
first_registry | timestamp without time zone |
blogger_name   | character(50)               |
node_id        | integer                     |
job_type       | character(50)               |
job_name       | character(256)              |
job_start      | timestamp without time zone |
job_timeout    | interval                    |
job_stop       | timestamp without time zone |
nfiles_in_job  | integer                     |
status         | integer                     |
error_code     | smallint                    |
Indexes:
    "job_log_id_pkey" PRIMARY KEY, btree (job_log_id)
Check constraints:
    "job_log_status_check" CHECK (status = 0 OR status = 1 OR status = 8 OR
status = 9)
Foreign-key constraints:
    "legal_node" FOREIGN KEY (node_id) REFERENCES node(node_id)





blouser@10.0.0.2.dbdev=> \d node
                          Table "blouser.node"
Column  |     Type      |                   Modifiers
---------+---------------+-----------------------------------------------
node_id | integer       | not null default nextval('node_id_seq'::text)
name    | character(50) |
type    | character(1)  |
usage   | character(4)  |
Indexes:
    "node_id_pkey" PRIMARY KEY, btree (node_id)
    "node_name_key" UNIQUE, btree (name)
Check constraints:
    "node_type_check" CHECK ("type" = 'B'::bpchar OR "type" = 'K'::bpchar OR
"type" = 'C'::bpchar OR "type" = 'T'::bpchar OR "type" = 'R'::bpchar)
    "node_usage_check" CHECK ("usage" = 'TEST'::bpchar OR "usage" =
'LIVE'::bpchar)


#========================before new indexes were created


Tasks: 114 total,   2 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.7% us, 24.5% sy,  0.0% ni, 49.4% id,  0.3% wa,  0.0% hi,  0.0% si
Mem:   1554788k total,  1513576k used,    41212k free,    31968k buffers
Swap:  1020024k total,    27916k used,   992108k free,   708728k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
25883 postgres  25   0 20528  12m  11m R 99.7  0.8   4:54.91 postmaster





blouser@10.0.0.2.dbdev=> \di
                   List of relations
Schema  |      Name       | Type  |  Owner  |  Table
---------+-----------------+-------+---------+---------
blouser | job_log_id_pkey | index | blouser | job_log
blouser | node_id_pkey    | index | blouser | node
blouser | node_name_key   | index | blouser | node
(3 rows)


      EXPLAIN ANALYSE SELECT n.name,n.type,
             n.usage, j.status,
             j.job_start,j.job_stop,
             j.nfiles_in_job,j.job_name
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '711%'
      AND n.type = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL)
      ORDER BY n.name;


                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..75753.31 rows=1 width=461) (actual
time=270486.692..291662.350 rows=3 loops=1)
   Join Filter: ("inner".node_id = "outer".node_id)
   ->  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1
width=181) (actual time=0.135..11.034 rows=208 loops=1)
         Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar) AND
("usage" = 'LIVE'::bpchar))
   ->  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288) (actual
time=273.374..1402.089 rows=22 loops=208)
         Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start >=
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <=
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
Total runtime: 291662.482 ms
(7 rows)


      EXPLAIN ANALYSE SELECT n.name, n.type, n.usage
      FROM node n
      WHERE n.name
      LIKE '56x%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND n.node_id NOT IN
      (SELECT n.node_id
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '711%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL))
      ORDER BY n.name;



                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------
Index Scan using node_name_key on node n  (cost=75451.55..75764.94 rows=1
width=177) (actual time=1394.617..1398.609 rows=205 loops=1)
   Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND ("usage" =
'LIVE'::bpchar) AND (NOT (hashed subplan)))
   SubPlan
     ->  Nested Loop  (cost=0.00..75451.54 rows=1 width=4) (actual
time=1206.622..1394.462 rows=3 loops=1)
           ->  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=4)
(actual time=271.361..1393.363 rows=22 loops=1)
                 Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start
 >= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <=
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
           ->  Index Scan using node_id_pkey on node n  (cost=0.00..5.99
rows=1 width=4) (actual time=0.042..0.042 rows=0 loops=22)
                 Index Cond: ("outer".node_id = n.node_id)
                 Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar)
AND ("usage" = 'LIVE'::bpchar))
Total runtime: 1398.868 ms
(10 rows)




#===================================after the new indexes were created

Tasks: 114 total,   2 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 22.9% us, 27.2% sy,  0.0% ni, 49.7% id,  0.0% wa,  0.2% hi,  0.0% si
Mem:   1554788k total,  1414632k used,   140156k free,    14784k buffers
Swap:  1020024k total,    28008k used,   992016k free,   623652k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26409 postgres  25   0 21580 8684 7116 R 99.9  0.6   0:25.38 postmaster



Schema  |        Name        | Type  |  Owner  |  Table
---------+--------------------+-------+---------+---------
blouser | idx_job_log_filter | index | blouser | job_log
blouser | idx_node_filter    | index | blouser | node
blouser | job_log_id_pkey    | index | blouser | job_log
blouser | node_id_pkey       | index | blouser | node
blouser | node_name_key      | index | blouser | node
(5 rows)


      EXPLAIN ANALYSE SELECT n.name,n.type,
             n.usage, j.status,
             j.job_start,j.job_stop,
             j.nfiles_in_job,j.job_name
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '711%'
      AND n.type = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL)
      ORDER BY n.name;



----------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=18049.23..18049.23 rows=1 width=461) (actual
time=223.540..223.543 rows=3 loops=1)
   Sort Key: n.name
   ->  Nested Loop  (cost=0.00..18049.22 rows=1 width=461) (actual
time=201.575..223.470 rows=3 loops=1)
         ->  Index Scan using idx_job_log_filter on job_log j
(cost=0.00..18043.21 rows=1 width=288) (actual time=52.567..222.855 rows=22
loops=1)
               Index Cond: ((job_name = 'COPY FILES'::bpchar) AND (job_start
 >= '2005-11-14 00:00:00'::timestamp without time zone))
               Filter: ((job_stop <= '2005-11-14 05:00:00'::timestamp
without time zone) OR (job_stop IS NULL))
         ->  Index Scan using node_id_pkey on node n  (cost=0.00..5.99
rows=1 width=181) (actual time=0.022..0.022 rows=0 loops=22)
               Index Cond: ("outer".node_id = n.node_id)
               Filter: ((name ~~ '711%'::text) AND ("type" = 'K'::bpchar)
AND ("usage" = 'LIVE'::bpchar))
Total runtime: 223.677 ms
(10 rows)



      EXPLAIN ANALYSE SELECT n.name, n.type, n.usage
      FROM node n
      WHERE n.name
      LIKE '56x%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND n.node_id NOT IN
      (SELECT n.node_id
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '711%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL))
      ORDER BY n.name;



QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=18141.89..18141.89 rows=1 width=177) (actual
time=223.495..223.627 rows=205 loops=1)
   Sort Key: name
   ->  Seq Scan on node n  (cost=18049.22..18141.88 rows=1 width=177)
(actual time=220.293..222.526 rows=205 loops=1)
         Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND
("usage" = 'LIVE'::bpchar) AND (NOT (hashed subplan)))
         SubPlan
           ->  Nested Loop  (cost=0.00..18049.22 rows=1 width=4) (actual
time=198.343..220.195 rows=3 loops=1)
                 ->  Index Scan using idx_job_log_filter on job_log j
(cost=0.00..18043.21 rows=1 width=4) (actual time=50.748..219.741 rows=22
loops=1)
                       Index Cond: ((job_name = 'COPY FILES'::bpchar) AND
(job_start >= '2005-11-14 00:00:00'::timestamp without time zone))
                       Filter: ((job_stop <= '2005-11-14
05:00:00'::timestamp without time zone) OR (job_stop IS NULL))
                 ->  Index Scan using node_id_pkey on node n
(cost=0.00..5.99 rows=1 width=4) (actual time=0.015..0.016 rows=0 loops=22)
                       Index Cond: ("outer".node_id = n.node_id)
                       Filter: ((name ~~ '711%'::text) AND ("type" =
'K'::bpchar) AND ("usage" = 'LIVE'::bpchar))
Total runtime: 223.860 ms
(13 rows)



В списке pgsql-performance по дате отправления:

Предыдущее
От: "Luke Lonergan"
Дата:
Сообщение: Re: Hardware/OS recommendations for large databases (
Следующее
От: "Bealach-na Bo"
Дата:
Сообщение: Very slow queries - please help