Optimizer use of index slows down query by factor

Поиск
Список
Период
Сортировка
От Michael Ruf
Тема Optimizer use of index slows down query by factor
Дата
Msg-id 4B33368D.1060505@inxmail.de
обсуждение исходный текст
Ответы Re: Optimizer use of index slows down query by factor  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Hi,

we experience some strange performance problems, we've already found a
workaround for us, but are curious if it's a known problem of the optimizer.

Tested with the following Postgres Version: 8.2.15 and 8.3.9
AUTOVACUUM is enabled, explicit VACUUM and REINDEX both tables and the
whole DB.

effective_cache_size = 3096MB
default_statistics_target = 100
shared_buffers = 1024MB
work_mem = 64MB

Table Schema:

Table "click"
      Column      |  Type   | Modifiers
-----------------+---------+-----------
  click_id        | integer | not null
  member_id       | integer |
  link_id         | integer |
  click_timestamp | bigint  |
  remote_host     | text    |
  user_agent      | text    |
Indexes:
     "click_pkey" PRIMARY KEY, btree (click_id)
     "idx_click_1" btree (link_id)
     "idx_click_2" btree (click_timestamp)

Table "link"
    Column   |  Type   | Modifiers
------------+---------+-----------
  link_id    | integer | not null
  link_url   | text    |
  task_id    | integer |
  link_type  | integer |
  action_id  | integer |
  link_alias | text    |
  deleted    | boolean |
  deletable  | boolean |
Indexes:
     "link_pkey" PRIMARY KEY, btree (link_id)
     "idx_link_1" btree (task_id)

Rows in click table contains:    22874089
Rows in link table contains:    4220601


The following query is slow when index scan is enabled:

SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000


Explain with index scan enabled:

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000;

             QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=1416936.47..1417849.48 rows=1 width=30) (actual
time=277062.951..277073.144 rows=12 loops=1)
    ->  GroupAggregate  (cost=1416936.47..1417849.48 rows=1 width=30)
(actual time=277062.949..277073.126 rows=12 loops=1)
          ->  Sort  (cost=1416936.47..1417119.07 rows=73040 width=30)
(actual time=277062.820..277066.219 rows=6445 loops=1)
                Sort Key: link.link_type, link.link_alias
                Sort Method:  quicksort  Memory: 696kB
                ->  Merge Right Join  (cost=1604.91..1411036.15
rows=73040 width=30) (actual time=277027.644..277050.946 rows=6445 loops=1)
                      Merge Cond: (click.link_id = link.link_id)
                      ->  Index Scan using idx_click_1 on click
(cost=0.00..1351150.42 rows=22874088 width=12) (actual
time=6.915..263327.439 rows=22409997 loops=1)
                      ->  Sort  (cost=1604.91..1638.61 rows=13477
width=26) (actual time=12.172..15.640 rows=6445 loops=1)
                            Sort Key: link.link_id
                            Sort Method:  quicksort  Memory: 33kB
                            ->  Index Scan using idx_link_1 on link
(cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043
rows=126 loops=1)
                                  Index Cond: (task_id = 1556)
                                  Filter: (((deletable IS NULL) OR (NOT
deletable)) AND ((link_type = 8) OR (link_type = 9)))
  Total runtime: 277082.204 ms
(15 rows)


Explain with "set enable_indexscan=false;"

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000;

    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2577764.28..2578677.29 rows=1 width=30) (actual
time=51713.324..51723.517 rows=12 loops=1)
    ->  GroupAggregate  (cost=2577764.28..2578677.29 rows=1 width=30)
(actual time=51713.322..51723.499 rows=12 loops=1)
          ->  Sort  (cost=2577764.28..2577946.88 rows=73040 width=30)
(actual time=51713.191..51716.600 rows=6445 loops=1)
                Sort Key: link.link_type, link.link_alias
                Sort Method:  quicksort  Memory: 696kB
                ->  Hash Left Join  (cost=1140942.18..2571863.96
rows=73040 width=30) (actual time=45276.194..51702.053 rows=6445 loops=1)
                      Hash Cond: (link.link_id = click.link_id)
                      ->  Bitmap Heap Scan on link
(cost=253.20..34058.86 rows=13477 width=26) (actual time=0.044..0.168
rows=126 loops=1)
                            Recheck Cond: (task_id = 1556)
                            Filter: (((deletable IS NULL) OR (NOT
deletable)) AND ((link_type = 8) OR (link_type = 9)))
                            ->  Bitmap Index Scan on idx_link_1
(cost=0.00..249.83 rows=13482 width=0) (actual time=0.030..0.030
rows=128 loops=1)
                                  Index Cond: (task_id = 1556)
                      ->  Hash  (cost=743072.88..743072.88 rows=22874088
width=12) (actual time=45274.316..45274.316 rows=22874089 loops=1)
                            ->  Seq Scan on click (cost=0.00..743072.88
rows=22874088 width=12) (actual time=0.024..17333.860 rows=22874089 loops=1)
  Total runtime: 51728.643 ms
(15 rows)



We can't drop the index, because all other queries on the click table
are 10-100 times faster if index is enabled.

We have worked around with following SQL to emulate the LEFT JOIN, which
returns the same result.

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias
UNION SELECT link_alias,link_type,0,0 from link where (link_type=8 OR
link_type=9) AND task_id=1556 AND (deletable IS NULL OR deletable=false)
and link_alias not in ( SELECT link.link_alias FROM link JOIN click ON
link.link_id=click.link_id WHERE (link.link_type=8 OR link.link_type=9)
AND link.task_id=1556 AND (link.deletable IS NULL OR
link.deletable=false)) GROUP BY link_type,link_alias LIMIT 1000;

                   QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2011715.37..2011715.40 rows=2 width=30) (actual
time=56449.978..56450.016 rows=12 loops=1)
    ->  Unique  (cost=2011715.37..2011715.40 rows=2 width=30) (actual
time=56449.974..56449.995 rows=12 loops=1)
          ->  Sort  (cost=2011715.37..2011715.38 rows=2 width=30)
(actual time=56449.972..56449.978 rows=12 loops=1)
                Sort Key: link.link_alias, link.link_type,
(count(click.click_id)), (count(DISTINCT click.member_id))
                Sort Method:  quicksort  Memory: 25kB
                ->  Append  (cost=1007886.06..2011715.36 rows=2
width=30) (actual time=28207.665..56449.932 rows=12 loops=1)
                      ->  GroupAggregate  (cost=1007886.06..1008799.08
rows=1 width=30) (actual time=28207.664..28217.739 rows=11 loops=1)
                            ->  Sort  (cost=1007886.06..1008068.66
rows=73040 width=30) (actual time=28207.562..28210.936 rows=6369 loops=1)
                                  Sort Key: link.link_type, link.link_alias
                                  Sort Method:  quicksort  Memory: 690kB
                                  ->  Hash Join
(cost=848.97..1001985.74 rows=73040 width=30) (actual
time=11933.222..28196.805 rows=6369 loops=1)
                                        Hash Cond: (click.link_id =
link.link_id)
                                        ->  Seq Scan on click
(cost=0.00..743072.88 rows=22874088 width=12) (actual
time=0.030..14572.729 rows=22874089 loops=1)
                                        ->  Hash  (cost=680.51..680.51
rows=13477 width=26) (actual time=0.248..0.248 rows=126 loops=1)
                                              ->  Index Scan using
idx_link_1 on link  (cost=0.00..680.51 rows=13477 width=26) (actual
time=0.025..0.143 rows=126 loops=1)
                                                    Index Cond: (task_id
= 1556)
                                                    Filter: (((deletable
IS NULL) OR (NOT deletable)) AND ((link_type = 8) OR (link_type = 9)))
                      ->  Subquery Scan "*SELECT* 2"
(cost=1002916.26..1002916.28 rows=1 width=22) (actual
time=28232.176..28232.178 rows=1 loops=1)
                            ->  HashAggregate
(cost=1002916.26..1002916.27 rows=1 width=22) (actual
time=28232.161..28232.162 rows=1 loops=1)
                                  ->  Index Scan using idx_link_1 on
link  (cost=1002168.34..1002882.56 rows=6739 width=22) (actual
time=28232.077..28232.147 rows=1 loops=1)
                                        Index Cond: (task_id = 1556)
                                        Filter: (((deletable IS NULL) OR
(NOT deletable)) AND (NOT (hashed subplan)) AND ((link_type = 8) OR
(link_type = 9)))
                                        SubPlan
                                          ->  Hash Join
(cost=848.97..1001985.74 rows=73040 width=18) (actual
time=11931.673..28226.561 rows=6369 loops=1)
                                                Hash Cond:
(click.link_id = link.link_id)
                                                ->  Seq Scan on click
(cost=0.00..743072.88 rows=22874088 width=4) (actual
time=0.022..14581.208 rows=22874089 loops=1)
                                                ->  Hash
(cost=680.51..680.51 rows=13477 width=22) (actual time=0.240..0.240
rows=126 loops=1)
                                                      ->  Index Scan
using idx_link_1 on link  (cost=0.00..680.51 rows=13477 width=22)
(actual time=0.015..0.131 rows=126 loops=1)
                                                            Index Cond:
(task_id = 1556)
                                                            Filter:
(((deletable IS NULL) OR (NOT deletable)) AND ((link_type = 8) OR
(link_type = 9)))
  Total runtime: 56450.254 ms
(31 rows)


Ciao,
Michael





В списке pgsql-performance по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: FSM - per database or per installation?
Следующее
От: Lucas Maystre
Дата:
Сообщение: Multicolumn index - WHERE ... ORDER BY