Обсуждение: Optimizer use of index slows down query by factor

Поиск
Список
Период
Сортировка

Optimizer use of index slows down query by factor

От
Michael Ruf
Дата:
Hi,

we experience some strange performance problems, we've already found a
workaround for us, but are curious if it's a known problem of the optimizer.

Tested with the following Postgres Version: 8.2.15 and 8.3.9
AUTOVACUUM is enabled, explicit VACUUM and REINDEX both tables and the
whole DB.

effective_cache_size = 3096MB
default_statistics_target = 100
shared_buffers = 1024MB
work_mem = 64MB

Table Schema:

Table "click"
      Column      |  Type   | Modifiers
-----------------+---------+-----------
  click_id        | integer | not null
  member_id       | integer |
  link_id         | integer |
  click_timestamp | bigint  |
  remote_host     | text    |
  user_agent      | text    |
Indexes:
     "click_pkey" PRIMARY KEY, btree (click_id)
     "idx_click_1" btree (link_id)
     "idx_click_2" btree (click_timestamp)

Table "link"
    Column   |  Type   | Modifiers
------------+---------+-----------
  link_id    | integer | not null
  link_url   | text    |
  task_id    | integer |
  link_type  | integer |
  action_id  | integer |
  link_alias | text    |
  deleted    | boolean |
  deletable  | boolean |
Indexes:
     "link_pkey" PRIMARY KEY, btree (link_id)
     "idx_link_1" btree (task_id)

Rows in click table contains:    22874089
Rows in link table contains:    4220601


The following query is slow when index scan is enabled:

SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000


Explain with index scan enabled:

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000;

             QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=1416936.47..1417849.48 rows=1 width=30) (actual
time=277062.951..277073.144 rows=12 loops=1)
    ->  GroupAggregate  (cost=1416936.47..1417849.48 rows=1 width=30)
(actual time=277062.949..277073.126 rows=12 loops=1)
          ->  Sort  (cost=1416936.47..1417119.07 rows=73040 width=30)
(actual time=277062.820..277066.219 rows=6445 loops=1)
                Sort Key: link.link_type, link.link_alias
                Sort Method:  quicksort  Memory: 696kB
                ->  Merge Right Join  (cost=1604.91..1411036.15
rows=73040 width=30) (actual time=277027.644..277050.946 rows=6445 loops=1)
                      Merge Cond: (click.link_id = link.link_id)
                      ->  Index Scan using idx_click_1 on click
(cost=0.00..1351150.42 rows=22874088 width=12) (actual
time=6.915..263327.439 rows=22409997 loops=1)
                      ->  Sort  (cost=1604.91..1638.61 rows=13477
width=26) (actual time=12.172..15.640 rows=6445 loops=1)
                            Sort Key: link.link_id
                            Sort Method:  quicksort  Memory: 33kB
                            ->  Index Scan using idx_link_1 on link
(cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043
rows=126 loops=1)
                                  Index Cond: (task_id = 1556)
                                  Filter: (((deletable IS NULL) OR (NOT
deletable)) AND ((link_type = 8) OR (link_type = 9)))
  Total runtime: 277082.204 ms
(15 rows)


Explain with "set enable_indexscan=false;"

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link LEFT JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias LIMIT 1000;

    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2577764.28..2578677.29 rows=1 width=30) (actual
time=51713.324..51723.517 rows=12 loops=1)
    ->  GroupAggregate  (cost=2577764.28..2578677.29 rows=1 width=30)
(actual time=51713.322..51723.499 rows=12 loops=1)
          ->  Sort  (cost=2577764.28..2577946.88 rows=73040 width=30)
(actual time=51713.191..51716.600 rows=6445 loops=1)
                Sort Key: link.link_type, link.link_alias
                Sort Method:  quicksort  Memory: 696kB
                ->  Hash Left Join  (cost=1140942.18..2571863.96
rows=73040 width=30) (actual time=45276.194..51702.053 rows=6445 loops=1)
                      Hash Cond: (link.link_id = click.link_id)
                      ->  Bitmap Heap Scan on link
(cost=253.20..34058.86 rows=13477 width=26) (actual time=0.044..0.168
rows=126 loops=1)
                            Recheck Cond: (task_id = 1556)
                            Filter: (((deletable IS NULL) OR (NOT
deletable)) AND ((link_type = 8) OR (link_type = 9)))
                            ->  Bitmap Index Scan on idx_link_1
(cost=0.00..249.83 rows=13482 width=0) (actual time=0.030..0.030
rows=128 loops=1)
                                  Index Cond: (task_id = 1556)
                      ->  Hash  (cost=743072.88..743072.88 rows=22874088
width=12) (actual time=45274.316..45274.316 rows=22874089 loops=1)
                            ->  Seq Scan on click (cost=0.00..743072.88
rows=22874088 width=12) (actual time=0.024..17333.860 rows=22874089 loops=1)
  Total runtime: 51728.643 ms
(15 rows)



We can't drop the index, because all other queries on the click table
are 10-100 times faster if index is enabled.

We have worked around with following SQL to emulate the LEFT JOIN, which
returns the same result.

explain analyze SELECT
link.link_alias,link.link_type,COUNT(click.click_id),COUNT(distinct
click.member_id) FROM link JOIN click ON link.link_id=click.link_id
WHERE (link.link_type=8 OR link.link_type=9) AND link.task_id=1556 AND
(link.deletable IS NULL OR link.deletable=false)GROUP BY
link.link_type,link.link_alias
UNION SELECT link_alias,link_type,0,0 from link where (link_type=8 OR
link_type=9) AND task_id=1556 AND (deletable IS NULL OR deletable=false)
and link_alias not in ( SELECT link.link_alias FROM link JOIN click ON
link.link_id=click.link_id WHERE (link.link_type=8 OR link.link_type=9)
AND link.task_id=1556 AND (link.deletable IS NULL OR
link.deletable=false)) GROUP BY link_type,link_alias LIMIT 1000;

                   QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2011715.37..2011715.40 rows=2 width=30) (actual
time=56449.978..56450.016 rows=12 loops=1)
    ->  Unique  (cost=2011715.37..2011715.40 rows=2 width=30) (actual
time=56449.974..56449.995 rows=12 loops=1)
          ->  Sort  (cost=2011715.37..2011715.38 rows=2 width=30)
(actual time=56449.972..56449.978 rows=12 loops=1)
                Sort Key: link.link_alias, link.link_type,
(count(click.click_id)), (count(DISTINCT click.member_id))
                Sort Method:  quicksort  Memory: 25kB
                ->  Append  (cost=1007886.06..2011715.36 rows=2
width=30) (actual time=28207.665..56449.932 rows=12 loops=1)
                      ->  GroupAggregate  (cost=1007886.06..1008799.08
rows=1 width=30) (actual time=28207.664..28217.739 rows=11 loops=1)
                            ->  Sort  (cost=1007886.06..1008068.66
rows=73040 width=30) (actual time=28207.562..28210.936 rows=6369 loops=1)
                                  Sort Key: link.link_type, link.link_alias
                                  Sort Method:  quicksort  Memory: 690kB
                                  ->  Hash Join
(cost=848.97..1001985.74 rows=73040 width=30) (actual
time=11933.222..28196.805 rows=6369 loops=1)
                                        Hash Cond: (click.link_id =
link.link_id)
                                        ->  Seq Scan on click
(cost=0.00..743072.88 rows=22874088 width=12) (actual
time=0.030..14572.729 rows=22874089 loops=1)
                                        ->  Hash  (cost=680.51..680.51
rows=13477 width=26) (actual time=0.248..0.248 rows=126 loops=1)
                                              ->  Index Scan using
idx_link_1 on link  (cost=0.00..680.51 rows=13477 width=26) (actual
time=0.025..0.143 rows=126 loops=1)
                                                    Index Cond: (task_id
= 1556)
                                                    Filter: (((deletable
IS NULL) OR (NOT deletable)) AND ((link_type = 8) OR (link_type = 9)))
                      ->  Subquery Scan "*SELECT* 2"
(cost=1002916.26..1002916.28 rows=1 width=22) (actual
time=28232.176..28232.178 rows=1 loops=1)
                            ->  HashAggregate
(cost=1002916.26..1002916.27 rows=1 width=22) (actual
time=28232.161..28232.162 rows=1 loops=1)
                                  ->  Index Scan using idx_link_1 on
link  (cost=1002168.34..1002882.56 rows=6739 width=22) (actual
time=28232.077..28232.147 rows=1 loops=1)
                                        Index Cond: (task_id = 1556)
                                        Filter: (((deletable IS NULL) OR
(NOT deletable)) AND (NOT (hashed subplan)) AND ((link_type = 8) OR
(link_type = 9)))
                                        SubPlan
                                          ->  Hash Join
(cost=848.97..1001985.74 rows=73040 width=18) (actual
time=11931.673..28226.561 rows=6369 loops=1)
                                                Hash Cond:
(click.link_id = link.link_id)
                                                ->  Seq Scan on click
(cost=0.00..743072.88 rows=22874088 width=4) (actual
time=0.022..14581.208 rows=22874089 loops=1)
                                                ->  Hash
(cost=680.51..680.51 rows=13477 width=22) (actual time=0.240..0.240
rows=126 loops=1)
                                                      ->  Index Scan
using idx_link_1 on link  (cost=0.00..680.51 rows=13477 width=22)
(actual time=0.015..0.131 rows=126 loops=1)
                                                            Index Cond:
(task_id = 1556)
                                                            Filter:
(((deletable IS NULL) OR (NOT deletable)) AND ((link_type = 8) OR
(link_type = 9)))
  Total runtime: 56450.254 ms
(31 rows)


Ciao,
Michael





Re: Optimizer use of index slows down query by factor

От
Tom Lane
Дата:
Michael Ruf <mrf@inxmail.de> writes:
> we experience some strange performance problems, we've already found a
> workaround for us, but are curious if it's a known problem of the optimizer.

I think you need to see about getting this rowcount estimate to be more
accurate:

>                             ->  Index Scan using idx_link_1 on link
> (cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043
> rows=126 loops=1)
>                                   Index Cond: (task_id = 1556)
>                                   Filter: (((deletable IS NULL) OR (NOT
> deletable)) AND ((link_type = 8) OR (link_type = 9)))

If it realized there'd be only 126 rows out of that scan, it'd probably
have gone for a nestloop join against the big table, which I think would
be noticeably faster than either of the plans you show here.

You already did crank up default_statistics_target, so I'm not sure if
raising it further would help any.  What I'd suggest is trying to avoid
using non-independent AND/OR conditions.  For instance recasting the
first OR as just "deletable is not true" would probably result in a
better estimate.  The size of the error seems to be more than that would
account for though, so I suspect that the deletable and link_type
conditions are interdependent.  Is it practical to recast your data
representation to avoid that?

            regards, tom lane

Re: Optimizer use of index slows down query by factor

От
Michael Ruf
Дата:
Hi,

Tom Lane wrote:
 >
 > I think you need to see about getting this rowcount estimate to be more
 > accurate:
 >
 >>                             ->  Index Scan using idx_link_1 on link
 >> (cost=0.00..680.51 rows=13477 width=26) (actual time=5.707..12.043
 >> rows=126 loops=1)
 >>                                   Index Cond: (task_id = 1556)
 >>                                   Filter: (((deletable IS NULL) OR (NOT
 >> deletable)) AND ((link_type = 8) OR (link_type = 9)))
 >
 > If it realized there'd be only 126 rows out of that scan, it'd probably
 > have gone for a nestloop join against the big table, which I think would
 > be noticeably faster than either of the plans you show here.
 >
 > You already did crank up default_statistics_target, so I'm not sure if
 > raising it further would help any.

After i've increased the statistic target for the specific column on the
link table "alter table link alter task_id set statistics 200;", the sql
runs fine ( < 1 second ):

Limit  (cost=448478.40..448492.17 rows=1 width=30) (actual
time=850.698..860.838 rows=12 loops=1)
    ->  GroupAggregate  (cost=448478.40..448492.17 rows=1 width=30)
(actual time=850.695..860.824 rows=12 loops=1)
          ->  Sort  (cost=448478.40..448481.15 rows=1100 width=30)
(actual time=850.569..853.985 rows=6445 loops=1)
                Sort Key: link.link_type, link.link_alias
                Sort Method:  quicksort  Memory: 696kB
                ->  Nested Loop Left Join  (cost=0.00..448422.84
rows=1100 width=30) (actual time=819.519..838.422 rows=6445 loops=1)
                      ->  Seq Scan on link  (cost=0.00..142722.52
rows=203 width=26) (actual time=819.486..820.016 rows=126 loops=1)
                            Filter: (((deletable IS NULL) OR (NOT
deletable)) AND (task_id = 1556) AND ((link_type = 8) OR (link_type = 9)))
                      ->  Index Scan using idx_click_1 on click
(cost=0.00..1370.01 rows=10872 width=12) (actual time=0.003..0.088
rows=51 loops=126)
                            Index Cond: (link.link_id = click.link_id)
  Total runtime: 860.929 ms


 > What I'd suggest is trying to avoid
 > using non-independent AND/OR conditions.  For instance recasting the
 > first OR as just "deletable is not true" would probably result in a
 > better estimate.  The size of the error seems to be more than that would
 > account for though, so I suspect that the deletable and link_type
 > conditions are interdependent.  Is it practical to recast your data
 > representation to avoid that?
 >

I've tried that, but with no positive/negative effects.

Thanks for your help.

Michael