problem with wrong query planning and ineffective statistics

From: Paolo Negri
Subject: problem with wrong query planning and ineffective statistics
Msg-id: b242f69c0703061525p14db2339rba7942792fd7ac78@mail.gmail.com
List: pgsql-performance
I think I have an issue with the planning of this query, which sometimes
runs really slowly.

This is the output of EXPLAIN ANALYZE in the SLOW case:

Sort  (cost=4105.54..4105.54 rows=2 width=28) (actual time=11404.225..11404.401 rows=265 loops=1)
   Sort Key: table1.fdeventfromdate, table2.fdsurname, table2.fdtitle
   ->  Nested Loop Left Join  (cost=192.34..4105.53 rows=2 width=28) (actual time=0.770..11402.185 rows=265 loops=1)
         Join Filter: ("inner".table2_id = "outer".id)
         ->  Nested Loop Left Join  (cost=192.34..878.40 rows=1 width=28) (actual time=0.750..6.878 rows=96 loops=1)
               Join Filter: ("inner".id = "outer".table1_id)
               ->  Nested Loop Left Join  (cost=192.34..872.82 rows=1 width=24) (actual time=0.551..5.453 rows=96 loops=1)
                     ->  Nested Loop Left Join  (cost=192.34..866.86 rows=1 width=28) (actual time=0.534..4.370 rows=96 loops=1)
                           ->  Nested Loop  (cost=192.34..862.46 rows=1 width=28) (actual time=0.515..3.100 rows=96 loops=1)
                                 ->  Bitmap Heap Scan on table2  (cost=192.34..509.00 rows=96 width=24) (actual time=0.488..1.140 rows=96 loops=1)
                                       Recheck Cond: ((id = ...
[CUT]

This query takes 11000 milliseconds.

This is the output of EXPLAIN ANALYZE in the FAST case:

Sort  (cost=8946.80..8946.82 rows=10 width=28) (actual time=286.969..287.208 rows=363 loops=1)
   Sort Key: table1.fdeventfromdate, table2.fdsurname, table2.fdtitle
   ->  Merge Left Join  (cost=8617.46..8946.63 rows=10 width=28) (actual time=232.330..284.750 rows=363 loops=1)
         Merge Cond: ("outer".id = "inner".table2_id)
         ->  Sort  (cost=946.68..946.69 rows=4 width=28) (actual time=4.505..4.568 rows=101 loops=1)
               Sort Key: table2.id
               ->  Hash Left Join  (cost=208.33..946.64 rows=4 width=28) (actual time=0.786..4.279 rows=101 loops=1)
                     Hash Cond: ("outer".table1_id = "inner".id)
                     ->  Nested Loop Left Join  (cost=202.35..940.64 rows=4 width=24) (actual time=0.719..4.011 rows=101 loops=1)
                           ->  Nested Loop Left Join  (cost=202.35..916.76 rows=4 width=28) (actual time=0.701..3.165 rows=101 loops=1)
                                  ->  Nested Loop  (cost=202.35..899.50 rows=4 width=28) (actual time=0.676..2.284 rows=101 loops=1)
                                        ->  Bitmap Heap Scan on table2  (cost=202.35..534.18 rows=101 width=24) (actual time=0.644..1.028 rows=101 loops=1)
                                              Recheck Cond: ((id = ...
[CUT]

This time the query takes 290 milliseconds.

As you can see, the row-count estimates are completely off in both
cases (rows=2 estimated vs. 265 actual in the slow plan, rows=10 vs.
363 in the fast one), but the estimate of 10 rows in the second case
is enough for the planner to choose a more sensible join strategy.
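
One quick way to confirm that it is the misestimate pushing the planner
into the nested-loop plan (rather than, say, costing parameters) is to
disable nested loops for the session and re-run the query. A minimal
sketch; enable_nestloop is a standard planner setting, and the query
itself is omitted here as above:

SET enable_nestloop = off;
EXPLAIN ANALYZE ...;  -- re-run the same query here
RESET enable_nestloop;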
I tried increasing default_statistics_target from 10 to 100 and then
re-ran ANALYZE on the test machine's database, but this hasn't
improved the situation at all.
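
For reference, this is roughly the shape of what I ran, together with
the per-column variant that raises the target only on the join keys.
Treat it as a sketch: the tables are anonymized as table1/table2 here,
and the cut plans don't show which tables own the table1_id and
table2_id columns:

SET default_statistics_target = 100;  -- or edit postgresql.conf for the whole cluster
ANALYZE;                              -- rebuild statistics with the larger sample

ALTER TABLE table2 ALTER COLUMN id SET STATISTICS 1000;  -- per-column override
ANALYZE table2;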
The problem is that the data is distributed quite unevenly across the
tables joined in this query, and the avg_width reported for the
relations' key columns is really not a representative value.
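
For anyone who wants to look at the same numbers: the per-column
statistics the planner works from are visible in the pg_stats view.
A sketch, assuming the join keys are the id/table1_id/table2_id
columns that appear in the join filters above:

SELECT tablename, attname, null_frac, avg_width, n_distinct,
       most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename IN ('table1', 'table2')
  AND attname IN ('id', 'table1_id', 'table2_id');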
Is there something I can do to improve this situation?

Thanks

Paolo
