Union+group by planner estimates way off?

Поиск
Список
Период
Сортировка
От Arthur Ward
Тема Union+group by planner estimates way off?
Дата
Msg-id 23123.68.62.129.152.1068748095.squirrel@award.gotdns.org
обсуждение исходный текст
Ответы Re: Union+group by planner estimates way off?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
On 7.4 RC2, I'm seeing a case where the query planner estimates are way
out of line after grouping the result of a union. I've tried adjusting the
statistics targets up to 200, and it made no difference in the planner's
estimates. The point of the full query this came from is that it also has
an aggregate function that produces a space-delimited list of commodity &
fak for each id. Does anyone have any suggestions on tweaks to apply or
ways to rewrite this? Is this one of those ugly corners where the query
planner doesn't have a clue how to estimate this (seeing the nice round
200 estimate makes me suspicious)?

EXPLAIN ANALYZE SELECT id FROM
(SELECT id, commodity FROM commodities WHERE commodity IS NOT NULL
   UNION
 SELECT id, fak FROM commodities WHERE fak IS NOT NULL
) all_commodities GROUP BY id;
                                                                           QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=15939.16..15939.16 rows=200 width=4) (actual
time=3537.281..3680.418 rows=83306 loops=1)
   ->  Subquery Scan all_commodities  (cost=14002.00..15697.02 rows=96858
width=4) (actual time=2268.052..3214.996 rows=95715 loops=1)
         ->  Unique  (cost=14002.00..14728.44 rows=96858 width=15) (actual
time=2268.043..2881.688 rows=95715 loops=1)
               ->  Sort  (cost=14002.00..14244.15 rows=96858 width=15)
(actual time=2268.037..2527.083 rows=100008 loops=1)
                     Sort Key: id, commodity
                     ->  Append  (cost=0.00..5034.42 rows=96858 width=15)
(actual time=7.402..1220.320 rows=100008 loops=1)
                           ->  Subquery Scan "*SELECT* 1"
(cost=0.00..2401.23 rows=36831 width=15)
(actual time=7.398..590.004 rows=39772 loops=1)
                                 ->  Seq Scan on commodities
(cost=0.00..2032.92 rows=36831 width=15)
(actual time=7.388..468.415 rows=39772
loops=1)
                                       Filter: (commodity IS NOT NULL)
                           ->  Subquery Scan "*SELECT* 2"
(cost=0.00..2633.19 rows=60027 width=14)
(actual time=0.016..408.160 rows=60236 loops=1)
                                 ->  Seq Scan on commodities
(cost=0.00..2032.92 rows=60027 width=14)
(actual time=0.010..221.635 rows=60236
loops=1)
                                       Filter: (fak IS NOT NULL)
 Total runtime: 3783.009 ms
(13 rows)


В списке pgsql-performance по дате отправления:

Предыдущее
От: Andrew Sullivan
Дата:
Сообщение: strange estimate for number of rows
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Union+group by planner estimates way off?