Why does query planner choose slower BitmapAnd ?

Поиск
Список
Период
Сортировка
От Seamus Abshere
Тема Why does query planner choose slower BitmapAnd ?
Дата
Msg-id 1456154321.976561.528310154.6A623C0E@webmail.messagingengine.com
обсуждение исходный текст
Ответы Re: Why does query planner choose slower BitmapAnd ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
hi,

I don't understand why the query planner is choosing a BitmapAnd when an
Index Scan followed by a filter is obviously better.

(Note that "new_york_houses" is a view of table "houses" with one
condition on city - and there is an index idx_houses_city. That is the
Index Scan that I think it should use exclusively.)

Here's a fast query that uses the Index Scan followed by a filter:

> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE roof_area >= 0 AND roof_area < 278.7091;
>                                                                                QUERY PLAN
> ---
>  Aggregate  (cost=167298.10..167298.11 rows=1 width=16) (actual time=141.137..141.137 rows=1 loops=1)
>    ->  Index Scan using idx_houses_city on households  (cost=0.57..167178.87 rows=47694 width=16) (actual
time=0.045..105.953rows=53971 loops=1) 
>          Index Cond: (city = 'New York'::text)
>          Filter: ((roof_area >= 0) AND ((roof_area)::numeric < 278.7091))
>          Rows Removed by Filter: 101719
>  Planning time: 0.688 ms
>  Execution time: 141.250 ms
> (7 rows)

When I add another condition, "phoneable", however, it chooses an
obviously wrong plan:

> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE roof_area >= 0 AND roof_area < 278.7091 AND
phoneable= true; 
>                                                                                      QUERY PLAN
> ---
>  Aggregate  (cost=128163.05..128163.06 rows=1 width=16) (actual time=4564.677..4564.677 rows=1 loops=1)
>    ->  Bitmap Heap Scan on households  (cost=105894.80..128147.78 rows=6106 width=16) (actual time=4456.690..4561.416
rows=5183loops=1) 
>          Recheck Cond: (city = 'New York'::text)
>          Filter: (phoneable AND (roof_area >= 0) AND ((roof_area)::numeric < 278.7091))
>          Rows Removed by Filter: 40103
>          Heap Blocks: exact=14563
>          ->  BitmapAnd  (cost=105894.80..105894.80 rows=21002 width=0) (actual time=4453.510..4453.510 rows=0
loops=1)
>                ->  Bitmap Index Scan on idx_houses_city  (cost=0.00..1666.90 rows=164044 width=0) (actual
time=16.505..16.505rows=155690 loops=1) 
>                      Index Cond: (city = 'New York'::text)
>                ->  Bitmap Index Scan on idx_houses_phoneable  (cost=0.00..104224.60 rows=10271471 width=0) (actual
time=4384.461..4384.461rows=10647041 loops=1) 
>                      Index Cond: (phoneable = true)
>  Planning time: 0.709 ms
>  Execution time: 4565.067 ms
> (13 rows)

On Postgres 9.4.4 with 244gb memory and SSDs

maintenance_work_mem 1000000
work_mem 500000
random_page_cost 1
seq_page_cost 2

The "houses" table has been analyzed recently and has statistics set to
the max.

Thanks,
Seamus


--
Seamus Abshere, SCEA
https://github.com/seamusabshere
https://www.linkedin.com/in/seamusabshere


В списке pgsql-general по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Why is my database so big?
Следующее
От: Victor Yegorov
Дата:
Сообщение: bpchar, text and indexes