Re: Bitmap scan is undercosted?

Поиск
Список
Период
Сортировка
От Vitaliy Garnashevich
Тема Re: Bitmap scan is undercosted?
Дата
Msg-id f8edc163-7ebe-cdb8-d068-685355045f24@gmail.com
обсуждение исходный текст
Ответ на Re: Bitmap scan is undercosted?  (Jeff Janes <jeff.janes@gmail.com>)
Ответы Re: Bitmap scan is undercosted?  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
On 02/12/2017 07:51, Jeff Janes wrote: > On Fri, Dec 1, 2017 at 3:54 PM, Vitaliy Garnashevich > > wrote: > > On 02/12/2017 01:11, Justin Pryzby wrote: > > I tried to reproduce this issue and couldn't, under PG95 and 10.1: > > On Fri, Dec 01, 2017 at 12:34:27PM -0600, Justin Pryzby wrote: > > On Fri, Dec 01, 2017 at 07:40:08PM +0200, Vitaliy > Garnashevich wrote: > > We recently had an issue in production, where a bitmap > scan was chosen > instead of an index scan. Despite being 30x slower, > the bitmap scan had > about the same cost as the index scan. > drop table if exists aaa; > create table aaa as select (id%100)::int num, > (id%10=1)::bool flag from > generate_series(1, 10000000) id; > create index i1 on aaa  (num); > create index i2 on aaa  (flag); > analyze aaa; > > What is: > effective_io_concurrency > max_parallel_workers_per_gather (I gather you don't have this) > > effective_io_concurrency = 0 > max_parallel_workers_per_gather = 0 > > Did you notice random_page_cost = 1.5 ? > > > For the aaa.num = 39 case, the faster index scan actually does hit 15 > times more buffers than the bitmap scan does.  While 1.5 is lot lower > than 4.0, it is still much higher than the true cost of reading a page > from the buffer cache.   This why the index scan is getting punished.  > You could lower random_page_cost and seq_page_cost to 0, to remove > those considerations.  (I'm not saying you should do this on your > production system, but rather you should do it as a way to investigate > the issue.  But it might make sense on production as well) seq_page_cost = 1.0 random_page_cost = 1.0* *explain analyze select * from aaa where num = 2 and flag = true; Bitmap Heap Scan on aaa  (cost=11536.74..20856.96 rows=10257 width=5) (actual time=108.338..108.338 rows=0 loops=1)   ->  BitmapAnd  (cost=11536.74..11536.74 rows=10257 width=0) (actual time=108.226..108.226 rows=0 loops=1)         ->  Bitmap Index Scan on i1  (cost=0.00..1025.43 rows=100000 width=0) (actual time=18.563..18.563 rows=100000 loops=1)         ->  Bitmap Index Scan on i2  (cost=0.00..10505.93 rows=1025666 width=0) (actual time=78.493..78.493 rows=1000000 loops=1) Index Scan using i1 on aaa  (cost=0.44..44663.58 rows=10257 width=5) (actual time=51.264..51.264 rows=0 loops=1) Here I've used the filter num = 2, which produces rows=0 at BitmapAnd, and thus avoids a lot of work at "Bitmap Heap Scan" node, while still leaving about the same proportion in bitmap vs index - the bitmap is twice slower but twice less costly. It does not matter much which value to use for the filter, if it's other than num = 1. seq_page_cost = 0.0 random_page_cost = 0.0 explain analyze select * from aaa where num = 2 and flag = true; Bitmap Heap Scan on aaa  (cost=753.00..2003.00 rows=10257 width=5) (actual time=82.212..82.212 rows=0 loops=1)   ->  Bitmap Index Scan on i1  (cost=0.00..750.43 rows=100000 width=0) (actual time=17.401..17.401 rows=100000 loops=1) Index Scan using i1 on aaa  (cost=0.44..1750.43 rows=10257 width=5) (actual time=49.766..49.766 rows=0 loops=1) The bitmap plan was reduced to use only one bitmap scan, and finally it costs more than the index plan. But I doubt that the settings seq_page_cost = random_page_cost = 0.0 should actually be used. Probably it should be instead something like 1.0/1.0 or 1.0/1.1, but other costs increased, to have more weight. # x4 tuple/operator costs - bitmap scan still a bit cheaper set seq_page_cost = 1.0; set random_page_cost = 1.0; set cpu_tuple_cost = 0.04; set cpu_index_tuple_cost = 0.02; set cpu_operator_cost = 0.01; Bitmap Heap Scan on aaa  (cost=36882.97..46587.82 rows=10257 width=5) (actual time=106.045..106.045 rows=0 loops=1)   ->  BitmapAnd  (cost=36882.97..36882.97 rows=10257 width=0) (actual time=105.966..105.966 rows=0 loops=1)         ->  Bitmap Index Scan on i1  (cost=0.00..3276.74 rows=100000 width=0) (actual time=15.977..15.977 rows=100000 loops=1)         ->  Bitmap Index Scan on i2  (cost=0.00..33584.72 rows=1025666 width=0) (actual time=79.208..79.208 rows=1000000 loops=1) Index Scan using i1 on aaa  (cost=1.74..49914.89 rows=10257 width=5) (actual time=50.144..50.144 rows=0 loops=1) # x5 tuple/operator costs - switched to single bitmap index scan, but now it costs more than the index scan set seq_page_cost = 1.0; set random_page_cost = 1.0; set cpu_tuple_cost = 0.05; set cpu_index_tuple_cost = 0.025; set cpu_operator_cost = 0.0125; Bitmap Heap Scan on aaa  (cost=4040.00..54538.00 rows=10257 width=5) (actual time=82.338..82.338 rows=0 loops=1)   ->  Bitmap Index Scan on i1  (cost=0.00..4027.18 rows=100000 width=0) (actual time=19.541..19.541 rows=100000 loops=1) Index Scan using i1 on aaa  (cost=2.17..51665.32 rows=10257 width=5) (actual time=49.545..49.545 rows=0 loops=1) I've also tried seq_page_cost = 1.0, random_page_cost = 1.1, but that would require more than x10 increase in tuple/operator costs, to make bitmap more costly than index. > > > For this test I'm using SSD and Windows (if that matters). On > production we also use SSD, hence lower random_page_cost. But with > the default random_page_cost=4.0, the difference in cost between > the index scan plan and the bitmap scan plan is even bigger. > > > Since it is all shared buffers hits, it doesn't matter if you have SSD > for this particular test case. Agree. I've just tried to justify the value of random_page_cost, which is lower than like 2.0. > Cheers, > > Jeff

В списке pgsql-performance по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: Bitmap scan is undercosted?
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: Bitmap scan is undercosted?