On Tue, Dec 19, 2017 at 11:55 AM, Haisheng Yuan <hyuan@pivotal.io> wrote:
Hi hackers,
This is Haisheng Yuan from Greenplum Database.
We had a query in production where the planner favors a seqscan over a bitmapscan. Executing the seqscan is 5x slower than the bitmapscan, yet the estimated cost of the bitmapscan is 2x that of the seqscan. The statistics were up to date and quite accurate.
The bitmap table scan uses a formula that interpolates between random_page_cost and seq_page_cost to determine the cost per page. In Greenplum Database, the default value of random_page_cost is 100 and the default value of seq_page_cost is 1. With the original cost formula, random_page_cost dominates the final cost, even though the formula is declared to be non-linear.
My first inclination would be to take this as evidence that 100 is a poor default for random_page_cost, rather than as evidence that the bitmap heap scan IO cost model is wrong.
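To make the effect of the setting concrete, here is a rough sketch of the per-page interpolation as it appears in stock PostgreSQL's cost_bitmap_heap_scan (Greenplum's version may differ). It covers only the heap I/O term and ignores CPU and index costs. The function names, the table size, and the fraction fetched are made up for illustration.

```python
import math


def bitmap_cost_per_page(pages_fetched, total_pages, random_page_cost, seq_page_cost):
    """Per-page heap I/O cost for a bitmap scan.

    Interpolates from random_page_cost (few pages fetched) towards
    seq_page_cost (all pages fetched) using sqrt(pages_fetched / total_pages).
    """
    if pages_fetched >= 2.0:
        frac = math.sqrt(pages_fetched / total_pages)
        return random_page_cost - (random_page_cost - seq_page_cost) * frac
    return random_page_cost


def compare_io_cost(total_pages, fraction, random_page_cost, seq_page_cost=1.0):
    """Return (bitmap heap I/O cost, seqscan I/O cost) for an assumed table.

    total_pages and fraction are illustrative inputs, not measured values.
    """
    pages_fetched = total_pages * fraction
    per_page = bitmap_cost_per_page(
        pages_fetched, total_pages, random_page_cost, seq_page_cost
    )
    return pages_fetched * per_page, total_pages * seq_page_cost


if __name__ == "__main__":
    for rpc in (100.0, 4.0):
        bitmap_io, seq_io = compare_io_cost(10000, 0.25, rpc)
        print(f"random_page_cost={rpc}: bitmap={bitmap_io}, seqscan={seq_io}")
```

For a hypothetical 10000-page table with a quarter of the pages fetched, this gives a bitmap I/O cost of 126250 against 10000 for the seqscan when random_page_cost is 100, but 6250 against 10000 when random_page_cost is 4. The shape of the formula is the same in both cases; only the setting changes which plan wins.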
Could you try the low-level benchmark I posted elsewhere in the thread on your hardware, reading 1/3 or 1/2 of the pages in order? Maybe your kernel/system does a better job of predicting read-ahead.
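For anyone who wants to try this kind of measurement without digging up the earlier message, the following is a minimal sketch of the idea. It is not the benchmark referred to above, only an illustration. It assumes 8 kB pages (PostgreSQL's default block size), and the function name and the file path are hypothetical. A stride of 2 or 3 reads roughly 1/2 or 1/3 of the pages, in file order.

```python
import os
import time

PAGE_SIZE = 8192  # assumed page size: PostgreSQL's default block size


def read_every_nth_page(path, stride, page_size=PAGE_SIZE):
    """Read pages 0, stride, 2*stride, ... of a file in ascending order.

    Returns (pages_read, elapsed_seconds). Pages in between are skipped
    with seek(), which mimics a bitmap heap scan visiting a subset of
    pages in physical order.
    """
    total_pages = os.path.getsize(path) // page_size
    pages_read = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache and
    # kernel read-ahead still apply.
    with open(path, "rb", buffering=0) as f:
        for page in range(0, total_pages, stride):
            f.seek(page * page_size)
            if len(f.read(page_size)) == page_size:
                pages_read += 1
    return pages_read, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical path: point this at a file much larger than RAM.
    path = "/path/to/large/file"
    for stride in (1, 2, 3):
        pages, secs = read_every_nth_page(path, stride)
        print(f"stride={stride}: {pages} pages in {secs:.3f}s")
```

The timings only mean something if the file is not already cached, so the OS cache needs to be dropped between runs, or the file has to be much larger than memory.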