On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
> Jie Zhang wrote:
> > The "bitmap data segment" sounds good in terms of space. The problem is that
> > one bitmap is likely to occupy more pages than before, which may hurt the
> > query performance.
>
> We could have segments of say 1/5 of page. When a bitmap grows larger
> than that, the bitmap would be moved to a page of its own. That way we
> wouldn't get unnecessary fragmentation with large bitmaps, but small
> bitmaps would be stored efficiently.
Yes.
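
To make sure we mean the same rule, here is a minimal sketch in Python. It assumes the default 8192-byte block size and ignores page and item header overhead; the names SEGMENT_SIZE and placement() are made up for illustration and are not from the patch.

```python
PAGE_SIZE = 8192                # PostgreSQL's default block size
SEGMENT_SIZE = PAGE_SIZE // 5   # "say 1/5 of page"; headers ignored (assumption)


def placement(bitmap_bytes: int) -> str:
    """Say where a bitmap of the given compressed size would be stored.

    Bitmaps that fit in one segment share a page with other small bitmaps;
    anything larger is moved to a page of its own.
    """
    if bitmap_bytes <= SEGMENT_SIZE:
        return "segment"
    return "own_page"
```

A bitmap starts out in a segment and is moved once, when it first outgrows it, so large bitmaps are never split across several small segments.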
>
> > I have been thinking along the lines of increasing the
> > number of last bitmap words stored in each LOV item, but not to occupy one
> > page. This may prevent some cases Gavin indicated here, but not all.
>
> That sounds like more special cases and complexity. I like the segment
> idea more.
>
> But actually I'm not convinced we need to worry about efficient storage
> of small bitmaps at all. The typical use case for bitmap indexes is
> large tables with small number of distinct values, and the problem
> doesn't really arise in that scenario. Let's keep it simple for now, we
> can enhance it in later releases.
The scenario I'm concerned about is where a sales database, say, has
100,000 products. However, only 500 or 1000 products are popular. They
account for, say, >99% of the sales. The other 99,000 or so products
consume a little bit over 8K each for very little benefit :-(.
This is pretty contrived but it seems real world enough...
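
To put rough numbers on it, here is a back-of-the-envelope sketch in Python. It assumes 8192-byte pages, counts only the one bitmap page per rare product (not the LOV item), and assumes five small bitmaps could share a page under the segment idea. The function names are made up for illustration.

```python
PAGE_SIZE = 8192          # PostgreSQL's default block size
SEGMENTS_PER_PAGE = 5     # assumption: segments of 1/5 of a page


def rare_bitmap_pages(total: int, popular: int, shared: bool) -> int:
    """Pages used by the bitmaps of the unpopular products.

    shared=False: every bitmap gets a page of its own.
    shared=True:  small bitmaps are packed SEGMENTS_PER_PAGE to a page.
    """
    rare = total - popular
    if shared:
        return -(-rare // SEGMENTS_PER_PAGE)   # ceiling division
    return rare


def to_mib(pages: int) -> float:
    """Convert a page count to MiB."""
    return pages * PAGE_SIZE / (1024 * 1024)


own_page = rare_bitmap_pages(100_000, 1_000, shared=False)
packed = rare_bitmap_pages(100_000, 1_000, shared=True)
print(f"one page per bitmap: {own_page} pages, {to_mib(own_page):.1f} MiB")
print(f"five per page:       {packed} pages, {to_mib(packed):.1f} MiB")
```

With 1000 popular products that is 99,000 pages (about 773 MiB) for the rare ones, against 19,800 pages (about 155 MiB) if five of them share a page.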
Gavin