Thread: Bitmap index thoughts


Bitmap index thoughts

From: Heikki Linnakangas
Date:
I've been skimming through the bitmap index patch...

A scan needs to access at least five pages:

1. B-tree index (root+others, depending on depth)
2. The auxiliary heap page
3. bitmap index meta page
4. LOV page
5. bitmap page

That seems like a lot of indirection. A high startup cost is probably ok 
for typical bitmap index use cases and most of the needed pages should 
stay in memory, but could we simplify this? Why do we need the auxiliary 
heap, couldn't we just store the blk+offset of the LOV item directly in 
the b-tree index item?
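
(To be concrete about "blk+offset": the b-tree index tuple would just carry
a TID-style pointer straight to the LOV item, something like the sketch
below. The struct is made up for illustration; it simply mirrors the shape
of an ItemPointerData.)

#include <stdint.h>

/* Hypothetical payload for a b-tree index tuple that points directly at
 * the LOV item instead of at a row in the auxiliary heap. */
typedef struct LovItemPointer
{
    uint32_t lov_block;     /* block number of the LOV page */
    uint16_t lov_offset;    /* item offset of the LOV item on that page */
} LovItemPointer;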

And instead of having separate LOV pages that store a number of LOV 
items, how about storing each LOV item on a page of its own, and using 
the rest of the page to store the last chunk of the bitmap? That would 
eliminate one page access, but more importantly, maybe we could then get 
rid of all the bm_last_* attributes in BMLOVItemData that complicate the 
patch quite a bit, while preserving the performance.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

From: Gavin Sherry
Date:
Hey Heikki,

On Tue, 26 Dec 2006, Heikki Linnakangas wrote:

> I've been skimming through the bitmap index patch...
>
> A scan needs to access at least five pages:
>
> 1. B-tree index (root+others, depending on depth)
> 2. The auxiliary heap page
> 3. bitmap index meta page
> 4. LOV page
> 5. bitmap page
>
> That seems like a lot of indirection. A high startup cost is probably ok

You're right, this is excessive and it was something I'd hoped to have
ripped out, but...

> for typical bitmap index use cases and most of the needed pages should
> stay in memory, but could we simplify this? Why do we need the auxiliary
> heap, couldn't we just store the blk+offset of the LOV item directly in
> the b-tree index item?

The problem is, the b-tree code is very much tied to the heap. I don't
want to modify the b-tree code to make bitmap indexes work (better).
What's really tempting is to just manage our own balanced tree within the
bitmap index file(s) itself. It would start from the metapage and simply
spill to other 'special' index pages when necessary. The problem is, we do
not have b-tree code generic enough that it would allow us to do this
trivially -- consider concurrency and WAL in particular, which we
currently get for free. I guess this is why I've been ignoring this issue
:-).

> And instead of having separate LOV pages that store a number of LOV
> items, how about storing each LOV item on a page of it's own, and using
> the rest of the page to store the last chunk of the bitmap. That would
> eliminate one page access, but more importantly, maybe we could then get
> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
> patch quite a bit, while preserving the performance.

That's an interesting approach. We would still need a concept of
last_word, at the very least, and probably last_comp_word for convenience.
Your idea doesn't require any extra space, either, which is good.
Something I've been working through is the idea of a 'bitmap data
segment'. Currently, we store the HRL-compressed bitmap data to the extent
of the page. That is, sizeof(BMBitmap) is as close to BLCKSZ as we can
make it. The problem is that we may have some bitmaps where a few values
occur only a small number of times and are well clustered at the beginning
of the heap. In that circumstance, we use a whole page to store a small
number of words and the free space cannot be used by any other vector.
Now, if a segment were some fraction of BLCKSZ, we would use less space
for those vectors with few matching tuples in the heap. Just an idea...
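
To make the space issue concrete, here is a rough sketch of the kind of
run-length words I mean -- purely illustrative, with made-up names and flag
layout rather than the patch's actual BMBitmap:

#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative word-aligned run-length words (hypothetical layout, not the
 * patch's). Each 32-bit word is either a literal chunk of the bitmap or a
 * "fill" word compressing a long run of identical chunks.
 */
typedef uint32_t bm_word;

#define BM_CHUNK_BITS  31                   /* payload bits in a literal word */
#define BM_FILL_FLAG   ((bm_word) 1 << 31)  /* set => this is a fill word */
#define BM_FILL_ONES   ((bm_word) 1 << 30)  /* fill of all-one chunks, else zeros */

/* Literal word: the low 31 bits are an uncompressed chunk of the bitmap. */
static inline bm_word
make_literal(bm_word bits31)
{
    return bits31 & ~BM_FILL_FLAG;
}

/* Fill word: the low 30 bits count how many identical 31-bit chunks it covers. */
static inline bm_word
make_fill(bool ones, uint32_t nchunks)
{
    return BM_FILL_FLAG | (ones ? BM_FILL_ONES : 0) | (nchunks & 0x3fffffff);
}

A vector with only a handful of matches near the start of the heap compresses
to two or three such words, yet today it still owns an entire BLCKSZ-sized
page of bitmap storage -- that is the waste a sub-page segment would reclaim.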

Thanks,

Gavin

PS: Another version of the patch shall be forthcoming shortly. I've been
working on compressing the data in memory during CREATE INDEX instead of
just managing arrays of TIDs in memory as we did previously. The array of
TIDs works great for well clustered data but it stinks for poorly
clustered data as we approach maintenance_work_mem and have to swap a lot.


Re: Bitmap index thoughts

From: "Jie Zhang"
Date:
>> And instead of having separate LOV pages that store a number of LOV
>> items, how about storing each LOV item on a page of it's own, and using
>> the rest of the page to store the last chunk of the bitmap. That would
>> eliminate one page access, but more importantly, maybe we could then get
>> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
>> patch quite a bit, while preserving the performance.
> 
> That's an interesting approach. We would still need a concept of
> last_word, at the very least, and probably last_comp_word for convenience.
> Your idea doesn't require any extra space, either, which is good.
> Something I've been working through is the idea of a 'bitmap data
> segment'. Currently, we store the HRL compressed bitmap data to the extent
> of the page. That is, sizeof(BMBitmap) is as close to BLCKSZ as we can
> make it. The problem is that we may have some bitmaps where a few values
> occur only a small number of times and are well clustered at the beginning
> of the heap. In that circumstance, we use a whole page to store a small
> number of words and the free space cannot be used by any other vector.
> Now, say a segment was some fraction the size of BLCKSZ, we use less space
> for those vectors with few tuples in the heap. Just an idea...

The "bitmap data segment" sounds good in terms of space. The problem is that
one bitmap is likely to occupy more pages than before, which may hurt the
query performance. I have been thinking along the lines of increasing the
number of last bitmap words stored in each LOV item, but not to occupy one
page. This may prevent some cases Gavin indicated here, but not all.

Thanks,
Jie




Re: Bitmap index thoughts

From: Heikki Linnakangas
Date:
Gavin Sherry wrote:
> On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
>> for typical bitmap index use cases and most of the needed pages should
>> stay in memory, but could we simplify this? Why do we need the auxiliary
>> heap, couldn't we just store the blk+offset of the LOV item directly in
>> the b-tree index item?
> 
> The problem is, the b-tree code is very much tied to the heap. I don't
> want to modify the b-tree code to make bitmap indexes work (better).
> What's really tempting is to just manage our own balanced tree within the
> bitmap index file(s) itself. It would start from the metapage and simply
> spill to other 'special' index pages when necessary. The problem is, we do
> not have b-tree code generic enough that it would allow us to do this
> trivially -- consider concurrency and WAL in particular, which we
> currently get for free. I guess this is why I've been ignoring this issue
> :-).

Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg 
had the same problem :).

Modifying the nbtree code doesn't seem that difficult either. AFAICS, 
the only places where the heap is accessed from within the nbtree code are 
index building and uniqueness checks.

>> And instead of having separate LOV pages that store a number of LOV
>> items, how about storing each LOV item on a page of it's own, and using
>> the rest of the page to store the last chunk of the bitmap. That would
>> eliminate one page access, but more importantly, maybe we could then get
>> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
>> patch quite a bit, while preserving the performance.
> 
> That's an interesting approach. We would still need a concept of
> last_word, at the very least, and probably last_comp_word for convenience.

Why?

> PS: Another versio of the patch shall be forthcoming shortly. I've been
> working on compressing the data in memory during CREATE INDEX instead of
> just managing arrays of TIDs in memory as we did previously. The array of
> TIDs works great for well clustered data but it stinks for poorly
> clustered data as we approach maintenance_word_mem and have to swap a lot.

Ok, sounds good.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

From: Heikki Linnakangas
Date:
Jie Zhang wrote:
> The "bitmap data segment" sounds good in terms of space. The problem is that
> one bitmap is likely to occupy more pages than before, which may hurt the
> query performance. 

We could have segments of, say, 1/5 of a page. When a bitmap grows larger 
than that, the bitmap would be moved to a page of its own. That way we 
wouldn't get unnecessary fragmentation with large bitmaps, but small 
bitmaps would be stored efficiently.
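
In code terms the placement rule would be trivial -- something like the
following sketch (the fraction is just an example, not a concrete proposal):

#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ       8192               /* default PostgreSQL block size */
#define BM_SEG_SIZE  (BLCKSZ / 5)       /* shared sub-page segment, ~1.6 kB */

/* Illustrative rule: a vector stays in a shared segment while its
 * compressed words fit, then it is relocated to a page of its own. */
static bool
vector_needs_own_page(size_t compressed_bytes)
{
    return compressed_bytes > BM_SEG_SIZE;
}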

> I have been thinking along the lines of increasing the
> number of last bitmap words stored in each LOV item, but not to occupy one
> page. This may prevent some cases Gavin indicated here, but not all.

That sounds like more special cases and complexity. I like the segment 
idea more.

But actually I'm not convinced we need to worry about efficient storage 
of small bitmaps at all. The typical use case for bitmap indexes is 
large tables with a small number of distinct values, and the problem 
doesn't really arise in that scenario. Let's keep it simple for now; we 
can enhance it in later releases.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

From: Gavin Sherry
Date:
On Wed, 27 Dec 2006, Heikki Linnakangas wrote:

> Gavin Sherry wrote:
> > On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
> >> for typical bitmap index use cases and most of the needed pages should
> >> stay in memory, but could we simplify this? Why do we need the auxiliary
> >> heap, couldn't we just store the blk+offset of the LOV item directly in
> >> the b-tree index item?
> >
> > The problem is, the b-tree code is very much tied to the heap. I don't
> > want to modify the b-tree code to make bitmap indexes work (better).
> > What's really tempting is to just manage our own balanced tree within the
> > bitmap index file(s) itself. It would start from the metapage and simply
> > spill to other 'special' index pages when necessary. The problem is, we do
> > not have b-tree code generic enough that it would allow us to do this
> > trivially -- consider concurrency and WAL in particular, which we
> > currently get for free. I guess this is why I've been ignoring this issue
> > :-).
>
> Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg
> had the same problem :).
>
> Modifying the nbtree code doesn't seem that difficult either. AFAICS,
> the only places where the heap is from within nbtree code is in index
> building and uniqueness checks.

I'll take a look and think about it.

>
> >> And instead of having separate LOV pages that store a number of LOV
> >> items, how about storing each LOV item on a page of it's own, and using
> >> the rest of the page to store the last chunk of the bitmap. That would
> >> eliminate one page access, but more importantly, maybe we could then get
> >> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
> >> patch quite a bit, while preserving the performance.
> >
> > That's an interesting approach. We would still need a concept of
> > last_word, at the very least, and probably last_comp_word for convenience.
>
> Why?

The two last words of the bitmap vector, stored in the LOV item, provide a
few things: they buffer the addition of new matches in the vector and they
ease the calculation of the current position of the end of the vector.
Jie, do you want to add more?

I think we could do this differently by calculating everything based on
the other data stored in the lov item and data page (last_tid_location and
num_hrl_words_used, from memory). But, I'm not sure. Jie?

Thanks,

Gavin


Re: Bitmap index thoughts

От
Gavin Sherry
Дата:
On Wed, 27 Dec 2006, Heikki Linnakangas wrote:

> Jie Zhang wrote:
> > The "bitmap data segment" sounds good in terms of space. The problem is that
> > one bitmap is likely to occupy more pages than before, which may hurt the
> > query performance.
>
> We could have segments of say 1/5 of page. When a bitmap grows larger
> than that, the bitmap would be moved to a page of its own. That way we
> wouldn't get unnecessary fragmentation with large bitmaps, but small
> bitmaps would be stored efficiently.

Yes.

>
> > I have been thinking along the lines of increasing the
> > number of last bitmap words stored in each LOV item, but not to occupy one
> > page. This may prevent some cases Gavin indicated here, but not all.
>
> That sounds like more special cases and complexity. I like the segment
> idea more.
>
> But actually I'm not convinced we need to worry about efficient storage
> of small bitmaps at all. The typical use case for bitmap indexes is
> large tables with small number of distinct values, and the problem
> doesn't really arise in that scenario. Let's keep it simple for now, we
> can enhance it in later releases.

The scenario I'm concerned about is where a sales database, say, has
100,000 products. However, only 500 or 1000 products are popular. They
dominate, say, >99% of the sales. The other 99,900 products consume a
little bit over 8K each for very little benefit :-(.

This is pretty contrived but it seems real-world enough...

Gavin


Re: Bitmap index thoughts

From: mark@mark.mielke.cc
Date:
On Wed, Dec 27, 2006 at 10:16:54PM +1100, Gavin Sherry wrote:
> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
> > But actually I'm not convinced we need to worry about efficient storage
> > of small bitmaps at all. The typical use case for bitmap indexes is
> > large tables with small number of distinct values, and the problem
> > doesn't really arise in that scenario. Let's keep it simple for now, we
> > can enhance it in later releases.
> The scenario I'm concerned about is where a sales data base, say, has
> 100,000 products. However, only 500 or 1000 products are popular. They
> dominate, say >99% of the sales. The other 99,900 products consume a
> little bit over 8K each for very little benefit :-(.
> This is pretty contrived but it seem real world enough...

Seems like a good candidate for CREATE INDEX WHERE :-)

I wonder what would happen if somebody implemented automatic index
exclusion conditions after use of an INDEX proved to be in the realm
of the worst case scenario? :-)

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all
                      and in the darkness bind them...
 
                          http://mark.mielke.cc/



Re: Bitmap index thoughts

From: Heikki Linnakangas
Date:
mark@mark.mielke.cc wrote:
> On Wed, Dec 27, 2006 at 10:16:54PM +1100, Gavin Sherry wrote:
>> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
>>> But actually I'm not convinced we need to worry about efficient storage
>>> of small bitmaps at all. The typical use case for bitmap indexes is
>>> large tables with small number of distinct values, and the problem
>>> doesn't really arise in that scenario. Let's keep it simple for now, we
>>> can enhance it in later releases.
>> The scenario I'm concerned about is where a sales data base, say, has
>> 100,000 products. However, only 500 or 1000 products are popular. They
>> dominate, say >99% of the sales. The other 99,900 products consume a
>> little bit over 8K each for very little benefit :-(.
>> This is pretty contrived but it seem real world enough...
> 
> Seems like a good candidate for CREATE INDEX WHERE :-)

Yeah, that was my first thought as well. However, in Gavin's example it 
would be a nightmare to manually maintain the list of popular products and 
recreate the index when it changes.

Something clever inside bitmap indexam would clearly be better.

But even in that scenario, 99,000 pages * 8 kB ~= 750 megabytes of wasted 
space might still be acceptable. Or not.

> I wonder what would happen if somebody implemented automatic index
> exclusion conditions after use of an INDEX proved to be in the realm
> of the worst case scenario? :-)

I'm sorry, I don't understand that sentence...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

From: mark@mark.mielke.cc
Date:
On Wed, Dec 27, 2006 at 03:42:36PM +0000, Heikki Linnakangas wrote:
> >I wonder what would happen if somebody implemented automatic index
> >exclusion conditions after use of an INDEX proved to be in the realm
> >of the worst case scenario? :-)
> I'm sorry, I don't understand that sentence...

I was joking about a rather magical automatic INDEX restriction
modifier. For example, if the index becomes large enough to matter
(100 Mbytes?) and any single key takes up more than, say, 20% of the
index, it might be cool if it would automatically add the value to
the restriction list, and prune the now wasted index records.

Sorry for inserting a silly joke in a serious discussion... :-)

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all
                      and in the darkness bind them...
 
                          http://mark.mielke.cc/



Re: Bitmap index thoughts

From: "Simon Riggs"
Date:
On Wed, 2006-12-27 at 22:16 +1100, Gavin Sherry wrote:

> > But actually I'm not convinced we need to worry about efficient storage
> > of small bitmaps at all. The typical use case for bitmap indexes is
> > large tables with small number of distinct values, and the problem
> > doesn't really arise in that scenario. Let's keep it simple for now, we
> > can enhance it in later releases.
> 
> The scenario I'm concerned about is where a sales data base, say, has
> 100,000 products. However, only 500 or 1000 products are popular. They
> dominate, say >99% of the sales. The other 99,900 products consume a
> little bit over 8K each for very little benefit :-(.
> 
> This is pretty contrived but it seem real world enough...

Well, that seems to me to be the typical case. It's called a Zipf
Distribution and applies to categorical data everywhere. 

The main question is how you design your database. We might think of
indexing the product group (a higher level of the Product Dimension),
since these tend to have a low number of values, but these may have been
normalised (or "placed into a Dimension table"). That might leave you
with only the ProductId values in the large Fact table, which, as Gavin
points out, may be sparsely populated.

Another idea might be to store the first N heap pointers in the auxiliary
heap, rather than allocating them a whole page for the first value. That
would be even more space-efficient than allocating a fixed-size part of
a page. At least that would provide some utility for that part of the
storage mechanism.

However, I think Heikki's KISS approach seems good for now. The
infrequent values will probably be infrequently accessed, so we may
never need to read them at all (wishful thinking?).

If we can get something ready for commit soon, that will leave lots of
time before 8.3 freeze to measure where things can be further improved
in terms of space and performance.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com




Re: Bitmap index thoughts

From: "Jie Zhang"
Date:
On 12/27/06 3:10 AM, "Gavin Sherry" <swm@linuxworld.com.au> wrote:

> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
> 
>> Gavin Sherry wrote:
>>> On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
>>>> for typical bitmap index use cases and most of the needed pages should
>>>> stay in memory, but could we simplify this? Why do we need the auxiliary
>>>> heap, couldn't we just store the blk+offset of the LOV item directly in
>>>> the b-tree index item?
>>> 
>>> The problem is, the b-tree code is very much tied to the heap. I don't
>>> want to modify the b-tree code to make bitmap indexes work (better).
>>> What's really tempting is to just manage our own balanced tree within the
>>> bitmap index file(s) itself. It would start from the metapage and simply
>>> spill to other 'special' index pages when necessary. The problem is, we do
>>> not have b-tree code generic enough that it would allow us to do this
>>> trivially -- consider concurrency and WAL in particular, which we
>>> currently get for free. I guess this is why I've been ignoring this issue
>>> :-).
>> 
>> Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg
>> had the same problem :).
>> 
>> Modifying the nbtree code doesn't seem that difficult either. AFAICS,
>> the only places where the heap is from within nbtree code is in index
>> building and uniqueness checks.
> 
> I'll take a look and think about it.
> 

I will take a look at it as well. I would also eventually like to add the
bm_last_tid_location values for all bitmap pages to the LOV heap.
Combining this with the attribute values, we can easily jump to a specific
bitmap page. This will be useful during vacuuming and insertion after
vacuuming -- we can jump to a page to modify a bit. I am not sure if this
will affect the use of the btree code without the heap. But if we can get
rid of the LOV heap, that would be great.

>> 
>>>> And instead of having separate LOV pages that store a number of LOV
>>>> items, how about storing each LOV item on a page of it's own, and using
>>>> the rest of the page to store the last chunk of the bitmap. That would
>>>> eliminate one page access, but more importantly, maybe we could then get
>>>> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
>>>> patch quite a bit, while preserving the performance.
>>> 
>>> That's an interesting approach. We would still need a concept of
>>> last_word, at the very least, and probably last_comp_word for convenience.
>> 
>> Why?
> 
> The two last words of the bitmap vector, stored in the LOV item, provide a
> few things: they buffer the addition of new matches in the vector and they
> ease the calculation of the current position of the end of the vector.
> Jie, do you want to add more?
> 
> I think we could do this differently by calculating everything based on
> the other data stored in the lov item and data page (last_tid_location and
> num_hrl_words_used, from memory). But, I'm not sure. Jie?

Sure. When appending a new bit to an existing bitmap, the last two words
are needed to compute the new HRL words, and they are the only words affected
by the newly appended bit. So we need a way to get at these two words easily.
Like Gavin said, separating them from the other words is for convenience.

Also, we won't be able to get rid of bm_last_setbit. We need bm_last_setbit
to tell us how many zeros there are between two set bits. There is no other
way to know this.
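
To spell out how those fields get used, here is a rough sketch of the append
path (names and details are illustrative only, not the patch's actual code):

#include <stdint.h>

#define CHUNK_BITS 31   /* payload bits per HRL word (illustrative) */

/* Tail state kept in the LOV item.  The run of zeros between two matches
 * can only be derived from last_setbit -- nothing already written to the
 * bitmap pages records it. */
typedef struct lov_tail
{
    uint32_t last_word;     /* literal word currently being filled */
    uint32_t bits_used;     /* bits consumed in last_word */
    uint64_t last_setbit;   /* bit number of the previous match */
} lov_tail;

/* Stand-in for writing a completed word out to the bitmap pages; the real
 * code would compress runs of zero words into fill words here. */
static void flush_word(uint32_t word) { (void) word; }

static void
append_setbit(lov_tail *tail, uint64_t bitno)
{
    uint64_t gap = bitno - tail->last_setbit - 1;   /* zeros to encode */

    /* Zeros that overflow the current literal word become flushed words. */
    while (gap >= CHUNK_BITS - tail->bits_used)
    {
        gap -= CHUNK_BITS - tail->bits_used;
        flush_word(tail->last_word);
        tail->last_word = 0;
        tail->bits_used = 0;
    }

    tail->bits_used += (uint32_t) gap;                   /* skip the remaining zeros */
    tail->last_word |= (uint32_t) 1 << tail->bits_used;  /* set the new bit */
    tail->bits_used++;
    tail->last_setbit = bitno;

    if (tail->bits_used == CHUNK_BITS)                   /* word is now full */
    {
        flush_word(tail->last_word);
        tail->last_word = 0;
        tail->bits_used = 0;
    }
}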

Thanks,
Jie




Re: Bitmap index thoughts

From: "Jie Zhang"
Date:
On 12/27/06 3:16 AM, "Gavin Sherry" <swm@linuxworld.com.au> wrote:

> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
> 
>> Jie Zhang wrote:
>>> The "bitmap data segment" sounds good in terms of space. The problem is that
>>> one bitmap is likely to occupy more pages than before, which may hurt the
>>> query performance.
>> 
>> We could have segments of say 1/5 of page. When a bitmap grows larger
>> than that, the bitmap would be moved to a page of its own. That way we
>> wouldn't get unnecessary fragmentation with large bitmaps, but small
>> bitmaps would be stored efficiently.
> 
> Yes.

I see. This sounds good. I will think more about this.

Thanks,
Jie




Re: Bitmap index thoughts

From: Bruce Momjian
Date:
Where are we on this patch?  Does it have performance tests to show
where it is beneficial?  Is it ready to be reviewed?

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> I've been skimming through the bitmap index patch...
> 
> A scan needs to access at least five pages:
> 
> 1. B-tree index (root+others, depending on depth)
> 2. The auxiliary heap page
> 3. bitmap index meta page
> 4. LOV page
> 5. bitmap page
> 
> That seems like a lot of indirection. A high startup cost is probably ok 
> for typical bitmap index use cases and most of the needed pages should 
> stay in memory, but could we simplify this? Why do we need the auxiliary 
> heap, couldn't we just store the blk+offset of the LOV item directly in 
> the b-tree index item?
> 
> And instead of having separate LOV pages that store a number of LOV 
> items, how about storing each LOV item on a page of it's own, and using 
> the rest of the page to store the last chunk of the bitmap. That would 
> eliminate one page access, but more importantly, maybe we could then get 
> rid of all the bm_last_* attributes in BMLOVItemData that complicate the 
> patch quite a bit, while preserving the performance.
> 
> -- 
>    Heikki Linnakangas
>    EnterpriseDB   http://www.enterprisedb.com
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: Bitmap index thoughts

From: Gavin Sherry
Date:
On Thu, 1 Feb 2007, Bruce Momjian wrote:

>
> Where are we on this patch?  Does it have performance tests to show
> where it is beneificial?  Is it ready to be reviewed?

I've got an updated patch which adds significant performance improvements
for worst-case data distributions. It also contains a rewrite of the WAL
code to handle incomplete actions.

I haven't worked on the stuff discussed below with Heikki. It's a lot of
work and probably more suitable for a second generation.

I've just got to finish testing the merge of Tom's operator family stuff
and then I'll send off the patch and performance figures.

Thanks,

Gavin


Re: Bitmap index thoughts

From: Gavin Sherry
Date:
On Thu, 1 Feb 2007, Bruce Momjian wrote:

>
> Where are we on this patch?  Does it have performance tests to show
> where it is beneificial?  Is it ready to be reviewed?

Here's an updated patch:

http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch

In this patch, I rewrote the index build system. It was fast before for
well clustered data but for poorly clustered data, it was very slow. Now,
it is pretty good for each distribution type.

I have various test cases, but the one which showed bitmap in a poor light was
a table of 600M rows. The key to the table had a cardinality of 100,000.
When the table was loaded with keys clustered, the build time was 1000
seconds with bitmap (2200 with btree). With poorly clustered data (e.g.,
the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for
bitmap was 14000 seconds!

So, I rewrote this to compress data using HRL encoding (the same scheme we
use in the bitmap AM itself). Now, clustered data is just as fast as before
and unclustered data takes 2000 seconds.

The select performance at a cardinality of 100,000 is similar to btree but
faster with lower cardinalities.

Jie also contributed a rewrite of the WAL code to this patch. Not only is
the code faster now, but it handles the notion of incomplete actions --
like btree and friends do. The executor code still needs some work from me
-- Jie and I have dirtied things up while experimenting -- but we would
really like some review of the code so that this can get squared away
well before the approach of 8.3 feature freeze.

One of the major deficiencies remaining is the lack of VACUUM support.
Heikki put his hand up for this and I'm holding him to it! ;-)

I will update the code tomorrow. The focus will be cleaning up the
executor modifications. Please look elsewhere for now.

Thanks,

Gavin


Re: Bitmap index thoughts

From: Heikki Linnakangas
Date:
Gavin Sherry wrote:
> On Thu, 1 Feb 2007, Bruce Momjian wrote:
> 
>> Where are we on this patch?  Does it have performance tests to show
>> where it is beneificial?  Is it ready to be reviewed?
> 
> Here's an updated patch:
> 
> http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch
> 
> In this patch, I rewrote the index build system. It was fast before for
> well clustered data but for poorly clustered data, it was very slow. Now,
> it is pretty good for each distribution type.
> 
> I have various test cases but the one which showed bitmap a poor light was
> a table of 600M rows. The key to the table had a cardinality of 100,000.
> When the table was loaded with keys clustered, the build time was 1000
> seconds with bitmap (2200 with btree). With poorly clustered data (e.g.,
> the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for
> bitmap was 14000 seconds!
> 
> So, I rewrote this to compress data using HRL encoding (the same scheme we
> use in the bitmap AM itself). Now, clustered data is just as fast and
> unclustered data is 2000 seconds.
> 
> The select performance at a cardinality of 100,000 is similar to btree but
> faster with lower cardinalities.
> 
> Jie also contributed a rewrite of the WAL code to this patch. Not only is
> the code faster now, but it handles the notion of incomplete actions --
> like btree and friends do. The executor code still needs some work from me
> -- Jie and I have dirtied things up while experimenting -- but we would
> really like some review of the code so that this can get squared away
> well before the approach of 8.3 feature freeze.
> 
> One of the major deficiencies remaining is the lack of VACUUM support.
> Heikki put his hand up for this and I'm holding him to it! ;-)

Thanks :). I'll take a look at it.

I'm a bit worried that vacuuming can get complicated if an index is in 
fact an index + a heap + a btree. To remove empty lov items and the 
entries in the auxiliary heap and b-tree, you need to:

1. Memorize empty lov-items
2. Scan the heap, and mark the heap tuples corresponding to the empty 
lov-items as dead
3. Scan the b-tree, removing pointers to dead heap tuples
4. Remove dead heap tuples
5. Remove empty lov-items

Maybe it's possible to call the existing vacuuming code recursively, but 
it feels quite horrible.
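
Roughly the shape it would have to take, I think -- every function name
below is made up, just to show the ordering of the passes:

/* Made-up stubs only; nothing here exists in the patch.  The point is that
 * we effectively run a heap vacuum and a b-tree vacuum nested inside the
 * bitmap index vacuum. */
typedef struct { int nitems; } EmptyLovList;

static EmptyLovList *collect_empty_lov_items(void)         { static EmptyLovList l; return &l; }  /* step 1 */
static void mark_aux_heap_tuples_dead(EmptyLovList *items) { (void) items; }                      /* step 2 */
static void prune_btree_of_dead_pointers(void)             { }                                    /* step 3 */
static void remove_dead_aux_heap_tuples(void)              { }                                    /* step 4 */
static void remove_empty_lov_items(EmptyLovList *items)    { (void) items; }                      /* step 5 */

static void
bitmap_vacuum_sketch(void)
{
    EmptyLovList *empty = collect_empty_lov_items();

    mark_aux_heap_tuples_dead(empty);       /* auxiliary heap pass        */
    prune_btree_of_dead_pointers();         /* b-tree pass                */
    remove_dead_aux_heap_tuples();          /* second auxiliary heap pass */
    remove_empty_lov_items(empty);          /* finally the LOV pages      */
}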

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

От
Heikki Linnakangas
Дата:
Gavin Sherry wrote:
> I will update the code tomorrow. The focus will be cleaning up the
> executor modifications. Please look else where for now.

I'm getting a segfault with this test script:

--------
CREATE TABLE bmtest (i int);

INSERT INTO bmtest SELECT 1 FROM generate_series(1,100000) a;
INSERT INTO bmtest SELECT 10 FROM generate_series(1,100) a;
DELETE FROM bmtest WHERE i = 1;
VACUUM bmtest;

CREATE INDEX i_bmtest ON bmtest USING bitmap (i);

INSERT INTO bmtest SELECT 10 FROM generate_series(1,100000) a;
--------


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Bitmap index thoughts

From: Gavin Sherry
Date:
On Thu, 8 Feb 2007, Heikki Linnakangas wrote:

> Gavin Sherry wrote:
> > I will update the code tomorrow. The focus will be cleaning up the
> > executor modifications. Please look else where for now.
>
> I'm getting a segfault with this test script:
>
> --------
> CREATE TABLE bmtest (i int);
>
> INSERT INTO bmtest SELECT 1 FROM generate_series(1,100000) a;
> INSERT INTO bmtest SELECT 10 FROM generate_series(1,100) a;
> DELETE FROM bmtest WHERE i = 1;
> VACUUM bmtest;
>
> CREATE INDEX i_bmtest ON bmtest USING bitmap (i);
>
> INSERT INTO bmtest SELECT 10 FROM generate_series(1,100000) a;
> --------
>

Hmm... this triggers a bug in the newly rewritten update code I think.
I'll post a fix soon.

Thanks for testing!

Gavin


Re: Bitmap index thoughts (another segfault)

From: Mark Kirkwood
Date:
I'm seeing a segfault on a TPC-H size 10 database. The patch and 
code are:
- bitmap patch from 12 Mar
- 8.3 dev from 27 Mar


SELECT count(distinct(o_orderkey))
FROM orders orders_alias
WHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';

(gdb) bt
#0  0x00000000 in ?? ()
#1  0x08155eb5 in bitmap_stream_free (opaque=0x84e4070) at tidbitmap.c:1336
#2  0x08142914 in ExecEndBitmapHeapScan (node=0x8405548)
    at nodeBitmapHeapscan.c:463
#3  0x0813789a in ExecutorEnd (queryDesc=0x83ed948) at execMain.c:992
#4  0x081134ef in PortalCleanup (portal=0x83ee018) at portalcmds.c:302
#5  0x0823e2d2 in PortalDrop (portal=0x83ee018, isTopCommit=0 '\0')
    at portalmem.c:382
#6  0x081b2182 in exec_simple_query (
    query_string=0x83a8018 "SELECT count(distinct(o_orderkey))\nFROM orders orders_alias\nWHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';")
    at postgres.c:964
#7  0x081b4350 in PostgresMain (argc=4, argv=0x833a638,
    username=0x833a610 "postgres") at postgres.c:3488
#8  0x0818faab in ServerLoop () at postmaster.c:2985
#9  0x081911b1 in PostmasterMain (argc=1, argv=0xbfbfec30) at postmaster.c:967
#10 0x08153592 in main (argc=1, argv=0xbfbfec30) at main.c:188




Re: Bitmap index thoughts (another segfault)

From: Mark Kirkwood
Date:
Mark Kirkwood wrote:
> I'm seeing a segfault on a size TPC-H size 10 database. The patch and 
> code are:
> - bitmap patch from 12 Mar
> - 8.3 dev from 27 Mar
> 
> 
> SELECT count(distinct(o_orderkey))
> FROM orders orders_alias
> WHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';
> 
> (gdb) bt
> #0  0x00000000 in ?? ()
> #1  0x08155eb5 in bitmap_stream_free (opaque=0x84e4070) at tidbitmap.c:1336
> #2  0x08142914 in ExecEndBitmapHeapScan (node=0x8405548)
>     at nodeBitmapHeapscan.c:463
> #3  0x0813789a in ExecutorEnd (queryDesc=0x83ed948) at execMain.c:992
> #4  0x081134ef in PortalCleanup (portal=0x83ee018) at portalcmds.c:302
> #5  0x0823e2d2 in PortalDrop (portal=0x83ee018, isTopCommit=0 '\0')
>     at portalmem.c:382
> #6  0x081b2182 in exec_simple_query (
>     query_string=0x83a8018 "SELECT count(distinct(o_orderkey))\nFROM 
> orders orders_alias\nWHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') 
> AND o_orderstatus='P';") at postgres.c:964
> #7  0x081b4350 in PostgresMain (argc=4, argv=0x833a638,
>     username=0x833a610 "postgres") at postgres.c:3488
> #8  0x0818faab in ServerLoop () at postmaster.c:2985
> #9  0x081911b1 in PostmasterMain (argc=1, argv=0xbfbfec30) at 
> postmaster.c:967
> #10 0x08153592 in main (argc=1, argv=0xbfbfec30) at main.c:188
> 

Not a SIGSEGV but another stream-related issue:

bitmap=# \d bitmaptest;
     Table "public.bitmaptest"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val0   | integer |
 val1   | integer |
 val2   | integer |
 fil    | text    |
Indexes:
    "bitmaptest_id" btree (id)
    "bitmaptest_val0" bitmap (val0)
    "bitmaptest_val1" bitmap (val1)
    "bitmaptest_val2" bitmap (val2)

bitmap=# SELECT count(*) FROM bitmaptest
         WHERE val1 in (1,7)
         AND val0 IN (4,3);

ERROR:  XX000: unknown stream type 2
LOCATION:  stream_add_node, tidbitmap.c:1033

I could not reproduce this with the TPC-H dataset, so here's a link to 
the files to generate this one:
http://homepages.paradise.net.nz/markir/download/bizgres/bitmaptest.tar.gz

Cheers

Mark


Re: Bitmap index thoughts (another segfault)

От
Gavin Sherry
Дата:
> I'm seeing a segfault on a size TPC-H size 10 database. The patch and
> code are:
> - bitmap patch from 12 Mar
> - 8.3 dev from 27 Mar

Thanks Mark. I tracked this down. I'll post a new patch soon.

Gavin


Re: Bitmap index thoughts (another segfault)

От
Gavin Sherry
Дата:
On Sat, 7 Apr 2007, Mark Kirkwood wrote:

> Mark Kirkwood wrote:
> bitmap=# SELECT count(*) FROM bitmaptest
>             WHERE val1 in (1,7)
>             AND val0 IN (4,3)
>             ;
>
> ERROR:  XX000: unknown stream type 2
> LOCATION:  stream_add_node, tidbitmap.c:1033

Thanks. Turned out to be a typo in stream_add_node()! I'll post a new
patch soon. Thanks for the test kit and your testing!

Gavin