Re: Time-correlated columns in large tables

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Time-correlated columns in large tables
Дата
Msg-id 45EC7ADE.20206@enterprisedb.com
обсуждение исходный текст
Ответ на Time-correlated columns in large tables  ("Jeroen T. Vermeulen" <jtv@xs4all.nl>)
Ответы Re: Time-correlated columns in large tables  ("Jeroen T. Vermeulen" <jtv@xs4all.nl>)
Список pgsql-hackers
Jeroen T. Vermeulen wrote:
> [Q: Is there some other transparent optimization for values that correlate
> with insertion/update order?]
> 
> So I was wondering whether it would make sense to have a more compact kind
> of index.  One that partitions the value range of a given column into
> sub-ranges, and for each of those, merely keeps loose track of where those
> value ranges might be found.  "Dates from 2006-07-15 to 2006-08-04: try
> blocks 99-126 and 175."  Just a fixed maximum of two or three contiguous
> block ranges per value range would probably do the trick.  The risk of
> having too few, of course, is that one oddly positioned block could make
> the index look as if a particular value range was spread out throughout
> most of the table.

I think you've just described a range-encoded bitmap index. The idea is 
to divide the range of valid values into a some smallish number of 
subranges, and for each of these boundary values you store a bitmap 
where you set the bit representing every tuple with value < boundary value.

For example, imagine a simple table like this:

position | key
---------+-----       1 |    1       2 |    2       3 |    3
...      10 |   10

If you divide this into for example 4 subranges, you'd get bitmaps
 key   bitmap
-----+-----------
<= 3 | 1110000000
<= 6 | 1111110000
<= 8 | 1111111100
<= 10| 1111111111

Now to find all tuples <= 3, you'd fetch all tuples in the first bitmap. 
To find all tuples <= 2, you'd fetch all tuples in the first bitmap, and 
recheck the <= 2 condition. To find anything between say 4 and 8, you'd 
take the XOR of the 3rd and 1st bitmap, and recheck the > 4 condition on 
resulting tuples.

So basically this allows you to satisfy any range query by reading two 
bitmaps and XORing them together. The resulting tuples will generally 
need to be rechecked because the index is lossy.

I *really* hope the equality-encoded bitmap index gets into 8.3. I've 
been trying to keep range-encoded bitmaps in mind when looking at the 
proposed indexam API changes. Hopefully we'll see them in 8.4. Feel free 
to submit a patch ;-).

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Davis
Дата:
Сообщение: Re: Bug: Buffer cache is not scan resistant
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Bug: Buffer cache is not scan resistant