On 12/09/2013 11:34 AM, Alexander Korotkov wrote:
> On Mon, Dec 9, 2013 at 1:18 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>
>> Even if we use varbyte encoding, I wonder if it would be better to treat
>> block + offset number as a single 48-bit integer, rather than encode them
>> separately. That would allow the delta of two items on the same page to be
>> stored as a single byte, rather than two bytes. Naturally it would be a
>> loss on other values, but would be nice to see some kind of an analysis on
>> that. I suspect it might make the code simpler, too.
>
> Yeah, I had that idea, but I thought it's not a better option. Will try to
> do some analysis.

The more I think about that, the more convinced I am that it's a good
idea. I don't think it will ever compress worse than the current
approach of treating block and offset numbers separately, and, although
I haven't actually tested it, I doubt it's any slower. About the same
amount of arithmetic is required in both versions.
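
To make the idea concrete, here is a minimal sketch. It is for illustration only and is not the code from the attached patch. The function names (pack_item, encode_varbyte, decode_varbyte, encode_delta) are made up for this example, and the layout (32-bit block number in the high bits, 16-bit offset number in the low bits, 7 payload bits per byte with the high bit as a continuation flag) is an assumption:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack a 32-bit block number and a 16-bit offset
 * number into a single 48-bit value, with the block in the high bits.
 */
static uint64_t
pack_item(uint32_t block, uint16_t offset)
{
    return ((uint64_t) block << 16) | offset;
}

/*
 * Varbyte-encode val: 7 payload bits per byte, least significant group
 * first, high bit set on every byte except the last. A 48-bit value
 * needs at most 7 bytes. Returns the number of bytes written.
 */
static size_t
encode_varbyte(uint64_t val, unsigned char *out)
{
    size_t      n = 0;

    while (val > 0x7F)
    {
        out[n++] = (unsigned char) (0x80 | (val & 0x7F));
        val >>= 7;
    }
    out[n++] = (unsigned char) val;
    return n;
}

/* Decode one varbyte value from in; returns the number of bytes read. */
static size_t
decode_varbyte(const unsigned char *in, uint64_t *val)
{
    size_t      n = 0;
    int         shift = 0;
    uint64_t    result = 0;
    unsigned char c;

    do
    {
        c = in[n++];
        result |= (uint64_t) (c & 0x7F) << shift;
        shift += 7;
    } while (c & 0x80);

    *val = result;
    return n;
}

/* Encode the gap between two packed items; items must be ascending. */
static size_t
encode_delta(uint64_t prev, uint64_t cur, unsigned char *out)
{
    assert(cur > prev);
    return encode_varbyte(cur - prev, out);
}
```

With this layout, two items on the same page whose offsets differ by less than 128 produce a delta below 128, so the gap is stored in a single byte. The decoder adds each decoded delta to the previous packed value and splits the result back into block and offset with a shift and a mask.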

Attached is a version that does that, plus some other minor cleanup.

(We should still investigate using a completely different algorithm, though.)

- Heikki