Hi, Ron,
Ron schrieb:
> OK, so here's _a_ way (there are others) to obtain a mapping such that
> if a < b then f(a) < f (b) and
> if a == b then f(a) == f(b)
>
> Pretend each row is a integer of row size (so a 2KB row becomes a 16Kb
> integer; a 4KB row becomes a 32Kb integer; etc)
> Since even a 1TB table made of such rows can only have 256M - 512M
> possible values even if each row is unique, a 28b or 29b key is large
> enough to represent each row's value and relative rank compared to all
> of the others even if all row values are unique.
>
> By scanning the table once, we can map say 0000001h (Hex used to ease
> typing) to the row with the minimum value and 1111111h to the row with
> the maximum value as well as mapping everything in between to their
> appropriate keys. That same scan can be used to assign a pointer to
> each record's location.
But with a single linear scan, this cannot be accomplished, as the table
contents are neither sorted nor distributed linearly between the minimum
and the maximum.
For this mapping, you need a full table sort.
> That initial scan to set up the keys is expensive, but if we wish that
> cost can be amortized over the life of the table so we don't have to pay
> it all at once. In addition, once we have created those keys, then can
> be saved for later searches and sorts.
But for every update or insert, you have to resort the keys, which is
_very_ expensive as it basically needs to update a huge part of the table.
Markus