On Sat, Dec 14, 2019 at 1:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > >On 2019-Dec-10, Tomas Vondra wrote:
> > >> It's probably still a good idea to do this, although I wonder if we
> > >> might start doing this only when actually running out of bits (and use
> > >> the current algorithm by default). But I have no idea if that would be
> > >> any cheaper, and it would add complexity.
> - *batchno = (hashvalue >> hashtable->log2_nbuckets) &
> (nbatch - 1);
> + *batchno = rotate(hashvalue, hashtable->log2_nbuckets)
> & (nbatch - 1);
I ran the 150MB 4096 batch "sevenb" self-join with the "rotate" patch,
and it worked as expected. I'm now planning to commit that version,
unless there are objections or someone wants to argue for a different
way to spell rotate() etc.
If I disassemble and diff unpatched and patched
ExecHashGetBucketAndBatch() and functions that inline it, as compiled
by Clang 6, I see just eg:
- 0x00000000006ba54e <+30>: shr %cl,%esi
+ 0x00000000006ba54e <+30>: ror %cl,%esi
Here's a festive pictorial representation of the various dead horses
flogged in this thread, for the holidays (view in monospace font):
The "shift" algorithm in (pre-existing coding):
<----------batchno---|
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]
* can't increase nbuckets (or max bucket load would degrade too soon)
* reads zeroes off the end, so further partitioning is broken
The "bit reverse" technique proposed earlier:
|---batchno----------->
<---bucketno---|
[MSB . . . . . . . . . . . . LSB]
* could increase nbuckets: bucket load degradation is maximally deferred
* takes a few more instructions to compute batchno
The "rotate" technique:
------batchno---| <---
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]
* same as "shift" until you run out of bits, then bucket load degrades
* can't increase nbuckets
* only changes one machine code instruction
The "swap ends" technique:
<----------batchno---|
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]
* really just a rotated version of "rotate" technique
* meh