Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash

Поиск

Список

Период

Сортировка

От	Thomas Munro
Тема	Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Дата	21 декабря 2019 г. 03:02:28
Msg-id	CA+hUKGKA+tKg1v29=mfxbTkXSoRdFTzmD3R0U2YgwxRxLbLtMA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash (Thomas Munro <thomas.munro@gmail.com>)
Ответы	Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Список	pgsql-bugs

Дерево обсуждения

On Sat, Dec 14, 2019 at 1:22 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > >On 2019-Dec-10, Tomas Vondra wrote:
> > >> It's probably still a good idea to do this, although I wonder if we
> > >> might start doing this only when actually running out of bits (and use
> > >> the current algorithm by default). But I have no idea if that would be
> > >> any cheaper, and it would add complexity.

> -               *batchno = (hashvalue >> hashtable->log2_nbuckets) &
> (nbatch - 1);
> +               *batchno = rotate(hashvalue, hashtable->log2_nbuckets)
> & (nbatch - 1);

I ran the 150MB 4096 batch "sevenb" self-join with the "rotate" patch,
and it worked as expected.  I'm now planning to commit that version,
unless there are objections or someone wants to argue for a different
way to spell rotate() etc.

If I disassemble and diff unpatched and patched
ExecHashGetBucketAndBatch() and functions that inline it, as compiled
by Clang 6, I see just eg:

-   0x00000000006ba54e <+30>:    shr    %cl,%esi
+   0x00000000006ba54e <+30>:    ror    %cl,%esi

Here's a festive pictorial representation of the various dead horses
flogged in this thread, for the holidays (view in monospace font):

The "shift" algorithm in (pre-existing coding):

 <----------batchno---|
                      |---bucketno---|
     [MSB . . . . . . . . . . . . LSB]

* can't increase nbuckets (or max bucket load would degrade too soon)
* reads zeroes off the end, so further partitioning is broken

The "bit reverse" technique proposed earlier:

     |---batchno----------->
                      <---bucketno---|
     [MSB . . . . . . . . . . . . LSB]

* could increase nbuckets: bucket load degradation is maximally deferred
* takes a few more instructions to compute batchno

The "rotate" technique:

      ------batchno---|          <---
                      |---bucketno---|
     [MSB . . . . . . . . . . . . LSB]

* same as "shift" until you run out of bits, then bucket load degrades
* can't increase nbuckets
* only changes one machine code instruction

The "swap ends" technique:

                <----------batchno---|
     |---bucketno---|
     [MSB . . . . . . . . . . . . LSB]

* really just a rotated version of "rotate" technique
* meh

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash