On Thu, Mar 12, 2020 at 7:42 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> I don't think Jesse's proposed solution is that great due to the
> additional function call overhead for pg_count_leading_zeros_32(). The
> (num & (num - 1)) == 0 I imagine will perform better, but I didn't
> test it.
Right, I believe we've all landed on the same page about that. I see
two ways of doing next_power_of_2_32 without an indirect function
call, and leaving pg_leftmost_one_pos32 the same as it is now. I
haven't measured either yet (or tested for that matter):
static inline uint32
next_power_of_2_32(uint32 num)
{
Assert(num > 0 && num <= UINT_MAX / 2);
/* use some bitmasking tricks to see if only 1 bit is on */
if (num & (num - 1)) == 0)
return num;
return ((uint32) 1) << (pg_leftmost_one_pos32(num) + 1)
}
OR
{
Assert(num > 0 && num <= UINT_MAX / 2);
return ((uint32) 1) << ceil_log2_32(num);
}
static inline uint32
ceil_log2_32(uint32 num)
{
Assert(num > 0);
if (num == 1)
return 0;
return pg_leftmost_one_pos32(num-1) + 1;
}
One naming thing I noticed: the name "next power of two" implies to me
num *= 2 for a power of two, not the same as the input. The latter
behavior is better called "ceil power of 2".
> Also, wondering if you've looked at any of the other places where we
> do "*= 2;" or "<<= 1;" inside a loop? There's quite a number that look
> like candidates for using the new function.
A brief look shows a few functions where this is done in a tight loop:
nodes/list.c:new_list
LWLockRegisterTranche
ensure_record_cache_typmod_slot_exists
pqCheckOutBufferSpace
ExecChooseHashTableSize
ExecHashBuildSkewHash
choose_nelem_alloc
init_htab
hash_estimate_size
hash_select_dirsize
AllocSetAlloc
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services