Re: Use compiler intrinsics for bit ops in hash

Поиск
Список
Период
Сортировка
От David Fetter
Тема Re: Use compiler intrinsics for bit ops in hash
Дата
Msg-id 20200308183405.GZ13804@fetter.org
обсуждение исходный текст
Ответ на Re: Use compiler intrinsics for bit ops in hash  (Jesse Zhang <sbjesse@gmail.com>)
Ответы Re: Use compiler intrinsics for bit ops in hash  (Jesse Zhang <sbjesse@gmail.com>)
Список pgsql-hackers
On Mon, Mar 02, 2020 at 12:45:21PM -0800, Jesse Zhang wrote:
> Hi David,
> 
> On Wed, Feb 26, 2020 at 9:56 PM David Fetter <david@fetter.org> wrote:
> >
> > On Wed, Feb 26, 2020 at 09:12:24AM +0100, David Fetter wrote:
> > > On Fri, Jan 31, 2020 at 04:59:18PM +0100, David Fetter wrote:
> > > > On Wed, Jan 15, 2020 at 03:45:12PM -0800, Jesse Zhang wrote:
> > > > > On Tue, Jan 14, 2020 at 2:09 PM David Fetter <david@fetter.org> wrote:
> > > > > > > The changes in hash AM and SIMPLEHASH do look like a net positive
> > > > > > > improvement. My biggest cringe might be in pg_bitutils:
> > > > > > >
> > > > > > > 1. Is ceil_log2_64 dead code?
> > > > > >
> > > > > > Let's call it nascent code. I suspect there are places it could go, if
> > > > > > I look for them.  Also, it seemed silly to have one without the other.
> > > > > >
> > > > >
> > > > > While not absolutely required, I'd like us to find at least one
> > > > > place and start using it. (Clang also nags at me when we have
> > > > > unused functions).
> > > >
> > > > Done in the expanded patches attached.
> I see that you've found use of it in dynahash, thanks!
> 
> The math in the new (from v4 to v6) patch is wrong: it yields
> ceil_log2(1) = 1 or next_power_of_2(1) = 2. I can see that you lifted
> the restriction of "num greater than one" for ceil_log2() in this patch
> set, but it's now _more_ problematic to base those functions on
> pg_leftmost_one_pos().
> 
> I'm not comfortable with your changes to pg_leftmost_one_pos() to remove
> the restriction on word being non-zero. Specifically
> pg_leftmost_one_pos() is made to return 0 on 0 input. While none of its
> current callers (in HEAD) is harmed, this introduces muddy semantics:
> 
> 1. pg_leftmost_one_pos is semantically undefined on 0 input: scanning
> for a set bit in a zero word won't find it anywhere.
> 
> 2. we can _try_ generalizing it to accommodate ceil_log2 by
> extrapolating based on the invariant that BSR + LZCNT = 31 (or 63). In
> that case, the extrapolation yields -1 for pg_leftmost_one_pos(0).
> 
> I'm not convinced that others on the list will be comfortable with the
> generalization suggested in 2 above.
> 
> I've quickly put together a PoC patch on top of yours, which
> re-implements ceil_log2 using LZCNT coupled with a CPUID check.
> Thoughts?

Per discussion on IRC with Andrew (RhodiumToad) Gierth:

The runtime detection means there's always an indirect call overhead
and no way to inline.  This is counter to what using compiler
intrinsics is supposed to do.

It's better to rely on the compiler, because:
(a) The compiler often knows whether the value can or can't be 0 and
    can therefore skip a conditional jump.
(b) If you're targeting a recent microarchitecture, the compiler can
    just use the right instruction.
(c) Even if the conditional branch is left in, it's not a big overhead.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Remove utils/acl.h from catalog/objectaddress.h
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction