Re: Using POPCNT and other advanced bit manipulation instructions

From: Andres Freund
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Msg-id: 20190215165513.64ptbtt3cn3ezfxb@alap3.anarazel.de
In reply to: Re: Using POPCNT and other advanced bit manipulation instructions (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Hi,

On 2019-02-14 16:45:38 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-02-14 15:47:13 -0300, Alvaro Herrera wrote:
> >> Hah, I just realized you have to add -mlzcnt in order for these builtins
> >> to use the lzcnt instructions.  It goes from something like
> >> 
> >> bsrq    %rax, %rax
> >> xorq    $63, %rax
> 
> > I'm confused how this is a general count leading zero operation? Did you
> > use constants or something that allowed it to infer a range in the test? If
> > so the compiler probably did some optimizations allowing it to do the
> > above.
> 
> No.  If you compile
> 
> int myclz(unsigned long long x)
> {
>   return __builtin_clzll(x);
> }
> 
> at -O2, on just about any x86_64 gcc, you will get
> 
> myclz:
> .LFB1:
>         .cfi_startproc
>         bsrq    %rdi, %rax
>         xorq    $63, %rax
>         ret
>         .cfi_endproc

Yea, sorry for the noise. I misremembered the bsrq mnemonic.
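
To spell out why that two-instruction sequence is a general count of
leading zeros: bsr yields the index of the highest set bit (0..63), and
for any index in that range 63 - idx equals idx ^ 63, because 63 is all
ones in the low six bits. Below is a minimal sketch of that identity in
portable C. The helper names highest_set_bit() and clz_via_bsr() are
made up for illustration and are not PostgreSQL functions, and the input
is assumed to be nonzero, as it must be for __builtin_clzll().

```c
#include <assert.h>

/*
 * Portable stand-in for what bsr computes: the index (0..63) of the
 * highest set bit. Hypothetical helper; x is assumed to be nonzero.
 */
static int highest_set_bit(unsigned long long x)
{
    int idx = 0;

    assert(x != 0);
    while (x >>= 1)
        idx++;
    return idx;
}

/*
 * Count leading zeros expressed the way the generated code does it:
 * take the highest set bit index, then xor with 63 (same as 63 - idx
 * for idx in 0..63).
 */
static int clz_via_bsr(unsigned long long x)
{
    return highest_set_bit(x) ^ 63;
}
```

For any nonzero input this should agree with __builtin_clzll(), which is
why the compiler can emit bsrq/xorq without knowing anything about the
range of the argument.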

bsr has a latency of three cycles and xor of one, so the dependent
bsr/xor pair takes four; lzcnt alone has a latency of three. So it's
mildly faster to use lzcnt (it uses fewer ports and has a shorter
latency). But I doubt we have code where that's noticeable.

Greetings,

Andres Freund

