Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex

From: Stefan Kaltenbrunner
Subject: Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex
Date:
Msg-id: 4A64B0A0.80107@kaltenbrunner.cc
In reply to: Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Jeremy Kerr <jk@ozlabs.org> writes:
>> Rather than testing single bits in a loop, change AllocSetFreeIndex to
>> use the __builtin_clz() function to calculate the chunk index.
> 
>> This requires a new check for __builtin_clz in the configure script.
> 
>> Results in a ~2% performance increase on sysbench on PowerPC.
> 
> I did some performance testing on this by extracting the
> AllocSetFreeIndex function into a standalone test program that just
> executed it a lot of times in a loop.  And there's a problem: on
> x86_64 it is not much of a win.  The code sequence that gcc generates
> for __builtin_clz is basically
> 
>     bsrl    %eax, %eax
>     xorl    $31, %eax
> 
> and it turns out that Intel hasn't seen fit to put a lot of effort into
> the BSR instruction.  It's constant time, all right, but on most of
> their CPUs that constant time is like 8 or 16 times slower than an ADD;
> cf http://www.intel.com/Assets/PDF/manual/248966.pdf
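
For readers following the comparison above, here is a minimal sketch of the two approaches under discussion. It is not the patch itself: the function names free_index_loop, free_index_clz and count_mismatches are hypothetical, and the constants ALLOC_MINBITS = 3 and ALLOCSET_NUM_FREELISTS = 11 are assumptions modeled on aset.c rather than values stated in this thread. It also assumes a compiler that provides __builtin_clz (GCC or Clang).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed constants, modeled on aset.c; not stated in this thread. */
#define ALLOC_MINBITS           3   /* smallest chunk size is 8 bytes */
#define ALLOCSET_NUM_FREELISTS  11

/* Shift-and-test version: one loop iteration per significant bit. */
static int
free_index_loop(size_t size)
{
    int idx = 0;

    if (size > 0)
    {
        size = (size - 1) >> ALLOC_MINBITS;
        while (size != 0)
        {
            idx++;
            size >>= 1;
        }
        assert(idx < ALLOCSET_NUM_FREELISTS);
    }
    return idx;
}

/* __builtin_clz version: index is the position of the highest set bit. */
static int
free_index_clz(size_t size)
{
    int idx = 0;

    if (size > ((size_t) 1 << ALLOC_MINBITS))
    {
        unsigned int t = (unsigned int) ((size - 1) >> ALLOC_MINBITS);

        /* t is nonzero here; __builtin_clz(0) is undefined. */
        idx = (int) (sizeof(unsigned int) * CHAR_BIT) - __builtin_clz(t);
        assert(idx < ALLOCSET_NUM_FREELISTS);
    }
    return idx;
}

/* Returns how many sizes in [1, max_size] give different indexes. */
static int
count_mismatches(size_t max_size)
{
    int mismatches = 0;

    for (size_t size = 1; size <= max_size; size++)
    {
        if (free_index_loop(size) != free_index_clz(size))
            mismatches++;
    }
    return mismatches;
}
```

Under these assumed constants, sizes 1 through 8192 are the ones that map to a freelist, so count_mismatches(8192) checks the two versions against each other over that whole range. Calling either function many times in a loop would give a standalone timing test along the lines of the one described above.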


hmm interesting - I don't have the exact numbers any more, but that
patch (or a previous version of it) definitely showed a noticeable
improvement when I tested with sysbench on a current-generation Intel
Nehalem...


Stefan

