Re: Using POPCNT and other advanced bit manipulation instructions

From: David Rowley
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Msg-id: CAKJS1f951NWLpt9T1JdMmfRvo3PCFMLVVuXi1EqV_p5ZULNMJA@mail.gmail.com
In reply to: Re: Using POPCNT and other advanced bit manipulation instructions  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses: Re: Using POPCNT and other advanced bit manipulation instructions  (Dmitry Dolgov <9erthalion6@gmail.com>)
List: pgsql-hackers
Thanks for looking at this.

On Thu, 20 Dec 2018 at 23:56, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> I've checked for Clang 6, it turns out that indeed it generates popcnt without
> any macro, but only in one place for bloom_prop_bits_set. After looking at this
> function it seems that it would be beneficial to actually use popcnt there too.

Yeah, that's the pattern that's mentioned in
https://lemire.me/blog/2016/05/23/the-surprising-cleverness-of-modern-compilers/
It would need to be changed to call the popcount function.  The fact
that this pattern exists makes me a bit more worried that some
extension could be using a similar pattern and end up being compiled
with -mpopcnt due to pg_config having that CFLAG. That's all fine
until the binary makes its way over to a machine without that
instruction.
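
For illustration, here is a minimal sketch of the kind of loop that post
describes. The function name is hypothetical and this is not the actual
bloom_prop_bits_set code; it only shows the clear-the-lowest-set-bit idiom
that a compiler such as Clang may turn into a single popcnt instruction
when the target is allowed to use it (for example with -mpopcnt).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical example, not PostgreSQL code: count set bits by
 * repeatedly clearing the lowest set bit.  Some compilers recognize
 * this loop and may replace it with a popcnt instruction when the
 * target flags permit that.
 */
static int
count_bits_loop(uint64_t x)
{
	int			count = 0;

	while (x != 0)
	{
		x &= x - 1;				/* clears the lowest set bit */
		count++;
	}
	return count;
}

/* Small sanity check of the loop on known values. */
static void
count_bits_loop_check(void)
{
	assert(count_bits_loop(0) == 0);
	assert(count_bits_loop(0xF0F0) == 8);
}
```

An extension containing a loop like this never mentions popcnt in its
source, which is why the resulting binary can depend on the instruction
without the author noticing.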

> > I am able to measure performance gains from the patch.  In a 3.4GB
> > table containing a single column with just 10 statistics targets, I
> > got the following times after running ANALYZE on the table.
>
> I've tested it a bit too, and got similar results, with the patched version
> being slightly faster. But then I wonder if popcnt is the best solution here,
> since after some short research I found a paper [1], where the authors claim that:
>
>     Maybe surprisingly, we show that a vectorized approach using SIMD
>     instructions can be twice as fast as using the dedicated instructions on
>     recent Intel processors.
>
>
> [1]: https://arxiv.org/pdf/1611.07612.pdf

I can't imagine that using the number_of_ones[] array processing
8 bits at a time would be slower than POPCNT though.
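
To make the two approaches being compared concrete, here is a rough
sketch. All names below are hypothetical: ones_in_byte[] stands in for a
number_of_ones[]-style table, and the second function assumes a GCC- or
Clang-compatible compiler that provides __builtin_popcountll. Whether the
builtin becomes a single POPCNT instruction depends on the target flags
(such as -mpopcnt); without them the compiler falls back to a software
implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a number_of_ones[]-style table:
 * ones_in_byte[i] is the number of set bits in the byte value i.
 * Entry 0 is already zero because the array has static storage.
 */
static uint8_t ones_in_byte[256];

static void
init_ones_in_byte(void)
{
	for (int i = 1; i < 256; i++)
		ones_in_byte[i] = (uint8_t) (ones_in_byte[i >> 1] + (i & 1));
}

/* Table approach: look up 8 bits at a time. */
static int
popcount64_table(uint64_t word)
{
	int			count = 0;

	while (word != 0)
	{
		count += ones_in_byte[word & 0xFF];
		word >>= 8;
	}
	return count;
}

/* Builtin approach: assumes a GCC/Clang-compatible compiler. */
static int
popcount64_builtin(uint64_t word)
{
	return __builtin_popcountll(word);
}

/* Check that both approaches agree on a few values. */
static void
popcount64_check(void)
{
	init_ones_in_byte();
	assert(popcount64_table(0xF0F0) == popcount64_builtin(0xF0F0));
	assert(popcount64_table(UINT64_MAX) == 64);
}
```

The table version needs up to eight lookups and additions per 64-bit
word, while the builtin can be a single instruction when POPCNT is
available.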

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

