Re: [PATCH] SVE popcount support
From | Nathan Bossart |
---|---|
Subject | Re: [PATCH] SVE popcount support |
Date | |
Msg-id | Z6-4amKpP6fab-5a@nathan |
In reply to | Re: [PATCH] SVE popcount support (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | Re: [PATCH] SVE popcount support |
List | pgsql-hackers |
On Thu, Feb 06, 2025 at 10:33:35AM -0600, Nathan Bossart wrote:
> On Thu, Feb 06, 2025 at 08:44:35AM +0000, Chiranmoy.Bhattacharya@fujitsu.com wrote:
>>> Does this hand-rolled loop unrolling offer any particular advantage? What
>>> do the numbers look like if we don't do this or if we process, say, 4
>>> vectors at a time?
>>
>> The unrolled version performs better than the non-unrolled one, but
>> processing four vectors provides no additional benefit. The numbers
>> and code used are given below.
>
> Hm. Any idea why that is? I wonder if the compiler isn't using as many
> SVE registers as it could for this.

I've also noticed that the latest patch doesn't compile on my M3 macOS
machine. After a quick glance, I think the problem is that the
TRY_POPCNT_FAST macro is set, so it's trying to compile the assembly
versions.

../postgresql/src/port/pg_bitutils.c:230:41: error: invalid output constraint '=q' in asm
  230 | __asm__ __volatile__(" popcntl %1,%0\n":"=q"(res):"rm"(word):"cc");
      | ^
../postgresql/src/port/pg_bitutils.c:247:41: error: invalid output constraint '=q' in asm
  247 | __asm__ __volatile__(" popcntq %1,%0\n":"=q"(res):"rm"(word):"cc");
      | ^
2 errors generated.

--
nathan
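
To illustrate the failure mode described above, here is a minimal sketch of how an x86-only inline-assembly popcount can be kept away from non-x86 compilers. It is not the patch's code and not PostgreSQL's actual guard logic. The names DEMO_USE_X86_POPCNT_ASM, demo_popcount64 and demo_popcount64_portable are hypothetical, and the choice to gate on __x86_64__ and __POPCNT__ is an assumption made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical guard (an assumption for this sketch, not PostgreSQL's macro):
 * enable the inline-assembly path only when the compiler targets x86-64 and
 * reports POPCNT support.  On an Arm target such as Apple silicon neither
 * macro is defined, so the x86-only "=q" constraint is never compiled.
 */
#if defined(__x86_64__) && defined(__POPCNT__) && \
	(defined(__GNUC__) || defined(__clang__))
#define DEMO_USE_X86_POPCNT_ASM 1
#endif

/* Portable fallback: clear the lowest set bit until no bits remain. */
static inline int
demo_popcount64_portable(uint64_t word)
{
	int			count = 0;

	while (word != 0)
	{
		word &= word - 1;
		count++;
	}
	return count;
}

/* Count set bits, using the assembly path only where the guard allows it. */
static inline int
demo_popcount64(uint64_t word)
{
#ifdef DEMO_USE_X86_POPCNT_ASM
	uint64_t	res;

	/* Same instruction and constraints as in the error output above. */
	__asm__ __volatile__(" popcntq %1,%0\n" : "=q"(res) : "rm"(word) : "cc");
	return (int) res;
#else
	return demo_popcount64_portable(word);
#endif
}
```

With a guard of this shape, the same source builds on both x86-64 and Arm; callers use demo_popcount64() and get the portable loop wherever the assembly path is disabled.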