Re: [PATCH] SVE popcount support
From | Nathan Bossart |
---|---|
Subject | Re: [PATCH] SVE popcount support |
Date | |
Msg-id | Z94xjuN9X7J9lSdT@nathan |
In response to | Re: [PATCH] SVE popcount support ("Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>) |
Responses |
Re: [PATCH] SVE popcount support
Re: [PATCH] SVE popcount support |
List | pgsql-hackers |
I've been preparing these for commit, and I've attached what I have so far. A few notes:

* 0001 just renames the TRY_POPCNT_FAST macro to indicate that it's x86_64-specific. IMO this is worth doing independently of this patch set, but it's more important with the patch set since we need something similar for AArch64. I think we should also consider moving the x86_64 stuff to its own file (perhaps combining it with the AVX-512 stuff), but that can probably wait until later.

* 0002 introduces the Neon implementation, which conveniently doesn't need configure-time checks or function pointers. I noticed that some compilers (e.g., Apple clang 16) compile in Neon instructions already, but our hand-rolled implementation is better about instruction-level parallelism and seems to still be quite a bit faster.

* 0003 introduces the SVE implementation. You'll notice I've moved all the function pointer gymnastics into the pg_popcount_aarch64.c file, which is where the Neon implementations live, too. I also tried to clean up the configure checks a bit. I imagine it's possible to make them more compact, but I felt that the enhanced readability was worth it.

* For both Neon and SVE, I do see improvements with looping over 4 registers at a time, so IMHO it's worth doing so even if it performs the same as 2-register blocks on some hardware. I did add a 2-register block in the Neon implementation for processing the tail because I was worried about its performance on smaller buffers, but that part might get removed if I can't measure any difference.

I'm planning to run several more benchmarks, but everything I've seen thus far has looked pretty good.

--
nathan
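For readers who want to picture the loop structure described in the notes above, here is a minimal sketch of a Neon popcount with a 4-register main loop, a 2-register block for the tail, and a scalar loop for whatever remains. It is not the code from the attached patches: the names `sketch_popcount`, `sketch_popcount_scalar`, `sketch_count16`, and `SKETCH_HAVE_NEON` are made up for illustration, and the layout is only an assumption about how such a loop might look. On non-AArch64 builds the Neon path is compiled out and only the scalar loop runs.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1		/* hypothetical macro, not from the patch */

/* Bit counts for the 16 bytes at p, summed into two 64-bit lanes. */
static inline uint64x2_t
sketch_count16(const uint8_t *p)
{
	uint8x16_t	per_byte = vcntq_u8(vld1q_u8(p));	/* popcount of each byte */

	/* Widen pairwise: 16 x u8 -> 8 x u16 -> 4 x u32 -> 2 x u64. */
	return vpaddlq_u32(vpaddlq_u16(vpaddlq_u8(per_byte)));
}
#endif

/* Scalar reference: clear the lowest set bit until the byte is zero. */
static uint64_t
sketch_popcount_scalar(const uint8_t *buf, size_t len)
{
	uint64_t	total = 0;

	while (len-- > 0)
	{
		uint8_t		b = *buf++;

		while (b != 0)
		{
			b &= (uint8_t) (b - 1);
			total++;
		}
	}
	return total;
}

uint64_t
sketch_popcount(const uint8_t *buf, size_t len)
{
	uint64_t	total = 0;

#ifdef SKETCH_HAVE_NEON
	/* Four independent accumulators, so the adds do not form one long chain. */
	uint64x2_t	acc0 = vdupq_n_u64(0);
	uint64x2_t	acc1 = vdupq_n_u64(0);
	uint64x2_t	acc2 = vdupq_n_u64(0);
	uint64x2_t	acc3 = vdupq_n_u64(0);

	/* Main loop: four 16-byte registers (64 bytes) per iteration. */
	while (len >= 64)
	{
		acc0 = vaddq_u64(acc0, sketch_count16(buf));
		acc1 = vaddq_u64(acc1, sketch_count16(buf + 16));
		acc2 = vaddq_u64(acc2, sketch_count16(buf + 32));
		acc3 = vaddq_u64(acc3, sketch_count16(buf + 48));
		buf += 64;
		len -= 64;
	}

	/* Tail: one 2-register (32-byte) block, if enough bytes remain. */
	if (len >= 32)
	{
		acc0 = vaddq_u64(acc0, sketch_count16(buf));
		acc1 = vaddq_u64(acc1, sketch_count16(buf + 16));
		buf += 32;
		len -= 32;
	}

	/* Combine the accumulators and sum the two lanes. */
	total = vaddvq_u64(vaddq_u64(vaddq_u64(acc0, acc1),
								 vaddq_u64(acc2, acc3)));
#endif

	/* Remaining bytes (or the whole buffer when Neon is unavailable). */
	return total + sketch_popcount_scalar(buf, len);
}
```

Calling `sketch_popcount(buf, len)` on any byte buffer should give the same result as the scalar loop alone. Whether the 2-register tail block is worth keeping is exactly the kind of thing that only benchmarks on real hardware can answer.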