Re: Popcount optimization using AVX512
From | Nathan Bossart |
---|---|
Subject | Re: Popcount optimization using AVX512 |
Date | |
Msg-id | 20240402155301.GA2750455@nathanxps13 |
In reply to | Re: Popcount optimization using AVX512 (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | Re: Popcount optimization using AVX512 |
List | pgsql-hackers |
On Mon, Apr 01, 2024 at 05:11:17PM -0500, Nathan Bossart wrote:
> Here is a v19 of the patch set.  I moved out the refactoring of the
> function pointer selection code to 0001.  I think this is a good change
> independent of $SUBJECT, and I plan to commit this soon.  In 0002, I
> changed the syslogger.c usage of pg_popcount() to use pg_number_of_ones
> instead.  This is standard practice elsewhere where the popcount functions
> are unlikely to win.  I'll probably commit this one soon, too, as it's even
> more trivial than 0001.
>
> 0003 is the AVX512 POPCNT patch.  Besides refactoring out 0001, there are
> no changes from v18.  0004 is an early proof-of-concept for using AVX512
> for the visibility map code.  The code is missing comments, and I haven't
> performed any benchmarking yet, but I figured I'd post it because it
> demonstrates how it's possible to build upon 0003 in other areas.

I've committed the first two patches, and I've attached a rebased version
of the latter two.

> AFAICT the main open question is the function call overhead in 0003 that
> Alvaro brought up earlier.  After 0002 is committed, I believe the only
> in-tree caller of pg_popcount() with very few bytes is bit_count(), and I'm
> not sure it's worth expending too much energy to make sure there are
> absolutely no regressions there.  However, I'm happy to do so if folks feel
> that it is necessary, and I'd be grateful for thoughts on how to proceed on
> this one.

Another idea I had is to turn pg_popcount() into a macro that just uses the
pg_number_of_ones array when called for few bytes:

    static inline uint64
    pg_popcount_inline(const char *buf, int bytes)
    {
        uint64      popcnt = 0;

        while (bytes--)
            popcnt += pg_number_of_ones[(unsigned char) *buf++];

        return popcnt;
    }

    #define pg_popcount(buf, bytes) \
        ((bytes < 64) ? \
         pg_popcount_inline(buf, bytes) : \
         pg_popcount_optimized(buf, bytes))

But again, I'm not sure this is really worth it for the current use-cases.
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
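
For readers who want to try the size-based dispatch idea outside the PostgreSQL tree, here is a minimal standalone sketch. It is not the patch's code. Every `demo_` name is hypothetical. `demo_number_of_ones` stands in for `pg_number_of_ones`, and `demo_popcount_optimized` is a portable 64-bit bit-trick loop standing in for whatever `pg_popcount_optimized` resolves to at run time (AVX512 or otherwise). The 64-byte threshold is taken from the message above. The macro arguments are parenthesized here, which the snippet in the message does not do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for pg_number_of_ones: bits set in each byte value. */
#define B2(n) n, n + 1, n + 1, n + 2
#define B4(n) B2(n), B2(n + 1), B2(n + 1), B2(n + 2)
#define B6(n) B4(n), B4(n + 1), B4(n + 1), B4(n + 2)
static const uint8_t demo_number_of_ones[256] = {
    B6(0), B6(1), B6(1), B6(2)
};

/* Small-input path: one table lookup per byte, cheap enough to inline. */
static inline uint64_t
demo_popcount_inline(const char *buf, int bytes)
{
    uint64_t popcnt = 0;

    assert(bytes >= 0);
    while (bytes--)
        popcnt += demo_number_of_ones[(unsigned char) *buf++];
    return popcnt;
}

/* Portable popcount of one 64-bit word (classic parallel bit summing). */
static inline uint64_t
demo_popcount64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (x * 0x0101010101010101ULL) >> 56;
}

/*
 * Assumed stand-in for the out-of-line "optimized" routine: consumes eight
 * bytes per iteration, then finishes the tail with the lookup table.
 */
static uint64_t
demo_popcount_optimized(const char *buf, int bytes)
{
    uint64_t popcnt = 0;

    while (bytes >= 8)
    {
        uint64_t word;

        memcpy(&word, buf, sizeof(word));   /* avoids unaligned access */
        popcnt += demo_popcount64(word);
        buf += 8;
        bytes -= 8;
    }
    return popcnt + demo_popcount_inline(buf, bytes);
}

/*
 * Dispatch on length: short buffers skip the function call entirely.
 * Note that "bytes" is evaluated twice, so callers should not pass an
 * expression with side effects.
 */
#define demo_popcount(buf, bytes) \
    (((bytes) < 64) ? \
     demo_popcount_inline((buf), (bytes)) : \
     demo_popcount_optimized((buf), (bytes)))
```

With this sketch, a call such as `demo_popcount(buf, 3)` compiles down to the inlined table loop, while `demo_popcount(buf, 8192)` goes through the out-of-line routine. Whether the extra branch pays off for a caller like bit_count() is the open question raised in the message, and it would need benchmarking.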