Re: Popcount optimization using AVX512
From | Nathan Bossart |
---|---|
Subject | Re: Popcount optimization using AVX512 |
Date | |
Msg-id | 20240318173004.GA361138@nathanxps13 |
In response to | Re: Popcount optimization using AVX512 (Nathan Bossart <nathandbossart@gmail.com>) |
Responses |
Re: Popcount optimization using AVX512
Re: Popcount optimization using AVX512 |
List | pgsql-hackers |
On Mon, Mar 18, 2024 at 11:20:18AM -0500, Nathan Bossart wrote:
> I don't think David was suggesting that we need to remove the runtime
> checks for AVX512. IIUC he was pointing out that most of the performance
> gain is from removing the function call overhead, which your v8-0002 patch
> already does for the proposed AVX512 code. We can apply a similar
> optimization for systems without AVX512 by inlining the code for
> pg_popcount64() and pg_popcount32().

Here is a more fleshed-out version of what I believe David is proposing. On my machine, the gains aren't quite as impressive (~8.8s to ~5.2s for the test_popcount benchmark). I assume this is because this patch turns pg_popcount() into a function pointer, which is what the AVX512 patches do, too. I left out the 32-bit section from pg_popcount_fast(), but I'll admit that I'm not yet 100% sure that we can assume we're on a 64-bit system there.

IMHO this work is arguably a prerequisite for the AVX512 work, as turning pg_popcount() into a function pointer will likely regress performance for folks on systems without AVX512 otherwise.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
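
For readers without the attachment at hand, the general shape of the idea discussed above can be sketched as follows. This is an illustration only, not the attached patch and not PostgreSQL's actual code: every name here (`sketch_popcount`, `popcount_fast`, `popcount_slow`, `popcount_choose`, `have_popcnt_instruction`) is hypothetical, and the sketch assumes GCC or Clang for the `__builtin_*` calls. The point it tries to show is that the indirect call through the function pointer happens once per buffer, while the per-word popcount is inlined into the loop rather than reached through a separate call for each word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature shared by all buffer-level implementations. */
typedef uint64_t (*popcount_fn) (const char *buf, size_t bytes);

/* Portable fallback: clear the lowest set bit of each byte until it is zero. */
static uint64_t
popcount_slow(const char *buf, size_t bytes)
{
    uint64_t    count = 0;

    while (bytes--)
    {
        unsigned char b = (unsigned char) *buf++;

        while (b)
        {
            b &= (unsigned char) (b - 1);
            count++;
        }
    }
    return count;
}

/*
 * Fast path: the per-word popcount is a compiler builtin expanded inside
 * the loop, so there is no per-word function call.
 */
static uint64_t
popcount_fast(const char *buf, size_t bytes)
{
    uint64_t    count = 0;

    while (bytes >= sizeof(uint64_t))
    {
        uint64_t    word;

        memcpy(&word, buf, sizeof(word));   /* avoids unaligned access */
        count += (uint64_t) __builtin_popcountll(word);
        buf += sizeof(word);
        bytes -= sizeof(word);
    }
    while (bytes--)                         /* trailing bytes */
        count += (uint64_t) __builtin_popcount((unsigned char) *buf++);
    return count;
}

/* Assumption: only x86 is probed here; other platforms use the fallback. */
static int
have_popcnt_instruction(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt") != 0;
#else
    return 0;
#endif
}

static uint64_t popcount_choose(const char *buf, size_t bytes);

/* Starts at the chooser; the first call replaces it with a real routine. */
static popcount_fn popcount_impl = popcount_choose;

static uint64_t
popcount_choose(const char *buf, size_t bytes)
{
    popcount_impl = have_popcnt_instruction() ? popcount_fast : popcount_slow;
    return popcount_impl(buf, bytes);
}

/* Public entry point: one indirect call per buffer, not per word. */
uint64_t
sketch_popcount(const char *buf, size_t bytes)
{
    return popcount_impl(buf, bytes);
}
```

A caller would simply invoke `sketch_popcount(buf, len)`. In this sketch the runtime check runs only on the first call, and a system that fails it pays one pointer indirection per buffer on top of the fallback loop, which is the kind of cost the message above is concerned about for systems without AVX512.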