use ARM intrinsics in pg_lfind32() where available
From | Nathan Bossart |
---|---|
Subject | use ARM intrinsics in pg_lfind32() where available |
Date | |
Msg-id | 20220819200829.GA395728@nathanxps13 |
Responses |
Re: use ARM intrinsics in pg_lfind32() where available
|
List | pgsql-hackers |
Hi hackers,

This is a follow-up for recent changes that optimized [sub]xip lookups in XidInMVCCSnapshot() on Intel hardware [0] [1]. I've attached a patch that uses ARM Advanced SIMD (Neon) intrinsic functions where available to speed up the search. The approach is nearly identical to the SSE2 version, and the usual benchmark [2] shows similar improvements.

    writers  head  simd
          8   866   836
         16   849   833
         32   782   822
         64   846   833
        128   805   821
        256   722   739
        512   529   674
        768   374   608
       1024   268   522

I've tested the patch on a recent macOS (M1 Pro) and Amazon Linux (Graviton2), and I've confirmed that the instructions aren't used on a Linux/Intel machine. I did add a new configure check to see if the relevant intrinsics are available, but I didn't add a runtime check like there is for the CRC instructions since the compilers I used support these intrinsics by default. (I don't think a runtime check would work very well with the inline function, anyway.) AFAICT these intrinsics are pretty standard on aarch64, although IIUC the spec indicates that they are technically optional. I suspect that a simple check for "aarch64" would be sufficient, but I haven't investigated the level of compiler support yet.

Thoughts?

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b6ef167
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=37a6e5d
[2] https://postgr.es/m/057a9a95-19d2-05f0-17e2-f46ff20e9b3e@2ndquadrant.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
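For readers without the attachment at hand, here is a minimal sketch of what a Neon-based linear search over 32-bit values could look like. It is not the code from the patch: the name lfind32_sketch, the SKETCH_USE_NEON macro, and the choice of 16 elements per iteration are assumptions made for illustration. The compile-time guard is the simple "aarch64" style check speculated about above, not the configure check the patch adds. On other platforms the sketch falls back to the plain scalar loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a plain compile-time guard stands in for the configure check. */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_USE_NEON 1
#endif

/*
 * Hypothetical helper: returns true if "key" occurs in the first "nelem"
 * entries of "base".
 */
static inline bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t      i = 0;

    assert(base != NULL || nelem == 0);

#ifdef SKETCH_USE_NEON
    /* Broadcast the key into all four 32-bit lanes. */
    const uint32x4_t keys = vdupq_n_u32(key);

    /* Round down to a multiple of 16 elements (four vectors per iteration). */
    const size_t tail = nelem & ~(size_t) 15;

    for (; i < tail; i += 16)
    {
        /* Each lane becomes all-ones where the element equals the key. */
        uint32x4_t  r1 = vceqq_u32(keys, vld1q_u32(&base[i]));
        uint32x4_t  r2 = vceqq_u32(keys, vld1q_u32(&base[i + 4]));
        uint32x4_t  r3 = vceqq_u32(keys, vld1q_u32(&base[i + 8]));
        uint32x4_t  r4 = vceqq_u32(keys, vld1q_u32(&base[i + 12]));

        /* Combine the four comparison results into one vector. */
        uint32x4_t  any = vorrq_u32(vorrq_u32(r1, r2), vorrq_u32(r3, r4));

        /* A nonzero horizontal maximum means at least one lane matched. */
        if (vmaxvq_u32(any) != 0)
            return true;
    }
#endif

    /* Scalar loop: handles the remainder, or everything without Neon. */
    for (; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }

    return false;
}
```

Because the function is inline and the vector path is selected by the preprocessor, there is nowhere cheap to put a runtime capability test, which matches the concern raised above about runtime checks.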