Re: [PATCH] SVE popcount support
From | Nathan Bossart |
---|---|
Subject | Re: [PATCH] SVE popcount support |
Date | |
Msg-id | Z-R1OP2s3mYs_DIP@nathan |
In reply to | Re: [PATCH] SVE popcount support (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | Re: [PATCH] SVE popcount support |
List | pgsql-hackers |
I've attached a new set of patches in which I've tried to address John's feedback.

I ran some new benchmarks with these patches. "M3" is an Apple M3 (my laptop), "G3" is an r7g.4xlarge, and "G4" is an r8g.4xlarge. "no SVE" means the patches are applied but the function pointer points to the Neon implementation. "SVE" and "patched" mean all the patches are applied with no changes.

 8 byte words | M3 HEAD | M3 patched | G3 HEAD | G3 no SVE | G3 SVE  | G4 HEAD | G4 no SVE | G4 SVE
--------------+---------+------------+---------+-----------+---------+---------+-----------+---------
            1 |     3.6 |        3.0 |     3.1 |       2.9 |     3.1 |     2.5 |       2.2 |     1.8
            2 |     6.4 |        4.4 |     3.1 |       3.0 |     3.1 |     2.5 |       2.5 |     2.0
            3 |     7.3 |        6.9 |     3.5 |       3.5 |     3.1 |     3.3 |       3.2 |     2.0
            4 |     8.0 |        3.8 |     4.0 |       2.7 |     4.7 |     3.6 |       2.2 |     2.7
            5 |     9.4 |        5.5 |     4.6 |       2.8 |     4.6 |     3.9 |       2.5 |     2.7
            6 |     7.9 |        5.0 |     5.1 |       3.5 |     4.7 |     4.3 |       3.1 |     3.4
            7 |    10.2 |        7.4 |     5.9 |       4.0 |     4.7 |     4.7 |       3.6 |     3.4
            8 |    12.0 |        5.4 |     6.5 |       4.0 |     5.9 |     5.0 |       3.2 |     2.5
            9 |    11.7 |        6.5 |     7.2 |       4.3 |     5.9 |     5.4 |       3.6 |     2.5
           10 |    12.5 |        5.4 |     8.0 |       4.8 |     5.9 |     6.2 |       3.9 |     3.1
           11 |    14.0 |        8.6 |     8.5 |       5.5 |     5.9 |     6.1 |       5.0 |     3.1
           12 |    13.1 |        5.7 |     9.1 |       5.1 |     7.4 |     6.4 |       3.9 |     3.6
           13 |    12.1 |        6.8 |     9.8 |       5.4 |     7.3 |     6.8 |       4.3 |     3.6
           14 |    16.4 |        7.8 |    10.4 |       5.9 |     7.4 |     7.2 |       4.7 |     4.4
           15 |    17.4 |        8.0 |    11.1 |       6.6 |     7.4 |     7.5 |       5.7 |     4.4
           16 |    15.5 |        5.7 |    11.8 |       5.7 |     4.7 |     7.9 |       5.0 |     3.5
           32 |    26.0 |       16.2 |    22.7 |      10.3 |     6.2 |    16.8 |       8.4 |     5.2
           64 |    38.5 |       20.3 |    42.7 |      20.1 |     9.3 |    31.8 |      15.4 |     8.8
          128 |    75.1 |       35.7 |    86.1 |      35.0 |    15.4 |    80.2 |      28.6 |    16.3
          256 |   117.7 |       51.8 |   179.6 |      68.2 |    27.8 |   154.0 |      55.7 |    30.9
          512 |   198.5 |       93.1 |   329.3 |     134.4 |    52.4 |   246.5 |     110.2 |    59.4
         1024 |   355.0 |      159.2 |   673.6 |     265.8 |   101.7 |   487.0 |     219.0 |   114.7
         2048 |   669.5 |      288.8 |  1294.7 |     529.7 |   200.3 |   969.3 |     438.7 |   228.5
         4096 |  1308.0 |      552.8 |  2784.3 |    1063.0 |   397.4 |  1934.5 |     874.4 |   455.9

IMHO these are acceptable results, at least for the use-cases I see in the tree. We might be able to minimize the difference between the Neon and SVE implementations on the low end with some additional code, but I'm really not sure if it's worth the effort.

Barring feedback or objections, I'm planning to commit these on Friday.

--
nathan
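
For readers unfamiliar with the "no SVE" configuration in the benchmarks above, here is a minimal sketch of function-pointer dispatch, where the pointer can be left on the baseline implementation even when a faster one is available. This is not the code from the patches: every name here (popcount_bytes, popcount_impl, popcount_neon_standin, popcount_sve_standin, pretend_sve_available) is hypothetical, and both implementations are portable scalar stand-ins rather than real Neon or SVE intrinsics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t (*popcount_fn) (const char *buf, size_t bytes);

/* Hypothetical probe result; a real build would query the CPU at run time. */
static bool pretend_sve_available = false;

/* Count bits in one byte by repeatedly clearing the lowest set bit. */
static uint64_t
popcount_byte(unsigned char b)
{
	uint64_t	n = 0;

	for (; b != 0; b &= (unsigned char) (b - 1))
		n++;
	return n;
}

/* Stand-in for the baseline (Neon) path: simple byte-at-a-time loop. */
static uint64_t
popcount_neon_standin(const char *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

	while (bytes-- > 0)
		popcnt += popcount_byte((unsigned char) *buf++);
	return popcnt;
}

/* Stand-in for the wider (SVE) path: SWAR count on 8-byte words, then a tail. */
static uint64_t
popcount_sve_standin(const char *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

	while (bytes >= sizeof(uint64_t))
	{
		uint64_t	x;

		memcpy(&x, buf, sizeof(x));	/* avoids unaligned access */
		x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
		x = (x & UINT64_C(0x3333333333333333)) +
			((x >> 2) & UINT64_C(0x3333333333333333));
		x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
		popcnt += (x * UINT64_C(0x0101010101010101)) >> 56;
		buf += sizeof(uint64_t);
		bytes -= sizeof(uint64_t);
	}
	while (bytes-- > 0)
		popcnt += popcount_byte((unsigned char) *buf++);
	return popcnt;
}

static uint64_t popcount_choose(const char *buf, size_t bytes);

/* Starts at the chooser, which replaces itself on the first call. */
static popcount_fn popcount_impl = popcount_choose;

static uint64_t
popcount_choose(const char *buf, size_t bytes)
{
	popcount_impl = pretend_sve_available ?
		popcount_sve_standin : popcount_neon_standin;
	return popcount_impl(buf, bytes);
}

/* Entry point: every call goes through the pointer. */
static uint64_t
popcount_bytes(const char *buf, size_t bytes)
{
	assert(buf != NULL || bytes == 0);
	return popcount_impl(buf, bytes);
}
```

In this sketch, a "no SVE" style run corresponds to leaving popcount_impl on popcount_neon_standin regardless of what the probe reports, so the indirect call overhead is still paid but the wider implementation is never used.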