autovectorize page checksum code included elsewhere
From | Nathan Bossart |
---|---|
Subject | autovectorize page checksum code included elsewhere |
Date | |
Msg-id | 20231107024734.GB729644@nathanxps13 |
Responses | Re: autovectorize page checksum code included elsewhere (3 replies) |
List | pgsql-hackers |
(Unfortunately, I'm posting this too late for the November commitfest, but I'm hoping this will be the first in a series of proposed improvements involving SIMD instructions for v17.)

Presently, we ask compilers to autovectorize checksum.c and numeric.c. The page checksum code actually lives in checksum_impl.h, and checksum.c just includes it. But checksum_impl.h is also used in pg_upgrade/file.c and pg_checksums.c, and since we don't ask compilers to autovectorize those files, the page checksum code may remain un-vectorized.

The attached patch is a quick attempt at adding CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE to the CFLAGS for the aforementioned objects. The gains are modest (i.e., some system CPU and/or a few percentage points on the total time), but it seemed like a no-brainer.

Separately, I'm wondering whether we should consider using CFLAGS_VECTORIZE on the whole tree. Commit fdea253 seems to be responsible for introducing this targeted autovectorization strategy, and AFAICT this was just done to minimize the impact elsewhere while optimizing page checksums. Are there fundamental problems with adding CFLAGS_VECTORIZE everywhere? Or is it just waiting on someone to do the analysis/benchmarking?

[0] https://postgr.es/m/1367013190.11576.249.camel%40sussancws0025

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
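For readers unfamiliar with why per-file flags matter here, the following is a minimal sketch of a header-style checksum loop of the kind compilers can autovectorize. It is not the code in checksum_impl.h; the names N_LANES, MIX_PRIME, mix, and sketch_checksum, the seed values, and the lane count are all assumptions made for illustration. The point is that the inner loop updates independent lanes, so a compiler that has been asked to vectorize (with GCC, typically via -ftree-vectorize, often combined with -funroll-loops) can process several lanes per instruction. If the same header is included from a file built without those flags, the identical source may compile to a scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES   32           /* hypothetical number of independent partial sums */
#define MIX_PRIME 16777619u    /* the 32-bit FNV prime */

/* Mix one 32-bit word into one lane: multiply-and-xor with an extra shift. */
static inline uint32_t
mix(uint32_t sum, uint32_t value)
{
    uint32_t tmp = sum ^ value;

    return (tmp * MIX_PRIME) ^ (tmp >> 17);
}

/*
 * Checksum a buffer of 32-bit words whose length is a multiple of N_LANES.
 * Lane j depends only on words j, j + N_LANES, j + 2 * N_LANES, ... so the
 * inner loop carries no dependency between lanes and is a candidate for
 * autovectorization.
 */
static uint32_t
sketch_checksum(const uint32_t *words, size_t nwords)
{
    uint32_t sums[N_LANES];
    uint32_t result = 0;

    assert(nwords % N_LANES == 0);

    /* Arbitrary per-lane seeds (an assumption of this sketch). */
    for (size_t j = 0; j < N_LANES; j++)
        sums[j] = (uint32_t) (j + 1);

    /* Main loop: one pass over the buffer, N_LANES words at a time. */
    for (size_t i = 0; i < nwords; i += N_LANES)
        for (size_t j = 0; j < N_LANES; j++)
            sums[j] = mix(sums[j], words[i + j]);

    /* Fold the partial sums into a single value. */
    for (size_t j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}
```

Whether the loop is actually vectorized depends on the compiler, its version, and the flags used for the particular object file, which is why the same header can perform differently in checksum.c than in the other files that include it.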
Attachments