Re: CRC32C Parallel Computation Optimization on ARM

From: Nathan Bossart
Subject: Re: CRC32C Parallel Computation Optimization on ARM
Date:
Msg-id: 20231025154325.GB981848@nathanxps13
In reply to: RE: CRC32C Parallel Computation Optimization on ARM  (Xiang Gao <Xiang.Gao@arm.com>)
Responses: RE: CRC32C Parallel Computation Optimization on ARM
List: pgsql-hackers

+pg_crc32c
+pg_comp_crc32c_with_vmull_armv8(pg_crc32c crc, const void *data, size_t len)

It looks like most of this function is duplicated from
pg_comp_crc32c_armv8().  I understand that we probably need a separate
function because of the runtime check, but perhaps we could create a common
static inline helper function with a branch for when vmull_p64() can be
used.  Its callers would then just provide a boolean to indicate which
branch to take.
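
To illustrate the shape I have in mind, here is a rough, portable sketch.
It is not the patch's code: crc32c_t, crc32c_byte(), crc32c_common(),
crc32c_plain() and crc32c_with_vmull() are hypothetical names, and the
"vmull" branch is only a 4-byte unrolled stand-in for the real vmull_p64()
block loop so that the example compiles on any platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t crc32c_t;		/* stand-in for pg_crc32c */

/* Portable stand-in for one CRC32C byte step (reflected poly 0x82F63B78). */
static inline crc32c_t
crc32c_byte(crc32c_t crc, unsigned char b)
{
	crc ^= b;
	for (int i = 0; i < 8; i++)
		crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	return crc;
}

/*
 * Hypothetical shared helper.  Each caller passes a constant for use_vmull,
 * so after inlining the compiler can discard the branch it does not need.
 */
static inline crc32c_t
crc32c_common(crc32c_t crc, const void *data, size_t len, bool use_vmull)
{
	const unsigned char *p = data;

	assert(data != NULL || len == 0);

	if (use_vmull)
	{
		/* Placeholder for the vmull_p64() block loop (assumption). */
		while (len >= 4)
		{
			crc = crc32c_byte(crc, p[0]);
			crc = crc32c_byte(crc, p[1]);
			crc = crc32c_byte(crc, p[2]);
			crc = crc32c_byte(crc, p[3]);
			p += 4;
			len -= 4;
		}
	}

	/* Shared tail; this handles the whole input when use_vmull is false. */
	while (len-- > 0)
		crc = crc32c_byte(crc, *p++);

	return crc;
}

/* Thin entry points; only the boolean differs. */
crc32c_t
crc32c_plain(crc32c_t crc, const void *data, size_t len)
{
	return crc32c_common(crc, data, len, false);
}

crc32c_t
crc32c_with_vmull(crc32c_t crc, const void *data, size_t len)
{
	return crc32c_common(crc, data, len, true);
}
```

Both entry points must return the same value for the same input, so the
duplicated setup and tail handling live in one place and only the bulk loop
differs.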

+# Use ARM VMULL if available and ARM CRC32C intrinsic is avaliable too.
+if test x"$USE_ARMV8_VMULL" = x"" && (test x"$USE_ARMV8_CRC32C" = x"1" || test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"); then
 
+  if test x"$pgac_armv8_vmull_intrinsics" = x"yes"; then
+    USE_ARMV8_VMULL=1
+  fi
+fi

Hm.  I wonder if we need to switch to a runtime check in some cases.  For
example, what happens if the ARMv8 intrinsics used today are found with the
default compiler flags, but vmull_p64() is only available if
-march=armv8-a+crypto is added?  It looks like the precedent is to use a
runtime check if we need extra CFLAGS to produce code that uses the
intrinsics.

Separately, I wonder if we should just always do runtime checks for the CRC
stuff whenever we can produce code with the intrinsics, regardless of
whether we need extra CFLAGS.  The check doesn't look terribly expensive,
and it might allow us to simplify the code a bit (especially now that we
support a few different architectures).
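
For the sake of discussion, an always-on runtime check could look roughly
like the sketch below: a function pointer that starts out at a chooser and
replaces itself on first use.  All names here (crc32c_impl, crc32c_choose,
cpu_has_crc32c_and_pmull, and so on) are made up for illustration, the
"hw" routine just delegates to the portable one so the example runs
anywhere, and the getauxval() probe is only one possible way to detect the
features on Linux/aarch64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

typedef uint32_t crc32c_t;		/* stand-in for pg_crc32c */
typedef crc32c_t (*crc32c_fn) (crc32c_t crc, const void *data, size_t len);

/* Portable bitwise fallback (reflected polynomial 0x82F63B78). */
static crc32c_t
crc32c_sw(crc32c_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Placeholder for the intrinsics version; delegates so the sketch is portable. */
static crc32c_t
crc32c_hw(crc32c_t crc, const void *data, size_t len)
{
	return crc32c_sw(crc, data, len);
}

/* Hypothetical probe: nonzero if both CRC32 and PMULL are usable at runtime. */
static int
cpu_has_crc32c_and_pmull(void)
{
#if defined(__linux__) && defined(__aarch64__)
	unsigned long hwcap = getauxval(AT_HWCAP);

	return (hwcap & HWCAP_CRC32) != 0 && (hwcap & HWCAP_PMULL) != 0;
#else
	return 0;					/* assume unavailable elsewhere */
#endif
}

static crc32c_t crc32c_choose(crc32c_t crc, const void *data, size_t len);

/* Callers always go through this pointer. */
crc32c_fn	crc32c_impl = crc32c_choose;

/* The first call lands here, picks an implementation, then forwards. */
static crc32c_t
crc32c_choose(crc32c_t crc, const void *data, size_t len)
{
	crc32c_impl = cpu_has_crc32c_and_pmull() ? crc32c_hw : crc32c_sw;
	assert(crc32c_impl != crc32c_choose);
	return crc32c_impl(crc, data, len);
}
```

The probe runs once; every later call costs one indirect call, which is why
I suspect the check is cheap enough to do unconditionally.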

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


