Re: CRC32C Parallel Computation Optimization on ARM
| От | Nathan Bossart |
|---|---|
| Тема | Re: CRC32C Parallel Computation Optimization on ARM |
| Дата | |
| Msg-id | 20231110163608.GA1225566@nathanxps13 обсуждение исходный текст |
| Ответ на | RE: CRC32C Parallel Computation Optimization on ARM (Xiang Gao <Xiang.Gao@arm.com>) |
| Ответы |
RE: CRC32C Parallel Computation Optimization on ARM
|
| Список | pgsql-hackers |
On Tue, Nov 07, 2023 at 08:05:45AM +0000, Xiang Gao wrote:
> I think I understand what you mean, this is the latest patch. Thank you!
Thanks for the new patch.
+# PGAC_ARMV8_VMULL_INTRINSICS
+# ----------------------------
+# Check if the compiler supports the vmull_p64
+# intrinsic functions. These instructions
+# were first introduced in ARMv8 crypto Extension.
I wonder if it'd be better to call this PGAC_ARMV8_CRYPTO_INTRINSICS since
this check seems to indicate the presence of +crypto. Presumably there are
other instructions in this extension that could be used elsewhere, in which
case we could reuse this.
+# Use ARM VMULL if available and ARM CRC32C intrinsic is avaliable too.
+if test x"$USE_ARMV8_VMULL" = x"" && test x"$USE_ARMV8_VMULL_WITH_RUNTIME_CHECK" = x"" && (test x"$USE_ARMV8_CRC32C" =
x"1"|| test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"); then
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes" && test x"$CFLAGS_VMULL" = x""; then
+ USE_ARMV8_VMULL=1
+ else
+ if test x"$pgac_armv8_vmull_intrinsics" = x"yes"; then
+ USE_ARMV8_VMULL_WITH_RUNTIME_CHECK=1
+ fi
+ fi
+fi
I'm not sure I see the need to check USE_ARMV8_CRC32C* when setting these.
Couldn't we set them solely on the results of our
PGAC_ARMV8_VMULL_INTRINSICS check? It looks like this is what you are
doing in meson.build already.
+extern pg_crc32c pg_comp_crc32c_with_vmull_armv8(pg_crc32c crc, const void *data, size_t len);
nitpick: Maybe pg_comp_crc32_armv8_parallel()?
-# all versions of pg_crc32c_armv8.o need CFLAGS_CRC
-pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
-pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)
Why are these lines deleted?
- ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 'crc'],
+ ['pg_crc32c_armv8', 'USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK'],
What is the purpose of this change?
+__attribute__((target("+crc+crypto")))
I'm not sure we can assume that all compilers will understand this, and I'm
not sure we need it.
+ if (use_vmull)
+ {
+/*
+ * Crc32c parallel computation Input data is divided into three
+ * equal-sized blocks. Block length : 42 words(42 * 8 bytes).
+ * CRC0: 0 ~ 41 * 8,
+ * CRC1: 42 * 8 ~ (42 * 2 - 1) * 8,
+ * CRC2: 42 * 2 * 8 ~ (42 * 3 - 1) * 8.
+ */
Shouldn't we surround this with #ifdefs for USE_ARMV8_VMULL*?
if (pg_crc32c_armv8_available())
+ {
+#if defined(USE_ARMV8_VMULL)
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+#elif defined(USE_ARMV8_VMULL_WITH_RUNTIME_CHECK)
+ if (pg_vmull_armv8_available())
+ {
+ pg_comp_crc32c = pg_comp_crc32c_with_vmull_armv8;
+ }
+ else
+ {
+ pg_comp_crc32c = pg_comp_crc32c_armv8;
+ }
+#else
pg_comp_crc32c = pg_comp_crc32c_armv8;
+#endif
+ }
IMO it'd be better to move the #ifdefs into the functions so that we can
simplify this to something like
if (pg_crc32c_armv8_available())
{
if (pg_crc32c_armv8_crypto_available())
pg_comp_crc32c = pg_comp_crc32c_armv8_parallel;
else
pg_comp_crc32c = pg_comp_crc32c_armv8;
else
pc_comp_crc32c = pg_comp_crc32c_sb8;
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления: