Обсуждение: Proposal for enabling auto-vectorization for checksum calculations
Hello, This patch enables more compiler autovectorization for the checksum calculations. This code is particularly well suited for autovectorization, so just adding pg_attribute_target and some simple dynamic dispatch logic we can get improved vectorization. This gives about a 2x speedup in a synthetic benchmark for pg_checksum, which is also included as a seperate patch file. Additionally, another 2x performance increase in the synthetic benchmark with AVX2 can be obtained if N_SUMS was changed to 64. However, this would change the results of the checksum. This isn't included in this patch, but I think it is worth considering for the future One additional factor, without explicitly passing some optimization flag like -O2 the makefile build won't autovectorize any of the code. However, the meson based build does this automatically.
Вложения
Hello! I'm still trying to figure out those CI failures, I just wanted to update things. From my testing, with this patch repeatedly disabling/enabling checksums is about 12.4% on an approximately 15 GB database. By the way, I'd love it if anyone could help me figure out how to replicate a CI failure in the Cirrus CI. I haven't been able to figure out how to test CI runs locally, does anyone know a good method to do that?
On Thu, May 8, 2025 at 6:57 AM Matthew Sterrett <matthewsterrett2@gmail.com> wrote:
Hello! I'm still trying to figure out those CI failures, I just wanted
to update things.
From my testing, with this patch repeatedly disabling/enabling
checksums is about 12.4% on an approximately 15 GB database.
By the way, I'd love it if anyone could help me figure out how to
replicate a CI failure in the Cirrus CI.
I haven't been able to figure out how to test CI runs locally, does
anyone know a good method to do that?
Hi Matthew,
Thanks for the patch!
I ran some timing tests:
(without avx2)
Time: 4034.351 ms
SELECT drive_pg_checksum(512);
(with avx2)
Time: 3559.076 ms
SELECT drive_pg_checksum(512);
Also attached two patches that should fix the CI issues.
Best,
Stepan Neretin
Hello! Thanks for helping me with this. I'm still trying to figure out what is going on with the Bookworm test failures. I'm pretty sure this patchset should resolve all the issues with the macOS build, but I don't think it will help the linux failures unfortunately. On Sat, May 10, 2025 at 4:02 AM Stepan Neretin <slpmcf@gmail.com> wrote: > > > > On Sat, May 10, 2025 at 6:01 PM Stepan Neretin <slpmcf@gmail.com> wrote: >> >> >> >> On Thu, May 8, 2025 at 6:57 AM Matthew Sterrett <matthewsterrett2@gmail.com> wrote: >>> >>> Hello! I'm still trying to figure out those CI failures, I just wanted >>> to update things. >>> >>> From my testing, with this patch repeatedly disabling/enabling >>> checksums is about 12.4% on an approximately 15 GB database. >>> >>> By the way, I'd love it if anyone could help me figure out how to >>> replicate a CI failure in the Cirrus CI. >>> I haven't been able to figure out how to test CI runs locally, does >>> anyone know a good method to do that? >>> >>> >> >> Hi Matthew, >> >> Thanks for the patch! >> >> I ran some timing tests: >> >> (without avx2) >> >> Time: 4034.351 ms >> SELECT drive_pg_checksum(512); >> >> (with avx2) >> >> Time: 3559.076 ms >> SELECT drive_pg_checksum(512); >> >> Also attached two patches that should fix the CI issues. >> >> Best, >> >> Stepan Neretin >> >> >> > > Oops, forgot to attach patches :) > > Best, > > Stepan Neretin > >
Вложения
Hi, On Tue, 20 May 2025 at 02:54, Matthew Sterrett <matthewsterrett2@gmail.com> wrote: > > Hello! Thanks for helping me with this. > I'm still trying to figure out what is going on with the Bookworm test > failures. I'm pretty sure this patchset should resolve all the issues > with the macOS build, but I don't think it will help the linux > failures unfortunately. You can see the failure at the artifacts -> 'log/tmp_install/log/install.log' file on the CI web page [1]. If you want to replicate that on your local: $ ./configure --with-llvm CLANG="ccache clang-16" $ make -s -j8 world-bin $ make -j8 check-world should be enough. I was able to replicate it with these commands. I hope these help. [1] https://cirrus-ci.com/task/4834162550505472 -- Regards, Nazir Bilal Yavuz Microsoft
> You can see the failure at the artifacts -> > 'log/tmp_install/log/install.log' file on the CI web page [1]. > > If you want to replicate that on your local: > > $ ./configure --with-llvm CLANG="ccache clang-16" > $ make -s -j8 world-bin > $ make -j8 check-world > > should be enough. I was able to replicate it with these commands. I > hope these help. Thanks so much for helping me figure this out! Okay, I've determined that versions of LLVM/Clang before 19 crash when compiling this patch for some reason; it seems that both make check-world and make install will crash with the affected LLVM versions. Unfortunately, what matters seems to be the version of the linker/LTO optimizer, which I don't think we can check at compile time. I added a check for Clang>=19 which works at preventing the crash on my system. I think it's possible some unusual combination of clang/LLVM might still crash during the build, but I think this is a reasonable solution
Вложения
- v4-0005-Use-dummy-function-to-avoid-linker-error-move-dec.patch
- v4-0004-fix-bench-compiling.patch
- v4-0002-Fix-compilation-on-systems-where-immintrin.h-is-n.patch
- v4-0001-Enable-autovectorizing-pg_checksum_block.patch
- v4-0003-Benchmark-code-for-postgres-checksums.patch
- v4-0006-Workaround-for-clang-19-crash.patch
On Fri, May 23, 2025 at 4:54 AM Matthew Sterrett <matthewsterrett2@gmail.com> wrote: > Okay, I've determined that versions of LLVM/Clang before 19 crash when > compiling this patch for some reason; it seems that both make > check-world and make install will crash with the affected LLVM > versions. > Unfortunately, what matters seems to be the version of the linker/LTO > optimizer, which I don't think we can check at compile time. > I added a check for Clang>=19 which works at preventing the crash on my system. > I think it's possible some unusual combination of clang/LLVM might > still crash during the build, but I think this is a reasonable > solution I don't know if this is related to the crashes, but it doesn't seem like a good idea to #include the function pointer stuff everywhere, that should probably go into src/port like the others. -- John Naylor Amazon Web Services
Hi John, Thanks for the feedback. This is v5 of the patchset, updated following your comments: - Moved the function pointer definitions out of common headers and into src/port, consistent with existing practice. Thanks again for the guidance. Best regards, Kim Andrew
From: Andrew Kim <andrew.kim@intel.com>
---
config/c-compiler.m4 | 31 +++++
configure | 52 +++++++++
configure.ac | 9 ++
meson.build | 28 +++++
src/include/pg_config.h.in | 3 +
src/include/storage/checksum_impl.h | 90 +++-----------
src/port/Makefile | 1 +
src/port/meson.build | 1 +
src/port/pg_checksum_dispatch.c | 174 ++++++++++++++++++++++++++++
9 files changed, 318 insertions(+), 71 deletions(-)
create mode 100644 src/port/pg_checksum_dispatch.c
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_XSAVE_INTRINSICS
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf@<:@sizeof(__m256i)@:>@;
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }],
+ [return avx2_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
# PGAC_AVX512_POPCNT_INTRINSICS
# -----------------------------
# Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }
+int
+main ()
+{
+return avx2_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx2_support=yes
+else
+ pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+
+ if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ PGAC_AVX2_SUPPORT()
+ if test x"$pgac_avx2_support" = x"yes"; then
+ AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
endif
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+ prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+}
+'''
+
+ if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+ cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+ endif
+
+endif
+
###############################################################
# Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
/* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
#undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
/* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
#undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
*/
#include "storage/bufpage.h"
+#include "pg_config.h"
/* number of checksums to calculate in parallel */
#define N_SUMS 32
/* prime multiplier of FNV-1a hash */
#define FNV_PRIME 16777619
+
/* Use a union so that this code is valid under strict aliasing */
typedef union
{
@@ -142,74 +144,20 @@ do { \
* Block checksum algorithm. The page must be adequately aligned
* (at least on 4-byte boundary).
*/
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
- uint32 sums[N_SUMS];
- uint32 result = 0;
- uint32 i,
- j;
-
- /* ensure that the size is compatible with the algorithm */
- Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
- /* initialize partial checksums to their corresponding offsets */
- memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
- /* main checksum calculation */
- for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], page->data[i][j]);
-
- /* finally add in two rounds of zeroes for additional mixing */
- for (i = 0; i < 2; i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], 0);
-
- /* xor fold partial checksums together */
- for (i = 0; i < N_SUMS; i++)
- result ^= sums[i];
-
- return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
- PGChecksummablePage *cpage = (PGChecksummablePage *) page;
- uint16 save_checksum;
- uint32 checksum;
-
- /* We only calculate the checksum for properly-initialized pages */
- Assert(!PageIsNew((Page) page));
-
- /*
- * Save pd_checksum and temporarily set it to zero, so that the checksum
- * calculation isn't affected by the old checksum stored on the page.
- * Restore it after, because actually updating the checksum is NOT part of
- * the API of this function.
- */
- save_checksum = cpage->phdr.pd_checksum;
- cpage->phdr.pd_checksum = 0;
- checksum = pg_checksum_block(cpage);
- cpage->phdr.pd_checksum = save_checksum;
-
- /* Mix in the block number to detect transposed pages */
- checksum ^= blkno;
-
- /*
- * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
- * one. That avoids checksums of zero, which seems like a good idea.
- */
- return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
pg_numa.o \
pg_popcount_aarch64.o \
pg_popcount_avx512.o \
+ pg_checksum_dispatch.o \
pg_strong_random.o \
pgcheckdir.o \
pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
'pg_numa.c',
'pg_popcount_aarch64.c',
'pg_popcount_avx512.c',
+ 'pg_checksum_dispatch.c',
'pg_strong_random.c',
'pgcheckdir.c',
'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ * Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+ return (exx[2] & (1 << 27)) != 0; /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0x06) == 0x06;
+#else
+ return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+ unsigned int exx[4] = {0, 0, 0, 0};
+ if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+ return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+ return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+ if (avx2_available())
+ {
+ /* optional: patch pointer so next call goes directly */
+ pg_checksum_block = pg_checksum_block_avx2;
+ return pg_checksum_block_avx2(page);
+ }
+#endif
+ /* fallback */
+ pg_checksum_block = pg_checksum_block_default;
+ return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+ PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+ uint16 save_checksum;
+ uint32 checksum;
+
+ Assert(!PageIsNew((Page) page));
+
+ save_checksum = cpage->phdr.pd_checksum;
+ cpage->phdr.pd_checksum = 0;
+ checksum = pg_checksum_block(cpage);
+ cpage->phdr.pd_checksum = save_checksum;
+
+ checksum ^= blkno;
+ return (uint16)((checksum % 65535) + 1);
+}
--
2.43.0
From: Andrew kim <andrew.kim@intel.com>
---
contrib/meson.build | 1 +
contrib/pg_checksum_bench/meson.build | 23 +++++++++++++
.../pg_checksum_bench--1.0.sql | 8 +++++
contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
.../pg_checksum_bench.control | 4 +++
.../sql/pg_checksum_bench.sql | 17 ++++++++++
6 files changed, 87 insertions(+)
create mode 100644 contrib/pg_checksum_bench/meson.build
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('pg_checksum_bench')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+ 'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+ pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_checksum_bench',
+ '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+ pg_checksum_bench_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+ 'pg_checksum_bench--1.0.sql',
+ 'pg_checksum_bench.control',
+ kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+ RETURNS pg_catalog.void
+ AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+ int page_count = PG_GETARG_INT32(0);
+
+ PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+ srand(0);
+ for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+ char * byte_ptr = (char *) pages;
+ byte_ptr[i] = rand() % 256;
+ }
+
+ for (int i = 0; i < REPEATS; i++){
+ const PGChecksummablePage * test_page = pages + (i % page_count);
+ volatile uint32 result = pg_checksum_block(test_page);
+ (void) result;
+ }
+
+ pfree((void *) pages);
+
+ PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
--
2.43.0
Re: Proposal for enabling auto-vectorization for checksum calculations
От
tenistarkim@gmail.com
Дата:
Hi John, Thanks for the feedback. This is v5 of the patchset, updated following your comments: - Moved the function pointer definitions out of common headers and into src/port, consistent with existing practice. Thanks again for the guidance. Best regards, Kim Andrew
From: Andrew Kim <andrew.kim@intel.com>
---
config/c-compiler.m4 | 31 +++++
configure | 52 +++++++++
configure.ac | 9 ++
meson.build | 28 +++++
src/include/pg_config.h.in | 3 +
src/include/storage/checksum_impl.h | 90 +++-----------
src/port/Makefile | 1 +
src/port/meson.build | 1 +
src/port/pg_checksum_dispatch.c | 174 ++++++++++++++++++++++++++++
9 files changed, 318 insertions(+), 71 deletions(-)
create mode 100644 src/port/pg_checksum_dispatch.c
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_XSAVE_INTRINSICS
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf@<:@sizeof(__m256i)@:>@;
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }],
+ [return avx2_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
# PGAC_AVX512_POPCNT_INTRINSICS
# -----------------------------
# Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }
+int
+main ()
+{
+return avx2_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx2_support=yes
+else
+ pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+
+ if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ PGAC_AVX2_SUPPORT()
+ if test x"$pgac_avx2_support" = x"yes"; then
+ AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
endif
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+ prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+}
+'''
+
+ if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+ cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+ endif
+
+endif
+
###############################################################
# Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
/* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
#undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
/* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
#undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
*/
#include "storage/bufpage.h"
+#include "pg_config.h"
/* number of checksums to calculate in parallel */
#define N_SUMS 32
/* prime multiplier of FNV-1a hash */
#define FNV_PRIME 16777619
+
/* Use a union so that this code is valid under strict aliasing */
typedef union
{
@@ -142,74 +144,20 @@ do { \
* Block checksum algorithm. The page must be adequately aligned
* (at least on 4-byte boundary).
*/
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
- uint32 sums[N_SUMS];
- uint32 result = 0;
- uint32 i,
- j;
-
- /* ensure that the size is compatible with the algorithm */
- Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
- /* initialize partial checksums to their corresponding offsets */
- memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
- /* main checksum calculation */
- for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], page->data[i][j]);
-
- /* finally add in two rounds of zeroes for additional mixing */
- for (i = 0; i < 2; i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], 0);
-
- /* xor fold partial checksums together */
- for (i = 0; i < N_SUMS; i++)
- result ^= sums[i];
-
- return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
- PGChecksummablePage *cpage = (PGChecksummablePage *) page;
- uint16 save_checksum;
- uint32 checksum;
-
- /* We only calculate the checksum for properly-initialized pages */
- Assert(!PageIsNew((Page) page));
-
- /*
- * Save pd_checksum and temporarily set it to zero, so that the checksum
- * calculation isn't affected by the old checksum stored on the page.
- * Restore it after, because actually updating the checksum is NOT part of
- * the API of this function.
- */
- save_checksum = cpage->phdr.pd_checksum;
- cpage->phdr.pd_checksum = 0;
- checksum = pg_checksum_block(cpage);
- cpage->phdr.pd_checksum = save_checksum;
-
- /* Mix in the block number to detect transposed pages */
- checksum ^= blkno;
-
- /*
- * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
- * one. That avoids checksums of zero, which seems like a good idea.
- */
- return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
pg_numa.o \
pg_popcount_aarch64.o \
pg_popcount_avx512.o \
+ pg_checksum_dispatch.o \
pg_strong_random.o \
pgcheckdir.o \
pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
'pg_numa.c',
'pg_popcount_aarch64.c',
'pg_popcount_avx512.c',
+ 'pg_checksum_dispatch.c',
'pg_strong_random.c',
'pgcheckdir.c',
'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ * Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+ return (exx[2] & (1 << 27)) != 0; /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0x06) == 0x06;
+#else
+ return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+ unsigned int exx[4] = {0, 0, 0, 0};
+ if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+ return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+ return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+ if (avx2_available())
+ {
+ /* optional: patch pointer so next call goes directly */
+ pg_checksum_block = pg_checksum_block_avx2;
+ return pg_checksum_block_avx2(page);
+ }
+#endif
+ /* fallback */
+ pg_checksum_block = pg_checksum_block_default;
+ return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+ PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+ uint16 save_checksum;
+ uint32 checksum;
+
+ Assert(!PageIsNew((Page) page));
+
+ save_checksum = cpage->phdr.pd_checksum;
+ cpage->phdr.pd_checksum = 0;
+ checksum = pg_checksum_block(cpage);
+ cpage->phdr.pd_checksum = save_checksum;
+
+ checksum ^= blkno;
+ return (uint16)((checksum % 65535) + 1);
+}
--
2.43.0
From: Andrew kim <andrew.kim@intel.com>
---
contrib/meson.build | 1 +
contrib/pg_checksum_bench/meson.build | 23 +++++++++++++
.../pg_checksum_bench--1.0.sql | 8 +++++
contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
.../pg_checksum_bench.control | 4 +++
.../sql/pg_checksum_bench.sql | 17 ++++++++++
6 files changed, 87 insertions(+)
create mode 100644 contrib/pg_checksum_bench/meson.build
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('pg_checksum_bench')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+ 'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+ pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_checksum_bench',
+ '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+ pg_checksum_bench_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+ 'pg_checksum_bench--1.0.sql',
+ 'pg_checksum_bench.control',
+ kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+ RETURNS pg_catalog.void
+ AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+ int page_count = PG_GETARG_INT32(0);
+
+ PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+ srand(0);
+ for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+ char * byte_ptr = (char *) pages;
+ byte_ptr[i] = rand() % 256;
+ }
+
+ for (int i = 0; i < REPEATS; i++){
+ const PGChecksummablePage * test_page = pages + (i % page_count);
+ volatile uint32 result = pg_checksum_block(test_page);
+ (void) result;
+ }
+
+ pfree((void *) pages);
+
+ PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
--
2.43.0
Re: Proposal for enabling auto-vectorization for checksum calculations
От
tenistarkim@gmail.com
Дата:
Hi John, Thanks for the feedback. This is v5 of the patchset, updated following your comments: - Moved the function pointer definitions out of common headers and into src/port, consistent with existing practice. Thanks again for the guidance. Best regards, Kim Andrew
From: Andrew Kim <andrew.kim@intel.com>
---
config/c-compiler.m4 | 31 +++++
configure | 52 +++++++++
configure.ac | 9 ++
meson.build | 28 +++++
src/include/pg_config.h.in | 3 +
src/include/storage/checksum_impl.h | 90 +++-----------
src/port/Makefile | 1 +
src/port/meson.build | 1 +
src/port/pg_checksum_dispatch.c | 174 ++++++++++++++++++++++++++++
9 files changed, 318 insertions(+), 71 deletions(-)
create mode 100644 src/port/pg_checksum_dispatch.c
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
undefine([Ac_cachevar])dnl
])# PGAC_XSAVE_INTRINSICS
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf@<:@sizeof(__m256i)@:>@;
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }],
+ [return avx2_test();])],
+ [Ac_cachevar=yes],
+ [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
# PGAC_AVX512_POPCNT_INTRINSICS
# -----------------------------
# Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+#include <immintrin.h>
+ #include <stdint.h>
+ #if defined(__has_attribute) && __has_attribute (target)
+ __attribute__((target("avx2")))
+ #endif
+ static int avx2_test(void)
+ {
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+ }
+int
+main ()
+{
+return avx2_test();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ pgac_cv_avx2_support=yes
+else
+ pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+ pgac_avx2_support=yes
+fi
+
+ if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
fi
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+ PGAC_AVX2_SUPPORT()
+ if test x"$pgac_avx2_support" = x"yes"; then
+ AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+ fi
+fi
+
# Check for AVX-512 popcount intrinsics
#
if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
endif
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+ prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+ const char buf[sizeof(__m256i)];
+ __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+ accum = _mm256_add_epi32(accum, accum);
+ int result = _mm256_extract_epi32(accum, 0);
+ return (int) result;
+}
+'''
+
+ if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+ cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+ endif
+
+endif
+
###############################################################
# Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
/* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
#undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
/* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
#undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
*/
#include "storage/bufpage.h"
+#include "pg_config.h"
/* number of checksums to calculate in parallel */
#define N_SUMS 32
/* prime multiplier of FNV-1a hash */
#define FNV_PRIME 16777619
+
/* Use a union so that this code is valid under strict aliasing */
typedef union
{
@@ -142,74 +144,20 @@ do { \
* Block checksum algorithm. The page must be adequately aligned
* (at least on 4-byte boundary).
*/
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
- uint32 sums[N_SUMS];
- uint32 result = 0;
- uint32 i,
- j;
-
- /* ensure that the size is compatible with the algorithm */
- Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
- /* initialize partial checksums to their corresponding offsets */
- memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
- /* main checksum calculation */
- for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], page->data[i][j]);
-
- /* finally add in two rounds of zeroes for additional mixing */
- for (i = 0; i < 2; i++)
- for (j = 0; j < N_SUMS; j++)
- CHECKSUM_COMP(sums[j], 0);
-
- /* xor fold partial checksums together */
- for (i = 0; i < N_SUMS; i++)
- result ^= sums[i];
-
- return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
- PGChecksummablePage *cpage = (PGChecksummablePage *) page;
- uint16 save_checksum;
- uint32 checksum;
-
- /* We only calculate the checksum for properly-initialized pages */
- Assert(!PageIsNew((Page) page));
-
- /*
- * Save pd_checksum and temporarily set it to zero, so that the checksum
- * calculation isn't affected by the old checksum stored on the page.
- * Restore it after, because actually updating the checksum is NOT part of
- * the API of this function.
- */
- save_checksum = cpage->phdr.pd_checksum;
- cpage->phdr.pd_checksum = 0;
- checksum = pg_checksum_block(cpage);
- cpage->phdr.pd_checksum = save_checksum;
-
- /* Mix in the block number to detect transposed pages */
- checksum ^= blkno;
-
- /*
- * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
- * one. That avoids checksums of zero, which seems like a good idea.
- */
- return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
pg_numa.o \
pg_popcount_aarch64.o \
pg_popcount_avx512.o \
+ pg_checksum_dispatch.o \
pg_strong_random.o \
pgcheckdir.o \
pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
'pg_numa.c',
'pg_popcount_aarch64.c',
'pg_popcount_avx512.c',
+ 'pg_checksum_dispatch.c',
'pg_strong_random.c',
'pgcheckdir.c',
'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ * Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+ unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+ __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+ return (exx[2] & (1 << 27)) != 0; /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+ return (_xgetbv(0) & 0x06) == 0x06;
+#else
+ return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+ unsigned int exx[4] = {0, 0, 0, 0};
+ if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+ __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+ return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+ return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
+
+ Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+ memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+ for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < N_SUMS; j++)
+ CHECKSUM_COMP(sums[j], 0);
+
+ for (i = 0; i < N_SUMS; i++)
+ result ^= sums[i];
+
+ return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+ if (avx2_available())
+ {
+ /* optional: patch pointer so next call goes directly */
+ pg_checksum_block = pg_checksum_block_avx2;
+ return pg_checksum_block_avx2(page);
+ }
+#endif
+ /* fallback */
+ pg_checksum_block = pg_checksum_block_default;
+ return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+ PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+ uint16 save_checksum;
+ uint32 checksum;
+
+ Assert(!PageIsNew((Page) page));
+
+ save_checksum = cpage->phdr.pd_checksum;
+ cpage->phdr.pd_checksum = 0;
+ checksum = pg_checksum_block(cpage);
+ cpage->phdr.pd_checksum = save_checksum;
+
+ checksum ^= blkno;
+ return (uint16)((checksum % 65535) + 1);
+}
--
2.43.0
From: Andrew kim <andrew.kim@intel.com>
---
contrib/meson.build | 1 +
contrib/pg_checksum_bench/meson.build | 23 +++++++++++++
.../pg_checksum_bench--1.0.sql | 8 +++++
contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
.../pg_checksum_bench.control | 4 +++
.../sql/pg_checksum_bench.sql | 17 ++++++++++
6 files changed, 87 insertions(+)
create mode 100644 contrib/pg_checksum_bench/meson.build
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
'install_dir': contrib_doc_dir,
}
+subdir('pg_checksum_bench')
subdir('amcheck')
subdir('auth_delay')
subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+ 'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+ pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_checksum_bench',
+ '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+ pg_checksum_bench_sources,
+ kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+ 'pg_checksum_bench--1.0.sql',
+ 'pg_checksum_bench.control',
+ kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+ RETURNS pg_catalog.void
+ AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+ int page_count = PG_GETARG_INT32(0);
+
+ PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+ srand(0);
+ for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+ char * byte_ptr = (char *) pages;
+ byte_ptr[i] = rand() % 256;
+ }
+
+ for (int i = 0; i < REPEATS; i++){
+ const PGChecksummablePage * test_page = pages + (i % page_count);
+ volatile uint32 result = pg_checksum_block(test_page);
+ (void) result;
+ }
+
+ pfree((void *) pages);
+
+ PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
--
2.43.0
> I don't know if this is related to the crashes, but it doesn't seem
> like a good idea to #include the function pointer stuff everywhere,
> that should probably go into src/port like the others.
Just a gentle reminder on this patch series — I’ve already rebased it on
the latest tip of master and addressed the earlier review comments:
* Moved the function pointer definitions into src/port as suggested.
* Rebased cleanly on the current master branch.
Could someone take another look and share any further feedback?
Thanks a lot for your time and review,
Andrew Kim
> like a good idea to #include the function pointer stuff everywhere,
> that should probably go into src/port like the others.
Just a gentle reminder on this patch series — I’ve already rebased it on
the latest tip of master and addressed the earlier review comments:
* Moved the function pointer definitions into src/port as suggested.
* Rebased cleanly on the current master branch.
Could someone take another look and share any further feedback?
Thanks a lot for your time and review,
Andrew Kim
On Thu, Sep 11, 2025 at 1:55 PM root <tenistarkim@gmail.com> wrote: > Thanks for the feedback. This is v5 of the patchset, updated following your comments: > > - Moved the function pointer definitions out of common headers and > into src/port, consistent with existing practice. There is no attachment in this thread, so it's not showing up in the commitfest entry (which will need to be moved to next open commitfest), so it's not getting CI testing: https://commitfest.postgresql.org/patch/5726/ Note that the whole series must be attached in a single email, or it won't get automated testing. -- John Naylor Amazon Web Services
On Tue, Sep 23, 2025 at 11:32 PM John Naylor johncnaylorls@gmail.com wrote: There is no attachment in this thread, so it's not showing up in the commitfest entry (which will need to be moved to next open commitfest), so it's not getting CI testing: https://commitfest.postgresql.org/patch/5726/ Note that the whole series must be attached in a single email, or it won't get automated testing. Thanks, John. I see the issue now — I’ll attach the entire patch series in a single email so it shows up properly in the commitfest and gets CI coverage. Please find attached v6 of the patchset, updated per your feedback. Best regards, Andrew Kim On Tue, Sep 23, 2025 at 11:32 PM John Naylor <johncnaylorls@gmail.com> wrote: > > On Thu, Sep 11, 2025 at 1:55 PM root <tenistarkim@gmail.com> wrote: > > Thanks for the feedback. This is v5 of the patchset, updated following your comments: > > > > - Moved the function pointer definitions out of common headers and > > into src/port, consistent with existing practice. > > There is no attachment in this thread, so it's not showing up in the > commitfest entry (which will need to be moved to next open > commitfest), so it's not getting CI testing: > > https://commitfest.postgresql.org/patch/5726/ > > Note that the whole series must be attached in a single email, or it > won't get automated testing. > > -- > John Naylor > Amazon Web Services
Вложения
On Thu, Sep 25, 2025 at 4:50 AM Andrew Kim <tenistarkim@gmail.com> wrote: > > Thanks, John. I see the issue now — I’ll attach the entire patch > series in a single email so it shows up properly in the commitfest and > gets CI coverage. It's still picking up v4, and the archive link doesn't show any further replies. Something must have happened with the email threading, since you weren't on the thread at first. Please create an account and edit the entry to point to a more recent message ID: https://commitfest.postgresql.org/patch/5726/ > Please find attached v6 of the patchset, updated per your feedback. Thanks. (BTW, we discourage top-posting and prefer to cut to size and use inline responses) This is not a complete review, but some architectural thoughts and some things I've noticed. The top of the checksum_impl.h has this: * This file exists for the benefit of external programs that may wish to * check Postgres page checksums. They can #include this to get the code * referenced by storage/checksum.h. (Note: you may need to redefine * Assert() as empty to compile this successfully externally.) It's going to be a bit tricky to preserve this ability while allowing the core server and client programs to dispatch to a specialized implementation, but we should at least try. That means keeping pg_checksum_block() and pg_checksum_page() where they live now. I think a good first refactoring patch would be to move src/backend/storage/checksum.c (which your patch doesn't even touch) to src/port (and src/include/storage/checksum.h to src/include/port) and have all callers use that. With that, I imagine only that checksum.c file would include checksum_impl.h. If that poses a problem, let us know -- we may have to further juggle things. If that works without issue, we can proceed with the specialization. On that, just a few things to note here, although the next patch doesn't need to worry about any of this yet: + #if defined(__has_attribute) && __has_attribute (target) + __attribute__((target("avx2"))) + #endif + static int avx2_test(void) + { + const char buf@<:@sizeof(__m256i)@:>@; + __m256i accum = _mm256_loadu_si256((const __m256i *) buf); + accum = _mm256_add_epi32(accum, accum); + int result = _mm256_extract_epi32(accum, 0); + return (int) result; + }], If we're just testing if the target works, we can just use an empty function, right? +#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \ +uint32 \ +pg_checksum_block_##ISANAME(const PGChecksummablePage *page); + +#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \ +pg_attribute_target(#ISANAME) \ +uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \ I find this hard to read compared to just using the actual name. +avx2_available(void) +{ +#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__) Why guard on __x86_64__? +PG_DEFINE_CHECKSUM_ISA(default) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; [...] +#ifdef USE_AVX2_WITH_RUNTIME_CHECK +PG_DEFINE_CHECKSUM_ISA(avx2) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; [...] With the single src/port file idea above, these would just do "return pg_checksum_block()" (or pg_checksum_page, whichever makes more sense). + if (avx2_available()) + { + /* optional: patch pointer so next call goes directly */ + pg_checksum_block = pg_checksum_block_avx2; + return pg_checksum_block_avx2(page); + } Not sure what your referring to here by "patching" the pointer, but it sounds dangerous. Besides, the cost of indirection is basically zero for multi-kilobyte inputs, so there is not even any motivation to consider doing differently. -- John Naylor Amazon Web Services
Hi John,
Thank you for your detailed and constructive feedback on the checksum
AVX2 optimization patch.
I've carefully addressed all of your concerns and am pleased to share
the updated V6 implementation.
V6 Implementation adds SIMD-optimized checksum calculation using AVX2
instructions with automatic fallback to portable implementation,
incorporating all of your recommended improvements:
1. Code Organization
Consolidated architecture: Moved all checksum logic into a single
checksum.c file, eliminating the complexity of separate dispatch files
Simplified build integration: Streamlined both autoconf and meson
build configurations
2. Safety & Robustness
Eliminated dangerous runtime patching: Replaced direct function
pointer manipulation with safe dispatch through static function
pointers
Thread-safe design: All operations are now inherently thread-safe
without requiring locks or synchronization
3. Code Readability
Removed macro complexity: Replaced PG_DECLARE_CHECKSUM_ISA macros with
explicit, clear function declarations
PostgreSQL coding compliance: Follows established PostgreSQL
conventions throughout
Simplified conditional compilation: Removed redundant __x86_64__
guards, relying on configure script's platform detection
4. Compiler Detection & Compatibility
Preserved robust testing: Maintained the comprehensive avx2_test
function that validates both __attribute__((target("avx2"))) support
and AVX2 intrinsics functionality
Runtime feature detection: Uses __builtin_cpu_supports("avx2") for
reliable CPU capability detection
Build cleanly across all library variants (static, shared, server)
Compile without warnings under strict compiler flags
I believe this V6 implementation fully addresses your concerns while
delivering the performance benefits of AVX2 optimization.
Please find the V6 patch attached. I welcome any additional feedback
you may have.
Best regards,
Andrew Kim
On Wed, Oct 1, 2025 at 10:26 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:50 AM Andrew Kim <tenistarkim@gmail.com> wrote:
> >
> > Thanks, John. I see the issue now — I’ll attach the entire patch
> > series in a single email so it shows up properly in the commitfest and
> > gets CI coverage.
>
> It's still picking up v4, and the archive link doesn't show any
> further replies. Something must have happened with the email
> threading, since you weren't on the thread at first. Please create an
> account and edit the entry to point to a more recent message ID:
>
> https://commitfest.postgresql.org/patch/5726/
>
> > Please find attached v6 of the patchset, updated per your feedback.
>
> Thanks. (BTW, we discourage top-posting and prefer to cut to size and
> use inline responses)
>
> This is not a complete review, but some architectural thoughts and
> some things I've noticed.
>
> The top of the checksum_impl.h has this:
>
> * This file exists for the benefit of external programs that may wish to
> * check Postgres page checksums. They can #include this to get the code
> * referenced by storage/checksum.h. (Note: you may need to redefine
> * Assert() as empty to compile this successfully externally.)
>
> It's going to be a bit tricky to preserve this ability while allowing
> the core server and client programs to dispatch to a specialized
> implementation, but we should at least try. That means keeping
> pg_checksum_block() and pg_checksum_page() where they live now.
>
> I think a good first refactoring patch would be to move
> src/backend/storage/checksum.c (which your patch doesn't even touch)
> to src/port (and src/include/storage/checksum.h to src/include/port)
> and have all callers use that. With that, I imagine only that
> checksum.c file would include checksum_impl.h.
>
> If that poses a problem, let us know -- we may have to further juggle
> things. If that works without issue, we can proceed with the
> specialization. On that, just a few things to note here, although the
> next patch doesn't need to worry about any of this yet:
>
> + #if defined(__has_attribute) && __has_attribute (target)
> + __attribute__((target("avx2")))
> + #endif
> + static int avx2_test(void)
> + {
> + const char buf@<:@sizeof(__m256i)@:>@;
> + __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
> + accum = _mm256_add_epi32(accum, accum);
> + int result = _mm256_extract_epi32(accum, 0);
> + return (int) result;
> + }],
>
> If we're just testing if the target works, we can just use an empty
> function, right?
>
> +#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
> +uint32 \
> +pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
> +
> +#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
> +pg_attribute_target(#ISANAME) \
> +uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
>
> I find this hard to read compared to just using the actual name.
>
> +avx2_available(void)
> +{
> +#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
>
> Why guard on __x86_64__?
>
> +PG_DEFINE_CHECKSUM_ISA(default)
> +{
> + uint32 sums[N_SUMS], result = 0;
> + uint32 i, j;
> [...]
>
> +#ifdef USE_AVX2_WITH_RUNTIME_CHECK
> +PG_DEFINE_CHECKSUM_ISA(avx2)
> +{
> + uint32 sums[N_SUMS], result = 0;
> + uint32 i, j;
> [...]
>
> With the single src/port file idea above, these would just do "return
> pg_checksum_block()" (or pg_checksum_page, whichever makes more
> sense).
>
> + if (avx2_available())
> + {
> + /* optional: patch pointer so next call goes directly */
> + pg_checksum_block = pg_checksum_block_avx2;
> + return pg_checksum_block_avx2(page);
> + }
>
> Not sure what your referring to here by "patching" the pointer, but it
> sounds dangerous. Besides, the cost of indirection is basically zero
> for multi-kilobyte inputs, so there is not even any motivation to
> consider doing differently.
>
> --
> John Naylor
> Amazon Web Services
Вложения
Greetings!
I've also tried to use AVX2 to speedup checksums and I've found your
approach quite interesting
But I see some issues with v6 patch
1) checksum.c was moved to src/port, but special meson rules are left in
src/backend/storage/page/meson.build. As a result, assembly code for
moved src/port/checksum.c doesn't use -funroll-loops and
-ftree-vectorize (latter isn't probably needed now, due to the nature of
the patch). The same is true for src/port/Makefile, there are no
instructions to use CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE
2) checksum.c was moved to src/port, but checksum.h and checksum_impl.h
are left in src/include/storage. I think they both should be moved to
src/include/port, as John Naylor suggested in his review of v5
3) checksum_impl.h now doesn't provide any code, so including it in
external programs won't allow checksum calculation. I think that all
code should be in checksum_impl.h, and external programs could just
define USE_AVX2_WITH_RUNTIME_CHECK (probably using similar checks as we
are) to use AVX2 implementation. If not - then they will default to
default realisation
4) I don't understand why do we need to check for AVX2 intrinsics if we
don't use those in code (at least I don't see them directly)? As in
review of v5, couldn't test functions in configure, config/c-compiler.m4
and ./meson.build just be {return 0;} or {return 1;}?
5) Why do we need both src/backend/storage/page/checksum.c and
src/port/checksum.c?
6)
> +/* Function declarations for ISA-specific implementations */
> +uint32 pg_checksum_block_default(const PGChecksummablePage *page);
> +#ifdef USE_AVX2_WITH_RUNTIME_CHECK
> +uint32 pg_checksum_block_avx2(const PGChecksummablePage *page);
> +#endif
What is "ISA-specific implementations" in this comment? Maybe I'm just
not familiar with the term? Or is it an artifact from macro
implementation?
7) Why remove all comments from code of pg_checksum_block_default? I
could understand if you just removed comments from
pg_checksum_block_avx2, since it just duplicates code (though I
personally would leave all the comments even when duplicating code), but
I don't understand removing comments from pg_checksum_block_default
8) It might be a personal taste, but pg_checksum_block_dispatch looks
more like "choose" function from src/port/pg_crc32c_sse42_choose.c and
alike. "dispatch" from src/include/port/pg_crc32c looks a little
different - we don't choose function pointer once there, we choose
between inlined computation and calling a function with runtime check.
So I'd suggest changing name of pg_checksum_block_dispatch to
pg_checksum_block_choose
Other than those, I think the core of this patch is good
Oleg Tselebrovskiy, PostgresPro
Hi Oleg,
Thank you very much for the detailed and constructive feedback on v6 patch.
It was extremely helpful in refining the architecture and ensuring
compliance with PostgreSQL coding standards.
I have updated the patch to V7, which I believe addresses all of your
points, including the critical architectural concerns regarding file
organization and linking.
Key Changes and Feedback Resolution in V7
The architecture is now consolidated in the src/port module.
1. Compiler Flags (Unroll/Vectorize)Resolved: Compiler flags
(CFLAGS_UNROLL_LOOPS) are now correctly placed and applied to
checksum.c in src/port/Makefile and src/port/meson.
2. Header OrganizationResolved: checksum.h and checksum_impl.h have
been moved from src/include/storage/ to src/include/port/ for
consistent module organization.
3. External Program CompatibilityResolved: checksum_impl.h is now
fully self-contained. It provides the static inline implementations
(pg_checksum_block_default, pg_checksum_block_avx2) and all required
constants, ensuring external tools can calculate checksums without
linking to the backend library.
4. Duplicate FilesResolved: The redundant
src/backend/storage/page/checksum.c file has been removed,
consolidating all implementation logic into src/port/checksum.c.
5. Function NamingResolved: The dispatch pattern now uses
pg_checksum_block_choose, aligning with the established naming
conventions (e.g., CRC32C module). The implementations use the clear
names pg_checksum_block_default and pg_checksum_block_avx2.
7. Documentation/CommentsResolved: Comprehensive documentation,
including the detailed FNV-1a algorithm comments, has been restored to
the portable implementation (pg_checksum_block_default).
Best regards,
Andrew Kim
On Fri, Oct 17, 2025 at 3:53 AM Oleg Tselebrovskiy
<o.tselebrovskiy@postgrespro.ru> wrote:
>
> Greetings!
>
> I've also tried to use AVX2 to speedup checksums and I've found your
> approach quite interesting
>
> But I see some issues with v6 patch
>
> 1) checksum.c was moved to src/port, but special meson rules are left in
> src/backend/storage/page/meson.build. As a result, assembly code for
> moved src/port/checksum.c doesn't use -funroll-loops and
> -ftree-vectorize (latter isn't probably needed now, due to the nature of
> the patch). The same is true for src/port/Makefile, there are no
> instructions to use CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE
>
> 2) checksum.c was moved to src/port, but checksum.h and checksum_impl.h
> are left in src/include/storage. I think they both should be moved to
> src/include/port, as John Naylor suggested in his review of v5
>
> 3) checksum_impl.h now doesn't provide any code, so including it in
> external programs won't allow checksum calculation. I think that all
> code should be in checksum_impl.h, and external programs could just
> define USE_AVX2_WITH_RUNTIME_CHECK (probably using similar checks as we
> are) to use AVX2 implementation. If not - then they will default to
> default realisation
>
> 4) I don't understand why do we need to check for AVX2 intrinsics if we
> don't use those in code (at least I don't see them directly)? As in
> review of v5, couldn't test functions in configure, config/c-compiler.m4
> and ./meson.build just be {return 0;} or {return 1;}?
>
> 5) Why do we need both src/backend/storage/page/checksum.c and
> src/port/checksum.c?
>
> 6)
> > +/* Function declarations for ISA-specific implementations */
> > +uint32 pg_checksum_block_default(const PGChecksummablePage *page);
> > +#ifdef USE_AVX2_WITH_RUNTIME_CHECK
> > +uint32 pg_checksum_block_avx2(const PGChecksummablePage *page);
> > +#endif
>
> What is "ISA-specific implementations" in this comment? Maybe I'm just
> not familiar with the term? Or is it an artifact from macro
> implementation?
>
> 7) Why remove all comments from code of pg_checksum_block_default? I
> could understand if you just removed comments from
> pg_checksum_block_avx2, since it just duplicates code (though I
> personally would leave all the comments even when duplicating code), but
> I don't understand removing comments from pg_checksum_block_default
>
> 8) It might be a personal taste, but pg_checksum_block_dispatch looks
> more like "choose" function from src/port/pg_crc32c_sse42_choose.c and
> alike. "dispatch" from src/include/port/pg_crc32c looks a little
> different - we don't choose function pointer once there, we choose
> between inlined computation and calling a function with runtime check.
> So I'd suggest changing name of pg_checksum_block_dispatch to
> pg_checksum_block_choose
>
> Other than those, I think the core of this patch is good
>
> Oleg Tselebrovskiy, PostgresPro
Вложения
Thanks for the new patch version!
Another round of review:
1) I think that changes to contrib/pageinspect/rawpage.c should be in
the main patch, not in the benchmark patch. Also, without those chages
the main patch can't compile using make world-bin
2) I still don't get why you check for working intrinsics in configure,
config/c-compiler.m4 and meson.build, if your patch later uses them.
I've gotten correct assembly code with this avx2_test function:
#include <stdint.h>
#if defined(__has_attribute) && __has_attribute (target)
__attribute__((target("avx2")))
static int avx2_test(void)
{
return 0;
}
#endif
Please, check if this works for you and consider using something similar
3) __builtin_cpu_supports doesn't work on Windows at all. We still have
to use approach with __get_cpuid
4) Looks like you can safely remove "port/checksum_impl.h" from
src/bin/pg_checksums/pg_checksums.c. It probably links with libpgport
and/or libpgcommon, so it gets pg_checksum_page from there. Same with
src/bin/pg_upgrade/file.c. Maybe those includes are "for clarity" and
you don't need to remove them, but pg_checksums and pg_upgrade seem to
work without them
5) You don't need #include <string.h> /* for memcpy */ in
checksum_impl.h. At the very least, memcpy was used before your patch
without string.h
6) Why did you remove Assert(sizeof(PGChecksummablePage) == BLCKSZ)? Is
it always false?
7) Is reformatted variable declaration in pg_checksum_block_default_impl
really needed? Is there a good reason for it? Or is it auto-formatting
programm output?
8) Your patch removes one whitespace in this line - for (i = 0; i <
(uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
If you wish to fix formatting like that - please, do it in a separate
patch. If this was done automatically by some formatting tool - please,
revert this change
9) Unneeded empty string
#define FNV_PRIME 16777619
+
/* Use a union so that this code is valid under strict aliasing */
typedef union
10) You need one line with just /* at the beginning of the comment, look
at other multiline comments in this file
+ /* For now, AVX2 implementation is identical to default
+ * The compiler will auto-vectorize this with proper flags
+ * Future versions could use explicit AVX2 intrinsics here
*/
11) Function pg_checksum_block_simple isn't used at all.
12) Why do you need those?
+#ifndef PG_CHECKSUM_EXTERNAL_INTERFACE
+#define PG_CHECKSUM_EXTERNAL_INTERFACE
13) Object files are added according to alphabetical order, not logical
order (src/port/Makefile)
pg_popcount_aarch64.o \
pg_popcount_avx512.o \
+ checksum.o \
pg_strong_random.o \
pgcheckdir.o \
14) I still think that src/port/checksum.c needs to just include
src/include/port/checksum_impl.h and have no other logic to keep
checksum_impl.h's role as "header with full implementation"
Now checksum_impl.h doesn't have any mention of pg_checksum_page
15) Assembly for pg_checksum_block_choose now has full code of
pg_checksum_block_default. This is probably a result of using inline
functions
Don't know if this is bad, but it is at least strange
Also, some CFBot checks have failed. Two of them with this error/warning
checksum.c:88:1: error: no previous prototype for ‘pg_checksum_page’
[-Werror=missing-prototypes]
88 | pg_checksum_page(char *page, BlockNumber blkno)
| ^~~~~~~~~~~~~~~~
Please, address those
Oleg Tselebrovskiy, PostgresPro
On Fri, Oct 17, 2025 at 2:15 PM Andrew Kim <tenistarkim@gmail.com> wrote:
>
> Hi John,
>
> Thank you for your detailed and constructive feedback on the checksum
> AVX2 optimization patch.
> I've carefully addressed all of your concerns and am pleased to share
> the updated V6 implementation.
Great! I know we're on v7 now, but I'm going to make a request for
next time you respond to a review: Respond in-line to each point. As I
mentioned before,
> On Wed, Oct 1, 2025 at 10:26 PM John Naylor <johncnaylorls@gmail.com> wrote:
> > (BTW, we discourage top-posting and prefer to cut to size and
> > use inline responses)
Please don't top-post again, as it clutters our archives in addition
to making it easy to forget things. I'm now going to copy the things
that were either not addressed or misunderstood:
> > I think a good first refactoring patch would be to move
> > src/backend/storage/checksum.c (which your patch doesn't even touch)
> > to src/port (and src/include/storage/checksum.h to src/include/port)
> > and have all callers use that. With that, I imagine only that
> > checksum.c file would include checksum_impl.h.
> >
> > If that poses a problem, let us know -- we may have to further juggle
> > things. If that works without issue, we can proceed with the
> > specialization.
That means the first patch moves things around without adding any
platform-specific code, and the next patch adds the specialization. I
think that would be a lot easier to review and test, especially to
avoid breaking external programs (see below for more on this). A
committer can always squash things together if it make sense to do so.
> > + #if defined(__has_attribute) && __has_attribute (target)
> > + __attribute__((target("avx2")))
> > + #endif
> > + static int avx2_test(void)
> > + {
> > + const char buf@<:@sizeof(__m256i)@:>@;
> > + __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
> > + accum = _mm256_add_epi32(accum, accum);
> > + int result = _mm256_extract_epi32(accum, 0);
> > + return (int) result;
> > + }],
> >
> > If we're just testing if the target works, we can just use an empty
> > function, right?
Oleg mentioned the same thing later. It's a waste of time for us to
repeat ourselves. I said you didn't have to worry about it yet,
because I was hoping to see the refactoring first.
Now, aside from that I looked further into this:
> > The top of the checksum_impl.h has this:
> >
> > * This file exists for the benefit of external programs that may wish to
> > * check Postgres page checksums. They can #include this to get the code
> > * referenced by storage/checksum.h. (Note: you may need to redefine
> > * Assert() as empty to compile this successfully externally.)
> >
> > It's going to be a bit tricky to preserve this ability while allowing
> > the core server and client programs to dispatch to a specialized
> > implementation, but we should at least try. That means keeping
> > pg_checksum_block() and pg_checksum_page() where they live now.
Looking at commit f04216341dd1, we have at least one example of an
external program, pg_filedump. If we can keep this working with
minimal fuss, it should be fine everywhere.
https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
```
/* checksum_impl.h uses Assert, which doesn't work outside the server */
#undef Assert
#define Assert(X)
#include "storage/checksum.h"
#include "storage/checksum_impl.h"
```
Elsewhere they already have to do things like
```
#if PG_VERSION_NUM < 110000
" Previous Checkpoint Record: Log File (%u) Offset (0x%08x)\n"
#endif
```
...so it's probably okay if they have to adjust for a new #include
path, but I want to verify that actually works, and I don't want to
make it any more invasive than that. As we proceed, I can volunteer to
do the work to test that pg_filedump still builds fine with small
changes. Feel free to try building it yourself, but I'm happy to do
it.
Oleg posted another review recently, so I won't complicate things
further, but from a brief glance I will suggest for next time not to
change any comments that haven't been invalidated by the patch.
--
John Naylor
Amazon Web Services
Hi John,
Thank you for your review on the previous patch versions.
I've carefully addressed your concerns and those raised by Oleg,
specifically focusing on patch separation and simplification of the
configure tests. I am submitting the new version (V8) as two distinct
patches:
V8-0001: Pure refactoring (moving files, updating includes).
V8-0002: Adding the AVX2 feature (detection, dispatch, and optimization).
As requested, I've used in-line responses below to clarify how each
point was handled.
On Mon, Oct 20, 2025 at 8:30 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Fri, Oct 17, 2025 at 2:15 PM Andrew Kim <tenistarkim@gmail.com> wrote:
> >
> > Hi John,
> >
> > Thank you for your detailed and constructive feedback on the checksum
> > AVX2 optimization patch.
> > I've carefully addressed all of your concerns and am pleased to share
> > the updated V6 implementation.
>
> Great! I know we're on v7 now, but I'm going to make a request for
> next time you respond to a review: Respond in-line to each point. As I
> mentioned before,
>
> > On Wed, Oct 1, 2025 at 10:26 PM John Naylor <johncnaylorls@gmail.com> wrote:
> > > (BTW, we discourage top-posting and prefer to cut to size and
> > > use inline responses)
>
> Please don't top-post again, as it clutters our archives in addition
> to making it easy to forget things. I'm now going to copy the things
> that were either not addressed or misunderstood:
>
I apologize for the top-posting in the previous response. I've
switched to the preferred in-line response format for this and all
future correspondence.
> > > I think a good first refactoring patch would be to move
> > > src/backend/storage/checksum.c (which your patch doesn't even touch)
> > > to src/port (and src/include/storage/checksum.h to src/include/port)
> > > and have all callers use that. With that, I imagine only that
> > > checksum.c file would include checksum_impl.h.
> > >
> > > If that poses a problem, let us know -- we may have to further juggle
> > > things. If that works without issue, we can proceed with the
> > > specialization.
>
> That means the first patch moves things around without adding any
> platform-specific code, and the next patch adds the specialization. I
> think that would be a lot easier to review and test, especially to
> avoid breaking external programs (see below for more on this). A
> committer can always squash things together if it make sense to do so.
>
Patch V8-0001 (Move-checksum-functions...): This is now a pure
refactoring patch. It simply moves checksum.c and its headers from
storage/ to port/ and updates the #include paths in all callers
(rawpage.c, pg_checksums.c, etc.). It contains no AVX2 or ISA-specific
code.
Patch V8-0002 (Add-AVX2-optimization...): This patch builds upon the
first, adding all the new AVX2 functionality, detection, and dispatch
logic.
> > > + #if defined(__has_attribute) && __has_attribute (target)
> > > + __attribute__((target("avx2")))
> > > + #endif
> > > + static int avx2_test(void)
> > > + {
> > > + const char buf@<:@sizeof(__m256i)@:>@;
> > > + __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
> > > + accum = _mm256_add_epi32(accum, accum);
> > > + int result = _mm256_extract_epi32(accum, 0);
> > > + return (int) result;
> > > + }],
> > >
> > > If we're just testing if the target works, we can just use an empty
> > > function, right?
>
> Oleg mentioned the same thing later. It's a waste of time for us to
> repeat ourselves. I said you didn't have to worry about it yet,
> because I was hoping to see the refactoring first.
>
I have implemented this simplification in Patch V8-0002. The test in
config/c-compiler.m4 is now a simple, empty function with only the
__attribute__((target("avx2"))) to verify compiler support for the
attribute, as suggested.
> Now, aside from that I looked further into this:
>
> > > The top of the checksum_impl.h has this:
> > >
> > > * This file exists for the benefit of external programs that may wish to
> > > * check Postgres page checksums. They can #include this to get the code
> > > * referenced by storage/checksum.h. (Note: you may need to redefine
> > > * Assert() as empty to compile this successfully externally.)
> > >
> > > It's going to be a bit tricky to preserve this ability while allowing
> > > the core server and client programs to dispatch to a specialized
> > > implementation, but we should at least try. That means keeping
> > > pg_checksum_block() and pg_checksum_page() where they live now.
>
> Looking at commit f04216341dd1, we have at least one example of an
> external program, pg_filedump. If we can keep this working with
> minimal fuss, it should be fine everywhere.
The v8 patch series preserves external compatibility. External
programs like pg_filedump will only need to update their include
paths:
/* OLD */
#include "storage/checksum.h"
#include "storage/checksum_impl.h"
/* NEW */
#include "port/checksum.h"
#include "port/checksum_impl.h"
The function signatures (pg_checksum_block, pg_checksum_page) remain
identical, and checksum_impl.h still contains the complete
implementation that external programs can include. The runtime
dispatch only affects internal PostgreSQL usage.
/* OLD */#include "storage/checksum.h"#include
"storage/checksum_impl.h"/* NEW */ #include "port/checksum.h"#include
"port/checksum_impl.h"
> https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
>
> ```
> /* checksum_impl.h uses Assert, which doesn't work outside the server */
> #undef Assert
> #define Assert(X)
>
> #include "storage/checksum.h"
> #include "storage/checksum_impl.h"
> ```
>
> Elsewhere they already have to do things like
>
> ```
> #if PG_VERSION_NUM < 110000
> " Previous Checkpoint Record: Log File (%u) Offset (0x%08x)\n"
> #endif
> ```
>
> ...so it's probably okay if they have to adjust for a new #include
> path, but I want to verify that actually works, and I don't want to
> make it any more invasive than that. As we proceed, I can volunteer to
> do the work to test that pg_filedump still builds fine with small
> changes. Feel free to try building it yourself, but I'm happy to do
> it.
I appreciate your offer to test pg_filedump compatibility. The changes
in v8 should be minimal for external programs - just the include path
updates. If you're willing to test this, it would be very valuable
validation.
>
> Oleg posted another review recently, so I won't complicate things
> further, but from a brief glance I will suggest for next time not to
> change any comments that haven't been invalidated by the patch.
>
In v8, I've been much more conservative about comment changes. I only
updated comments that were directly invalidated by the code changes
(like file path references that changed from storage/ to port/). Other
comments remain untouched unless they were factually incorrect due to
the refactoring.
> --
> John Naylor
> Amazon Web Services
Вложения
Hi Oleg,
Thank you for the detailed review on v7 patch.
On Mon, Oct 20, 2025 at 8:05 AM Oleg Tselebrovskiy
<o.tselebrovskiy@postgrespro.ru> wrote:
>
> Thanks for the new patch version!
>
> Another round of review:
>
> 1) I think that changes to contrib/pageinspect/rawpage.c should be in
> the main patch, not in the benchmark patch. Also, without those chages
> the main patch can't compile using make world-bin
>
This is already correctly handled in v8. The
contrib/pageinspect/rawpage.c change is in the main patch (v8-0001),
not in the benchmark patch. The include statement was updated from
#include "storage/checksum.h" to #include "port/checksum.h" in the
refactoring patch, which is the correct placement.
> 2) I still don't get why you check for working intrinsics in configure,
> config/c-compiler.m4 and meson.build, if your patch later uses them.
> I've gotten correct assembly code with this avx2_test function:
> #include <stdint.h>
> #if defined(__has_attribute) && __has_attribute (target)
> __attribute__((target("avx2")))
> static int avx2_test(void)
> {
> return 0;
> }
> #endif
> Please, check if this works for you and consider using something similar
>
I agree. In v8, I've simplified the configure tests significantly. The
config/c-compiler.m4 now uses exactly the pattern you suggested:
> 3) __builtin_cpu_supports doesn't work on Windows at all. We still have
> to use approach with __get_cpuid
>
Completely fixed in v8. I've removed all usage of
__builtin_cpu_supports and implemented proper cross-platform CPU
detection using __get_cpuid (Linux/GCC) and __cpuid (Windows/MSVC)
with proper preprocessor guards
> 4) Looks like you can safely remove "port/checksum_impl.h" from
> src/bin/pg_checksums/pg_checksums.c. It probably links with libpgport
> and/or libpgcommon, so it gets pg_checksum_page from there. Same with
> src/bin/pg_upgrade/file.c. Maybe those includes are "for clarity" and
> you don't need to remove them, but pg_checksums and pg_upgrade seem to
> work without them
>
In v8-0001, both files now only include "port/checksum.h". The direct
inclusion of checksum_impl.h has been removed:
src/bin/pg_checksums/pg_checksums.c: Only includes "port/checksum.h"
src/bin/pg_upgrade/file.c: Only includes "port/checksum.h"
> 5) You don't need #include <string.h> /* for memcpy */ in
> checksum_impl.h. At the very least, memcpy was used before your patch
> without string.h
>
Confirmed there's no explicit #include <string.h> in the v8
checksum_impl.h. The memcpy usage relies on the standard PostgreSQL
includes.
> 6) Why did you remove Assert(sizeof(PGChecksummablePage) == BLCKSZ)? Is
> it always false?
>
I didn't remove it - it's still present in both implementations in
v8. In both pg_checksum_block_default and pg_checksum_block_avx2, you
can see:
/* ensure that the size is compatible with the algorithm */
Assert(sizeof(PGChecksummablePage) == BLCKSZ);
> 7) Is reformatted variable declaration in pg_checksum_block_default_impl
> really needed? Is there a good reason for it? Or is it auto-formatting
> programm output?
>
Sorry, that's my mistake, In v8, I've kept the variable declarations
consistent with PostgreSQL style without unnecessary reformatting. The
declarations in both functions follow the same pattern as the original
code.
> 8) Your patch removes one whitespace in this line - for (i = 0; i <
> (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
> If you wish to fix formatting like that - please, do it in a separate
> patch. If this was done automatically by some formatting tool - please,
> revert this change
>
In v8, I've preserved the original formatting. The for loop maintains
the original spacing: for (i = 0; i < (uint32) (BLCKSZ /
(sizeof(uint32) * N_SUMS)); i++) with proper space after (uint32).
> 9) Unneeded empty string
> #define FNV_PRIME 16777619
>
> +
> /* Use a union so that this code is valid under strict aliasing */
> typedef union
>
Fixed, removed unnecessary blank lines in v8.
> 10) You need one line with just /* at the beginning of the comment, look
> at other multiline comments in this file
> + /* For now, AVX2 implementation is identical to default
> + * The compiler will auto-vectorize this with proper flags
> + * Future versions could use explicit AVX2 intrinsics here
> */
>
Fixed, it's started style with the opening /* on its own line:
/*
* AVX2-optimized block checksum algorithm.
* Same algorithm as default, but compiled with AVX2 target for
auto-vectorization.
*/
> 11) Function pg_checksum_block_simple isn't used at all.
>
There's no pg_checksum_block_simple function in v8. The implementation
only has the necessary functions: pg_checksum_block_default,
pg_checksum_block_avx2, and pg_checksum_block_choose.
> 12) Why do you need those?
> +#ifndef PG_CHECKSUM_EXTERNAL_INTERFACE
> +#define PG_CHECKSUM_EXTERNAL_INTERFACE
>
These macros are not present in v8. The implementation is cleaner
without unnecessary preprocessor guards.
> 13) Object files are added according to alphabetical order, not logical
> order (src/port/Makefile)
> pg_popcount_aarch64.o \
> pg_popcount_avx512.o \
> + checksum.o \
> pg_strong_random.o \
> pgcheckdir.o \
>
In v8, checksum.o is correctly placed in alphabetical order in the
OBJS list in src/port/Makefile:
OBJS = \
$(LIBOBJS) \
$(PG_CRC32C_OBJS) \
bsearch_arg.o \
checksum.o \
chklocale.o \
> 14) I still think that src/port/checksum.c needs to just include
> src/include/port/checksum_impl.h and have no other logic to keep
> checksum_impl.h's role as "header with full implementation"
> Now checksum_impl.h doesn't have any mention of pg_checksum_page
>
The current v8 approach has checksum.c simply include
checksum_impl.h, which maintains the "header with full implementation"
pattern you prefer. However, the function pointer mechanism and
runtime detection logic is in checksum_impl.h, which means
pg_checksum_page (the external interface) is also defined there. This
keeps the external interface clean while maintaining the
implementation details in the header.
> 15) Assembly for pg_checksum_block_choose now has full code of
> pg_checksum_block_default. This is probably a result of using inline
> functions
> Don't know if this is bad, but it is at least strange
>
I think that is expected behavior with the function pointer approach.
The compiler inlines the first call, but subsequent calls use the
cached function pointer, which is the standard PostgreSQL pattern for
runtime CPU feature detection (see CRC32C implementation).
> Also, some CFBot checks have failed. Two of them with this error/warning
> checksum.c:88:1: error: no previous prototype for ‘pg_checksum_page’
> [-Werror=missing-prototypes]
> 88 | pg_checksum_page(char *page, BlockNumber blkno)
> | ^~~~~~~~~~~~~~~~
> Please, address those
>
In v8, pg_checksum_page is declared in src/include/port/checksum.h,
which is included by checksum.c. This should resolve the missing
prototype error.
> Oleg Tselebrovskiy, PostgresPro