Re: Proposal for enabling auto-vectorization for checksum calculations

Поиск
Список
Период
Сортировка
От Andrew Kim
Тема Re: Proposal for enabling auto-vectorization for checksum calculations
Дата
Msg-id CAK64mne_oWN9d4mf+0c_5-4Emb9kRXA-OC05OJ4F_1fVqpjzDA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Proposal for enabling auto-vectorization for checksum calculations  (John Naylor <johncnaylorls@gmail.com>)
Ответы Re: Proposal for enabling auto-vectorization for checksum calculations
Список pgsql-hackers
Hi John,

Thanks for taking the time to review the v9-0001 refactoring patch and
for setting the CF entry to Needs Review.


On Fri, Nov 14, 2025 at 3:34 AM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Nov 6, 2025 at 6:50 AM Andrew Kim <tenistarkim@gmail.com> wrote:
> > The v9 patch series is attached.
>
> Thanks! BTW, I've set the commitfest entry to "Needs Review".
>
> I spent some time with the v9-0001 refactoring patch, and I have one
> observation that's worth sharing now:
>
> --- a/src/bin/pg_checksums/pg_checksums.c
> +++ b/src/bin/pg_checksums/pg_checksums.c
> @@ -29,8 +29,7 @@
>  #include "getopt_long.h"
>  #include "pg_getopt.h"
>  #include "storage/bufpage.h"
> -#include "storage/checksum.h"
> -#include "storage/checksum_impl.h"
> +#include "port/checksum.h"
>
> Outside the backend it is no longer required to include the
> implementation header -- it just links and works correctly. That seems
> like a good thing. In fact...
>
> On Tue, Oct 21, 2025 I wrote:
> > Looking at commit f04216341dd1, we have at least one example of an
> > external program, pg_filedump. If we can keep this working with
> > minimal fuss, it should be fine everywhere.
>
> > https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
> >
> > ```
> > /* checksum_impl.h uses Assert, which doesn't work outside the server */
> > #undef Assert
> > #define Assert(X)
> >
> > #include "storage/checksum.h"
> > #include "storage/checksum_impl.h"
> > ```
>
> I tried building pg_filedump against the server with just the 0001
> patch and it also builds fine leaving out the implementation:
>
> diff --git a/pg_filedump.c b/pg_filedump.c
> index 606a85b..0268381 100644
> --- a/pg_filedump.c
> +++ b/pg_filedump.c
> @@ -26,12 +26,7 @@
>
>  #include <utils/pg_crc.h>
>
> -/*     checksum_impl.h uses Assert, which doesn't work outside the server */
> -#undef Assert
> -#define Assert(X)
> -
> -#include "storage/checksum.h"
> -#include "storage/checksum_impl.h"
> +#include "port/checksum.h"
>  #include "decode.h"
>  #include <inttypes.h>
>
> I verified that it does in fact break when built against our master
> branch without the impl.h header.
>
> Further, if I replace the above CRC #include to point instead to our
> hardware-specific API in port/pg_crc32c.h, it builds fine after
> adjusting the typedef that it expects, and interprets the control file
> normally:
>
> <pg_control Contents> *********************************************
>
>                           CRC: Correct
>            pg_control Version: 1800
>               Catalog Version: 202511051
> ...
>
> I'll think more about this.
>

I've double-checked everything after applying the v9 checksum patches
and updating pg_filedump accordingly.
Following your suggestion, I removed the checksum_impl.h include and
the Assert redefinition, keeping only the port/checksum.h include.
build compiles cleanly with the new architecture, and pg_filedump
functions correctly with the AVX2 optimizations.
If you agree with this approach, I'd like to prepare a patch for
upstream submission.
Please feel free to guide me on the proper process for this. Should I
submit it to the pg_filedump repository, or would you prefer to handle
the integration as part of the v9 checksum patch series?
Thanks again for your testing and guidance!

> --
> John Naylor
> Amazon Web Services



В списке pgsql-hackers по дате отправления: