Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB barriers

From: Andres Freund
Subject: Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB barriers
Msg-id: beirrgqo5n5e73dwa4dsdnlbtef3bsdv5sgarm6przdzxvifk5@whyuhyemmhyr
In reply to: Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB barriers ("Greg Burd" <greg@burd.me>)
List: pgsql-hackers

Hi,

On 2025-11-24 11:28:28 -0500, Greg Burd wrote:
> @@ -2509,25 +2513,64 @@ int main(void)
>  }
>  '''
>  
> -  if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
> -      args: test_c_args)
> -    # Use ARM CRC Extension unconditionally
> -    cdata.set('USE_ARMV8_CRC32C', 1)
> -    have_optimized_crc = true
> -  elif cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd with -march=armv8-a+crc+simd',
> -      args: test_c_args + ['-march=armv8-a+crc+simd'])
> -    # Use ARM CRC Extension, with runtime check
> -    cflags_crc += '-march=armv8-a+crc+simd'
> -    cdata.set('USE_ARMV8_CRC32C', false)
> -    cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
> -    have_optimized_crc = true
> -  elif cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd with -march=armv8-a+crc',
> -      args: test_c_args + ['-march=armv8-a+crc'])
> -    # Use ARM CRC Extension, with runtime check
> -    cflags_crc += '-march=armv8-a+crc'
> -    cdata.set('USE_ARMV8_CRC32C', false)
> -    cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
> -    have_optimized_crc = true
> +  if cc.get_id() == 'msvc'
> +    # MSVC: Intrinsic availability check for ARM64
> +    if host_machine.cpu_family() == 'aarch64'
> +      # Test if CRC32C intrinsics are available in intrin.h
> +      crc32c_test_msvc = '''
> +        #include <intrin.h>
> +        int main(void) {
> +          uint32_t crc = 0;
> +          uint8_t data = 0;
> +          crc = __crc32cb(crc, data);
> +          return 0;
> +        }
> +      '''
> +      if cc.links(crc32c_test_msvc, name: '__crc32cb intrinsic available')
> +        cdata.set('USE_ARMV8_CRC32C', 1)
> +        have_optimized_crc = true
> +        message('Using ARM64 CRC32C hardware acceleration (MSVC)')
> +      else
> +        message('CRC32C intrinsics not available on this MSVC ARM64 build')
> +      endif

Two questions about this hunk:
a) Does this need to be conditional at all? Given that it's msvc specific, it
   seems we don't need to run a test.
b) Why is the msvc block outside of the general aarch64 block, only to then
   contain another nested aarch64 check? That seems unnecessarily complicated
   and requires reindenting far more code than necessary.


> +/*
> + * For Arm64, use __isb intrinsic. See aarch64 inline assembly definition for details.
> + */
> +#ifdef _M_ARM64
> +
> +static __forceinline void
> +spin_delay(void)
> +{
> +     /* Reference: https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics#BarrierRestrictions */
> +    __isb(_ARM64_BARRIER_SY);
> +}
> +#else
> +/*
> + * For x64, use _mm_pause intrinsic instead of rep nop.
> + */
>  static __forceinline void
>  spin_delay(void)
>  {
>      _mm_pause();
>  }

This continues to use a barrier, with a reference to a list of barrier
semantics that really doesn't seem to make a whole lot of sense in the context
of spin_delay(). If we want to emit this kind of barrier for now, that's ok
with me, but it should be documented as being a fairly arbitrary choice,
rather than justified with a link that doesn't explain anything.
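
To illustrate the kind of comment meant here, a rough sketch follows. It is
not taken from the patch: the demo_ names and the comment wording are
hypothetical, and the non-MSVC branches exist only so the snippet also
compiles with GCC or Clang.

```c
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/*
 * Hypothetical stand-in for spin_delay(); only the MSVC/ARM64 branch
 * mirrors what the patch emits.
 */
static void
demo_spin_delay(void)
{
#if defined(_MSC_VER) && defined(_M_ARM64)
	/*
	 * We only want a short stall while spinning; no ordering guarantee is
	 * relied upon here. ISB is a fairly arbitrary choice for that purpose
	 * and can be revisited if a better delay primitive turns up.
	 */
	__isb(_ARM64_BARRIER_SY);
#elif defined(_MSC_VER)
	_mm_pause();
#elif defined(__aarch64__)
	/* Assumed GCC/Clang equivalent of the branch above. */
	__asm__ __volatile__("isb" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#endif
	/* Other platforms: no delay instruction in this sketch. */
}

/*
 * Hypothetical helper: spin until *flag becomes nonzero or max_spins is
 * reached, and report how many delay iterations were executed.
 */
static int
demo_spin_until_set(const volatile int *flag, int max_spins)
{
	int			spins = 0;

	while (!*flag && spins < max_spins)
	{
		demo_spin_delay();
		spins++;
	}
	return spins;
}
```

The point is only that the comment states why the instruction is there and
that the choice is provisional, instead of pointing at a table of barrier
types.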


> +#endif
>  #else
>  static __forceinline void
>  spin_delay(void)
> @@ -623,9 +640,13 @@ spin_delay(void)
>  #include <intrin.h>
>  #pragma intrinsic(_ReadWriteBarrier)
>  
> -#define S_UNLOCK(lock)    \
> +#ifdef _M_ARM64
> +#define S_UNLOCK(lock) \
> +    do { __dmb(_ARM64_BARRIER_SY); (*(lock)) = 0; } while (0)
> +#else

This doesn't seem like the right way to implement this - why not use
InterlockedExchange(lock, 0)? That will do the write with barrier semantics.
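
Roughly what that could look like, as a sketch only. It assumes the lock word
is a long, as the Interlocked functions require; the demo_ names are
hypothetical, and the non-MSVC branch uses the GCC/Clang __atomic builtins
purely so the snippet compiles elsewhere.

```c
typedef long demo_slock_t;		/* assumed lock word type */

#if defined(_MSC_VER)
#include <intrin.h>

/* Returns the previous value: 0 means the lock was free and is now ours. */
#define DEMO_TAS(lock) \
	(_InterlockedCompareExchange((lock), 1, 0))

/* The exchange performs the store with full barrier semantics. */
#define DEMO_S_UNLOCK(lock) \
	((void) _InterlockedExchange((lock), 0))

#else

/* Stand-ins for non-MSVC compilers, sequentially consistent. */
#define DEMO_TAS(lock) \
	(__atomic_exchange_n((lock), 1, __ATOMIC_SEQ_CST))
#define DEMO_S_UNLOCK(lock) \
	((void) __atomic_exchange_n((lock), 0, __ATOMIC_SEQ_CST))

#endif

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int
demo_try_lock(volatile demo_slock_t *lock)
{
	return DEMO_TAS(lock) == 0;
}

/* Releases the lock; the store and the barrier are a single operation. */
static void
demo_unlock(volatile demo_slock_t *lock)
{
	DEMO_S_UNLOCK(lock);
}
```

Compared with a separate __dmb() followed by a plain store, this keeps the
release in one operation and avoids an ARM64-specific #ifdef around
S_UNLOCK.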

Greetings,

Andres Freund


