Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers

Поиск
Список
Период
Сортировка
От Greg Burd
Тема Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
Дата
Msg-id b523fa2d-7683-40bc-bd94-5d4fd4264d46@app.fastmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Mon, Dec 15, 2025, at 5:32 PM, Andres Freund wrote:
> Hi,
>
> On 2025-12-15 15:38:49 -0600, Nathan Bossart wrote:
>> +++ b/meson.build
>> @@ -2523,7 +2523,8 @@ int main(void)
>>  }
>>  '''
>>  
>> -  if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>> +  if (host_cpu == 'aarch64' and cc.get_id() == 'msvc') or \
>> +        cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>>        args: test_c_args)
>>      # Use ARM CRC Extension unconditionally
>>      cdata.set('USE_ARMV8_CRC32C', 1)
>
> I still think this should have a comment explaining that we can
> unconditionally rely on crc32 support due to window's baseline requirements.
>
>
>> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
>> index 7f8f566bd40..e62141abf0a 100644
>> --- a/src/include/storage/s_lock.h
>> +++ b/src/include/storage/s_lock.h
>> @@ -602,13 +602,21 @@ typedef LONG slock_t;
>>  
>>  #define SPIN_DELAY() spin_delay()
>>  
>> -/* If using Visual C++ on Win64, inline assembly is unavailable.
>> - * Use a _mm_pause intrinsic instead of rep nop.
>> - */
>> -#if defined(_WIN64)
>> +#ifdef _M_ARM64
>> +static __forceinline void
>> +spin_delay(void)
>> +{
>> +    /* Research indicates ISB is better than __yield() on AArch64. */
>> +    __isb(_ARM64_BARRIER_SY);
>
> It'd be good to link the research in some form or another. Otherwise it's
> harder to evolve the code in the future, because we don't know if the research
> was "I liked the color better" or "one is catastrophically slower than the
> other".
>
>> +}
>> +#elif defined(_WIN64)
>>  static __forceinline void
>>  spin_delay(void)
>>  {
>> +    /*
>> +     * If using Visual C++ on Win64, inline assembly is unavailable.
>> +     * Use a _mm_pause intrinsic instead of rep nop.
>> +     */
>>      _mm_pause();
>>  }
>>  #else
>> @@ -621,12 +629,19 @@ spin_delay(void)
>>  #endif
>>  
>>  #include <intrin.h>
>> +#ifdef _M_ARM64
>> +#pragma intrinsic(_InterlockedExchange)
>> +
>> +/* _ReadWriteBarrier() is insufficient on non-TSO architectures. */
>> +#define S_UNLOCK(lock) _InterlockedExchange(lock, 0)
>> +#else
>>  #pragma intrinsic(_ReadWriteBarrier)
>>  
>>  #define S_UNLOCK(lock)    \
>>      do { _ReadWriteBarrier(); (*(lock)) = 0; } while (0)
>>  
>>  #endif
>> +#endif
>
> The newline placement looks odd here. I'd add newlines around the #else.

Hey Andres, thanks for chiming in!  I agree with your suggestions and it looks like Nathan has already addressed them
toyour satisfaction.  I've applied v11 on my "unicorn" Win11 ARM64 MSVC system and it's testing now.
 

best.

-greg

> Greetings,
>
> Andres Freund



В списке pgsql-hackers по дате отправления: