Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers
| От | Greg Burd |
|---|---|
| Тема | Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers |
| Дата | |
| Msg-id | b523fa2d-7683-40bc-bd94-5d4fd4264d46@app.fastmail.com обсуждение исходный текст |
| Ответ на | Re: [PATCH] Fix ARM64/MSVC atomic memory ordering issues on Win11 by adding explicit DMB ?barriers (Andres Freund <andres@anarazel.de>) |
| Список | pgsql-hackers |
On Mon, Dec 15, 2025, at 5:32 PM, Andres Freund wrote:
> Hi,
>
> On 2025-12-15 15:38:49 -0600, Nathan Bossart wrote:
>> +++ b/meson.build
>> @@ -2523,7 +2523,8 @@ int main(void)
>> }
>> '''
>>
>> - if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>> + if (host_cpu == 'aarch64' and cc.get_id() == 'msvc') or \
>> + cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',
>> args: test_c_args)
>> # Use ARM CRC Extension unconditionally
>> cdata.set('USE_ARMV8_CRC32C', 1)
>
> I still think this should have a comment explaining that we can
> unconditionally rely on crc32 support due to window's baseline requirements.
>
>
>> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
>> index 7f8f566bd40..e62141abf0a 100644
>> --- a/src/include/storage/s_lock.h
>> +++ b/src/include/storage/s_lock.h
>> @@ -602,13 +602,21 @@ typedef LONG slock_t;
>>
>> #define SPIN_DELAY() spin_delay()
>>
>> -/* If using Visual C++ on Win64, inline assembly is unavailable.
>> - * Use a _mm_pause intrinsic instead of rep nop.
>> - */
>> -#if defined(_WIN64)
>> +#ifdef _M_ARM64
>> +static __forceinline void
>> +spin_delay(void)
>> +{
>> + /* Research indicates ISB is better than __yield() on AArch64. */
>> + __isb(_ARM64_BARRIER_SY);
>
> It'd be good to link the research in some form or another. Otherwise it's
> harder to evolve the code in the future, because we don't know if the research
> was "I liked the color better" or "one is catastrophically slower than the
> other".
>
>> +}
>> +#elif defined(_WIN64)
>> static __forceinline void
>> spin_delay(void)
>> {
>> + /*
>> + * If using Visual C++ on Win64, inline assembly is unavailable.
>> + * Use a _mm_pause intrinsic instead of rep nop.
>> + */
>> _mm_pause();
>> }
>> #else
>> @@ -621,12 +629,19 @@ spin_delay(void)
>> #endif
>>
>> #include <intrin.h>
>> +#ifdef _M_ARM64
>> +#pragma intrinsic(_InterlockedExchange)
>> +
>> +/* _ReadWriteBarrier() is insufficient on non-TSO architectures. */
>> +#define S_UNLOCK(lock) _InterlockedExchange(lock, 0)
>> +#else
>> #pragma intrinsic(_ReadWriteBarrier)
>>
>> #define S_UNLOCK(lock) \
>> do { _ReadWriteBarrier(); (*(lock)) = 0; } while (0)
>>
>> #endif
>> +#endif
>
> The newline placement looks odd here. I'd add newlines around the #else.
Hey Andres, thanks for chiming in! I agree with your suggestions and it looks like Nathan has already addressed them
toyour satisfaction. I've applied v11 on my "unicorn" Win11 ARM64 MSVC system and it's testing now.
best.
-greg
> Greetings,
>
> Andres Freund
В списке pgsql-hackers по дате отправления: