Re: Trying out
| От | Greg Burd |
|---|---|
| Тема | Re: Trying out |
| Дата | |
| Msg-id | CC76554F-41AC-45AA-AF10-370FEC416498@greg.burd.me обсуждение исходный текст |
| Ответ на |
Re: Trying out |
| Список | pgsql-hackers |
On Nov 23 2025, at 4:08 pm, Thomas Munro <thomas.munro@gmail.com> wrote: > On Mon, Nov 24, 2025 at 4:23 AM Greg Burd <greg@burd.me> wrote: >> As mentioned on a separate thread about fixing ARM64 support when >> building with MSVC on Win11 [1] I tried out this patch. The reply on >> that thread had an issue with _mm_pause() in spin_delay(), it turns >> out we need to use __yield() [2]. I went ahead and fixed that, so >> ignore that patch on the other thread [1]. The new patch attached >> that layers on top of your work and supports that platform, there was >> one minor change that was required: >> >> >> #ifdef _MSC_VER >> >> /* >> * If using Visual C++ on Win64, inline assembly is >> unavailable. Use a >> * _mm_pause intrinsic instead of rep nop. For ARM64, use the __yield() >> * intrinsic which emits the YIELD instruction as a hint to >> the processor. >> */ >> #if defined(_M_ARM64) >> __yield(); >> #elif defined(_WIN64) >> _mm_pause(); >> #else >> /* See comment for gcc code. Same code, MASM syntax */ >> __asm rep nop; >> #endif >> #endif /* _MSC_VER */ > > That makes more intuitive sense... but I didn't know that people *do* > sometimes prefer instruction synchronisation barriers for spinlock > delays: > > https://stackoverflow.com/questions/70810121/why-does-hintspin-loop-use-isb-on-aarch64 > > When reading your patch I was pretty confused by that, because it said > it was fixing a barrier problem and apparently doing so in an > unprincipled place. I guess we really need to research the best delay > mechanism for our needs on this architecture, independently of the > compiler being used, and then write matching GCC and Visual Studio > versions of that? I think there were some threads about spinlock > performance on Linux + Graviton with graphs and theories... Interesting, I think I was rushing to get past that compile issue rather than optimizing. This sounds like yet another place where we should choose based on arch and it seems hint::spin_loop() does. >> tests are passing, best. > > Great news! Thanks. It sounds like if I could supply the missing > credible evidence of codegen quality on... all the computers, then I > think we'd be down to just: when can we pull the trigger and require > Visual Studio 2022 and do we trust /experimental:c11atomics? I'm in favor of the stdatomic approach. I can't speak to codegen quality on *all the platforms* or how *experimental* c11 atomics are when using MSVC. > FTR I had earlier shared some version of this patch with Dave when he > was trying to get his Windows/ARM system going, but I think my earlier > version was probably too broken. Sorry Dave. At that stage I was > also trying to do it as an option but keeping the existing stuff > around. Since then we adopted C11, so this is the all-in version. I > also hadn't understood a key part of the C11 memory model that your > RISC-V animal taught me and that c5d34f4a fixed, and you can see in > this patch set too, and I'm not sure if Visual Studio is like GCC or > Clang in that respect. Thanks for that work on RISC-V, I appreciate that! Much more digging to be done to answer those questions for sure. > It crossed my mind that this might even be > related to the problem you've noticed with barriers being missing, but > I haven't looked into that. BTW I believe we could actually change > our code NOT to rely on that, ie to follow the C11 memory model better > and declare eg PgAioHandle::status as atomic_uint8 or whatever (other > non-atomic access would be considered dependent and do the right thing > IIUC), but I'm not sure if it's necessary and that research project > can wait. Interesting. Yeah, let's do one thing and then move to the next, but I do like the idea. best. -greg
В списке pgsql-hackers по дате отправления: