Discussion: postgres has no spinlock support on riscv rv64imafdc
Running on a SiFive HiFive Unleashed RISC-V board with uname -a of:
Linux freedom-u540 5.2.9 #1 SMP Tue Oct 15 19:54:42 UTC 2019 riscv64 riscv64 riscv64 GNU/Linux
where each processor is:
processor : 0
hart : 1
isa : rv64imafdc
mmu : sv39
uarch : sifive,u54-mc
hart : 1
isa : rv64imafdc
mmu : sv39
uarch : sifive,u54-mc
Building Postgres from scratch on the riscv board, no fancy cross compilation, I get:
PostgreSQL does not have native spinlock support on this platform. To continue the compilation, rerun configure using --disable-spinlocks. However, performance will be poor. Please report this to pgsql-bugs@lists.postgresql.org.
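The workaround the message suggests is a configure switch that falls back to semaphore-based spinlock emulation. A build along those lines looks roughly like this (an illustrative sketch of a stock source build, not taken from the report):

```shell
# Fallback build on a platform without native spinlock support.
# --disable-spinlocks substitutes semaphore-based emulation, which is
# correct but noticeably slower under contention.
./configure --disable-spinlocks
make
make install
```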
Robert Henry <rrh.henry@gmail.com> writes:
> Building Postgres from scratch on the riscv board, no fancy cross
> compilation, I get:
> PostgreSQL does not have native spinlock support on this platform. To
> continue the compilation, rerun configure using --disable-spinlocks.
> However, performance will be poor. Please report this to
> pgsql-bugs@lists.postgresql.org.

The last we heard about RISC-V was this thread:

https://www.postgresql.org/message-id/flat/20161119230312.GA8656%40redhat.com

If things seem to work well now with --disable-spinlocks, maybe you
could test the patch suggested in that thread.

TBH, though, my preference would be for some assembly code rather than
relying on GCC builtins as Richard's patch did.

			regards, tom lane
Hi,

On 2019-10-18 09:00:23 +0200, Tom Lane wrote:
> TBH, though, my preference would be for some assembly code rather than
> relying on GCC builtins as Richard's patch did.

-1. I think there's good arguments for using inline assembly on
platforms where we've done so historically, and where we have to support
old compilers without good support for intrinsics/builtins. But I see
very little reason for adding more such cases for newer platforms -
doing so correctly and efficiently is substantial work and fragile.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2019-10-18 09:00:23 +0200, Tom Lane wrote:
>> TBH, though, my preference would be for some assembly code rather than
>> relying on GCC builtins as Richard's patch did.

> -1. I think there's good arguments for using inline assembly on
> platforms where we've done so historically, and where we have to support
> old compilers without good support for intrinsics/builtins. But I see
> very little reason for adding more such cases for newer platforms -
> doing so correctly and efficiently is substantial work and fragile.

The reason I'm skeptical of that line of argument is that gcc's track
record for support of these intrinsics on non-mainstream architectures
is just sucky.  Now maybe, somebody was careful and it all works great
on RISC-V.  But IMO, the burden of proof is to show that the intrinsics
work, not to show that they don't.

I recall Noah's recent argument in a related context that with an
asm implementation, anybody with a copy of the architecture manual
can review/verify the code; and such a verification doesn't depend
on which compiler version you're using.  If we depend on gcc intrinsics,
we've basically got zero confidence about anything except from testing.
Yeah, you could make a similar argument about any sort of compiler
bug ... but gcc has *earned* my lack of trust in this particular area.

			regards, tom lane
Hi,

On 2019-10-20 00:23:07 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-10-18 09:00:23 +0200, Tom Lane wrote:
> >> TBH, though, my preference would be for some assembly code rather than
> >> relying on GCC builtins as Richard's patch did.
>
> > -1. I think there's good arguments for using inline assembly on
> > platforms where we've done so historically, and where we have to support
> > old compilers without good support for intrinsics/builtins. But I see
> > very little reason for adding more such cases for newer platforms -
> > doing so correctly and efficiently is substantial work and fragile.
>
> The reason I'm skeptical of that line of argument is that gcc's track
> record for support of these intrinsics on non-mainstream architectures
> is just sucky. Now maybe, somebody was careful and it all works great
> on RISC-V. But IMO, the burden of proof is to show that the intrinsics
> work, not to show that they don't.

I agree there've been problems. But I don't think one can make a lot of
conclusions from the intrinsics quality for dying platforms when judging
new platforms. Most if not all new platforms with a gcc port that PG
would be run on are going to be multi-core platforms - if intrinsics are
broken, there'll be more problems. Furthermore, we're fully exposed to
the intrinsics quality due to atomics anyway - a lot more even, because
there's a shrinking amount of contended spinlocks, and a lot more
contended lwlocks etc (which use atomics). And lastly, our spinlock
implementations have been far from perfect too.

Greetings,

Andres Freund
On Saturday, October 19, 2019, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> On 2019-10-18 09:00:23 +0200, Tom Lane wrote:
>> TBH, though, my preference would be for some assembly code rather than
>> relying on GCC builtins as Richard's patch did.
> -1. I think there's good arguments for using inline assembly on
> platforms where we've done so historically, and where we have to support
> old compilers without good support for intrinsics/builtins. But I see
> very little reason for adding more such cases for newer platforms -
> doing so correctly and efficiently is substantial work and fragile.
The reason I'm skeptical of that line of argument is that gcc's track
record for support of these intrinsics on non-mainstream architectures
is just sucky. Now maybe, somebody was careful and it all works great
on RISC-V. But IMO, the burden of proof is to show that the intrinsics
work, not to show that they don't.
I recall Noah's recent argument in a related context that with an
asm implementation, anybody with a copy of the architecture manual
can review/verify the code; and such a verification doesn't depend
on which compiler version you're using. If we depend on gcc intrinsics,
we've basically got zero confidence about anything except from testing.
Do our regression tests exercise the quality of the intrinsic implementation?
merlin
Merlin Moncure <mmoncure@gmail.com> writes:
> On Saturday, October 19, 2019, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The reason I'm skeptical of that line of argument is that gcc's track
>> record for support of these intrinsics on non-mainstream architectures
>> is just sucky. Now maybe, somebody was careful and it all works great
>> on RISC-V. But IMO, the burden of proof is to show that the intrinsics
>> work, not to show that they don't.

> Do our regression tests exercise the quality of the intrinsic
> implementation?

Only indirectly, to the extent that a broken intrinsic might result in
misbehavior of concurrent operations.  If you dig through the git
history for s_lock.h and the atomics files, you can find plenty of
evidence that it can take awhile for us to notice bugs in this area.

The git history is pretty illuminating, actually (he says having just
searched through it).  There is plenty of evidence there that both the
manually-written-asm and the compiler-intrinsics approaches can have
nasty, hard-to-pin-down bugs.  The thing that makes me prefer the
former is that once we do identify a bug, we can fix it.

			regards, tom lane
On Fri, Dec 13, 2019 at 10:59 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-10-20 00:23:07 -0400, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > On 2019-10-18 09:00:23 +0200, Tom Lane wrote:
> > >> TBH, though, my preference would be for some assembly code rather than
> > >> relying on GCC builtins as Richard's patch did.
> >
> > > -1. I think there's good arguments for using inline assembly on
> > > platforms where we've done so historically, and where we have to support
> > > old compilers without good support for intrinsics/builtins. But I see
> > > very little reason for adding more such cases for newer platforms -
> > > doing so correctly and efficiently is substantial work and fragile.
> >
> > The reason I'm skeptical of that line of argument is that gcc's track
> > record for support of these intrinsics on non-mainstream architectures
> > is just sucky. Now maybe, somebody was careful and it all works great
> > on RISC-V. But IMO, the burden of proof is to show that the intrinsics
> > work, not to show that they don't.
>
> I agree there've been problems. But I don't think one can make a lot of
> conclusions from the intrinsics quality for dying platforms when judging
> new platforms. Most if not all new platforms with a gcc port that PG
> would be run on are going to be multi-core platforms - if intrinsics are
> broken, there'll be more problems. Furthermore, we're fully exposed to
> the intrinsics quality due to atomics anyway - a lot more even, because
> there's a shrinking amount of contended spinlocks, and a lot more
> contended lwlocks etc (which use atomics). And lastly, our spinlock
> implementations have been far from perfect too.

As a data point: the obvious change builds and passes basic testing
with flying colours, on FreeBSD + clang 11 running under qemu.  RISC-V
has been "non-experimental" since clang 9.

FWIW I've signed up to the preorder list to try to get my hands on
some BeagleV hardware to test this stuff for real ASAP...

-#if defined(__arm__) || defined(__arm) || defined(__aarch64__) || defined(__aarch64)
+#if defined(__arm__) || defined(__arm) || defined(__aarch64__) || defined(__aarch64) || defined(__riscv)
Thomas Munro <thomas.munro@gmail.com> writes:
> As a data point: the obvious change builds and passes basic testing
> with flying colours, on FreeBSD + clang 11 running under qemu. RISC-V
> has been "non-experimental" since clang 9. FWIW I've signed up to the
> preorder list to try to get my hands on some BeagleV hardware to test
> this stuff for real ASAP...

Yeah, I'm on that list too ... crickets so far.

			regards, tom lane
On Fri, Apr 16, 2021 at 2:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > As a data point: the obvious change builds and passes basic testing
> > with flying colours, on FreeBSD + clang 11 running under qemu. RISC-V
> > has been "non-experimental" since clang 9. FWIW I've signed up to the
> > preorder list to try to get my hands on some BeagleV hardware to test
> > this stuff for real ASAP...
>
> Yeah, I'm on that list too ... crickets so far.

I passed this change on to the brave souls who are bringing up the
whole universe of open source software on this new architecture, in
the FreeBSD ports tree.  PostgreSQL's failure to compile was blocking
hundreds of dependent packages, so it seemed good to at least try
something at this early stage (other databases are still blockers;
apparently you can trust database hackers to use weird arch-specific
stuff).  It'll be interesting to see if any reports of brokenness come
in from that, but for now I just know that a whole mountain of new
stuff now builds there.

One thing I noticed while worrying about whether it might have
insufficiently strong barriers is that Clang and GCC are generating
different code:

__sync_lock_test_and_set():

Clang 11:
        amoswap.w.aqrl  a0,a1,(a0)

GCC 9:
        amoswap.w.aq    a4,a4,(a5)

__sync_lock_release():

Clang 11:
        fence   rw,w
        li      a1,0
        sw      a1,0(a0)

GCC 9:
        fence   iorw,ow
        amoswap.w zero,zero,(a5)

S_LOCK() looks a bit like figure A7 of the manual[1], with at least
acquire semantics (but also release on Clang), and S_UNLOCK() looks a
bit like figure A8, with an explicit release.  I don't immediately
know why acquire and release semantics wouldn't be enough here.
Anyone?

I do see that ARM, Itanium and PowerPC include a full barrier, though
in different places: on ARM, __sync_lock_test_and_set() generates
ldxr, stxr, dmb (= full memory barrier) rather than using ldaxr (=
acquire) and stlxr (= release), and this seems to be since [2], which
is quite interesting and might be a clue; on Itanium and PowerPC, it's
explicit in S_UNLOCK().

Some of the questions I have are: can anyone with real hardware access
reproduce the failure reported back in 2016[3], or was that entirely
due to the pre-alpha toolchain?  If it can be reproduced, is it with
GCC only and not Clang, and does adding an initial
__sync_synchronize() help?

[1] https://riscv.org/wp-content/uploads/2019/06/riscv-spec.pdf
[2] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=a96297a28065d27559d17ebe1e0eda308a05e965
[3] https://www.postgresql.org/message-id/20161120184942.GP11243%40redhat.com