GCC does not have target support for -fstack-clash-protection on riscv64. As a result, some generic support is used instead, and that happens to result in miscompilation. Stack pointer register updates are not correctly ordered with regard to stack writes, so signal delivery probably does not work as expected. We need to disable this hardening feature until upstream support arrives.

Reproducible: Always
Here are some notes I wrote up for a GCC developer.

In glibc, the realpath function starts like this:

000000000003cb5e <realpath@@GLIBC_2.27>:
__GI___realpath():
   3cb5e: 81010113 addi sp,sp,-2032
   3cb62: 7d313423 sd s3,1992(sp)
   3cb66: 79fd lui s3,0xfffff
   3cb68: 7e813023 sd s0,2016(sp)
   3cb6c: 7c913c23 sd s1,2008(sp)
   3cb70: 7f010413 addi s0,sp,2032
   3cb74: 35098793 addi a5,s3,848 # fffffffffffff350 <__libc_initial+0xffffffffffe8946a>
   3cb78: 74fd lui s1,0xfffff
   3cb7a: 008789b3 add s3,a5,s0
   3cb7e: f9048793 addi a5,s1,-112 # ffffffffffffef90 <__libc_initial+0xffffffffffe890aa>
   3cb82: 008784b3 add s1,a5,s0
   3cb86: 77fd lui a5,0xfffff
   3cb88: 7d413023 sd s4,1984(sp)
   3cb8c: 7b513c23 sd s5,1976(sp)
   3cb90: 7e113423 sd ra,2024(sp)
   3cb94: 7d213823 sd s2,2000(sp)
   3cb98: 7b613823 sd s6,1968(sp)
   3cb9c: 7b713423 sd s7,1960(sp)
   3cba0: 7b813023 sd s8,1952(sp)
   3cba4: 79913c23 sd s9,1944(sp)
   3cba8: 79a13823 sd s10,1936(sp)
   3cbac: 79b13423 sd s11,1928(sp)
   3cbb0: 34878793 addi a5,a5,840 # fffffffffffff348 <__libc_initial+0xffffffffffe89462>
   3cbb4: 40000713 li a4,1024
   3cbb8: 00132a17 auipc s4,0x132
   3cbbc: ae0a3a03 ld s4,-1312(s4) # 16e698 <__stack_chk_guard>
   3cbc0: 01098893 addi a7,s3,16
   3cbc4: 42098693 addi a3,s3,1056
   3cbc8: b8040a93 addi s5,s0,-1152
   3cbcc: 97a2 add a5,a5,s0
   3cbce: 000a3603 ld a2,0(s4)
   3cbd2: f8c43423 sd a2,-120(s0)
   3cbd6: 4601 li a2,0
   3cbd8: 3d14b023 sd a7,960(s1)
   3cbdc: 3ce4b423 sd a4,968(s1)
   3cbe0: 7cd4b823 sd a3,2000(s1)
   3cbe4: 7ce4bc23 sd a4,2008(s1)
   3cbe8: b7543823 sd s5,-1168(s0)
   3cbec: b6e43c23 sd a4,-1160(s0)
   3cbf0: e38c sd a1,0(a5)
   3cbf2: b0010113 addi sp,sp,-1280

I can't read RISC-V assembly, but it seems to me that the frame setup completes at the last instruction, at 3cbf2. The total frame size is expected to be larger than 3 KiB (three scratch buffers are on the stack).
It looks like valgrind flags the access at 3cbd8 as below the stack pointer:

==1791141== Invalid write of size 8
==1791141==    at 0x485DBD8: realpath@@GLIBC_2.27 (scratch_buffer.h:77)
==1791141==    by 0x4801167: main (valgrind-test.c:43)
==1791141==  Address 0x1ffefff080 is on thread 1's stack
==1791141==  1216 bytes below stack pointer

While this might be fine for stack-clash probes, there are further accesses around this location which I think are below SP as well, and I doubt that all of them are stack-clash probes. Some of them have got to be local variable setup. Those variables get clobbered if a signal handler runs between the initialization and the completion of the frame setup at 3cbf2.
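The "below stack pointer" distances can be cross-checked by hand from the disassembly. The following is a minimal sketch, not part of the original analysis: it assumes standard RV64 semantics for lui and addi, treats the stack pointer at function entry as 0, and uses hypothetical helper names (sext, lui, store_addresses, bytes_below_sp). The offsets themselves are copied from the listing in this report.

```python
def sext(value, bits):
    """Sign-extend a `bits`-wide value to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def lui(imm20):
    """RV64 lui: place imm20 in bits 31..12, then sign-extend from 32 bits."""
    return sext((imm20 << 12) & 0xFFFFFFFF, 32)


ENTRY_SP = 0                                  # SP at function entry (origin)
SP_AFTER_FIRST_ADJUST = ENTRY_SP - 2032       # 3cb5e: addi sp,sp,-2032
FINAL_SP = SP_AFTER_FIRST_ADJUST - 1280       # 3cbf2: addi sp,sp,-1280


def store_addresses():
    """Addresses (relative to entry SP) written through s1 and a5 before 3cbf2."""
    s0 = SP_AFTER_FIRST_ADJUST + 2032         # 3cb70: addi s0,sp,2032
    s1 = lui(0xFFFFF) - 112 + s0              # 3cb78, 3cb7e, 3cb82
    a5 = lui(0xFFFFF) + 840 + s0              # 3cb86, 3cbb0, 3cbcc
    return {
        0x3CBD8: s1 + 960,                    # sd a7,960(s1)
        0x3CBDC: s1 + 968,                    # sd a4,968(s1)
        0x3CBE0: s1 + 2000,                   # sd a3,2000(s1)
        0x3CBE4: s1 + 2008,                   # sd a4,2008(s1)
        0x3CBF0: a5,                          # sd a1,0(a5)
    }


def bytes_below_sp():
    """Distance of each store below the SP value in effect when it executes."""
    return {pc: SP_AFTER_FIRST_ADJUST - addr
            for pc, addr in store_addresses().items()}


if __name__ == "__main__":
    for pc, below in bytes_below_sp().items():
        print(f"{pc:x}: {below} bytes below SP")
    print("final frame size:", ENTRY_SP - FINAL_SP)
```

Under these assumptions, the stores at 3cbd8 and 3cbdc come out 1216 and 1208 bytes below SP, matching the valgrind reports. All five addresses fall inside the final 3312-byte frame, which is consistent with them being local variable setup performed before the frame is fully allocated.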
Florian is correct. RISC-V does not currently have backend support for -fstack-clash-protection. It's on the TODO list for 2024 and will likely be handled by me or someone on my team. While I'd like to have it in time for gcc-14, there's just not enough time. Once we've got something working, I'll happily coordinate with Jakub to determine whether the bits are backportable into Fedora. Until the backend bits are available, disabling stack-clash in redhat-rpm-config for rv64 definitely makes sense. I haven't dug into Florian's example, but I would hazard a guess that it's falling back to the Ada -fstack-check path, given the lack of -fstack-clash-protection support. As such, it'll have all the limitations/flaws of -fstack-check that I outlined in an old blog post.
This bug appears to have been reported against 'rawhide' during the Fedora Linux 40 development cycle. Changing version to 40.
The original valgrind error was during the glibc valgrind smoke test:

'GCONV_PATH=…/build-riscv64-redhat-linux/iconvdata LOCPATH=…/build-riscv64-redhat-linux/localedata LC_ALL=C' \
'…/build-riscv64-redhat-linux:…/build-riscv64-redhat-linux/math:…/build-riscv64-redhat-linux/elf:…/build-riscv64-redhat-linux/dlfcn:…/build-riscv64-redhat-linux/nss:…/build-riscv64-redhat-linux/nis:…/build-riscv64-redhat-linux/rt:…/build-riscv64-redhat-linux/resolv:…/build-riscv64-redhat-linux/mathvec:…/build-riscv64-redhat-linux/support:…/build-riscv64-redhat-linux/nptl' \
…/build-riscv64-redhat-linux/elf/valgrind-test \
> …/build-riscv64-redhat-linux/elf/tst-valgrind-smoke.out; \
../scripts/evaluate-test.sh elf/tst-valgrind-smoke $? false \
false > …/build-riscv64-redhat-linux/elf/tst-valgrind-smoke.test-result
==1791141== Invalid write of size 8
==1791141==    at 0x485DBD8: realpath@@GLIBC_2.27 (scratch_buffer.h:77)
==1791141==    by 0x4801167: main (valgrind-test.c:43)
==1791141==  Address 0x1ffefff080 is on thread 1's stack
==1791141==  1216 bytes below stack pointer
==1791141==
==1791141== Invalid write of size 8
==1791141==    at 0x485DBDC: realpath@@GLIBC_2.27 (scratch_buffer.h:78)
==1791141==    by 0x4801167: main (valgrind-test.c:43)
==1791141==  Address 0x1ffefff088 is on thread 1's stack
==1791141==  1208 bytes below stack pointer

I don't have access to the build log anymore.
Note that valgrind upstream doesn't support RISC-V yet (although there are unreviewed patches): https://bugs.kde.org/show_bug.cgi?id=468575

That said, even if it did, it seems a good idea to disable -fstack-clash-protection on RISC-V as long as GCC upstream doesn't support it on that architecture.
Oops, I forgot to close the bug. I made the change back in October 2023.
Jeff, with c65046ff2ef0a9a46e59bc0b3369b2d226f6a239 [0] in GCC 14.1, are we fine re-enabling -fstack-clash-protection with the GCC toolchain? I am modifying redhat-rpm-config for Fedora 41 and remembered this bug. I would love to re-enable this before sending a large number of builds.

[0] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=c65046ff2ef