Bug 2269010
| Summary: | clang-18 seems to miscompile systemd code, unit tests fail | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Zbigniew Jędrzejewski-Szmek <zbyszek> |
| Component: | clang | Assignee: | Tom Stellard <tstellar> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 40 | CC: | agurenko, airlied, jchecahi, kkleine, npopov, sergesanspaille, siddharth.kde, tbaeder, tstellar, tuliom |
| Target Milestone: | --- | Flags: | tstellar: mirror+ |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Last Closed: | 2024-05-28 08:52:49 UTC | Type: | --- |
|
Description
Zbigniew Jędrzejewski-Szmek
2024-03-11 13:58:05 UTC
Standalone reproducer: https://paste.centos.org/view/610fd24f

This aborts when using clang -O2 (clang 19 here), works with -O0 and with gcc (both O0 and O2).

CC @nikic since it only happens with optimizations.

The main function just gets optimized to:
; Function Attrs: noreturn nounwind uwtable
define dso_local noundef i32 @main() local_unnamed_addr #1 {
tail call void @abort() #3
unreachable
}
This is not related to LLVM optimizations; the difference also exists with -O2 -Xclang -disable-llvm-optzns.

Looking at the diff between the two, I think the relevant part is that the union is initialized differently. With -O0 there is a copy from a constant (of the whole union). With -O2 only the first member of the union is initialized and undef values are stored to the rest.

I think this is https://github.com/llvm/llvm-project/issues/78034. Clang's interpretation of how union initialization with = {} works (chosen when this was a language extension, and I believe in a way that aligns with C++ semantics) does not match what was standardized in C23, which requires this to actually zero the whole union rather than just the first member.

Based on the discussion there, a full fix for that issue looks quite involved, but maybe we can figure out which change to clang caused the codegen difference here (going from memcpy to full expansion) and revert it to somewhat mitigate the problem.

I had a hunch, and it looks like this patch does indeed "fix" the issue: https://github.com/llvm/llvm-project/pull/84230 (It doesn't really fix anything, but returns it to the status quo of mostly working out in practice.)

Is this a blocker for getting clang 18 into f40?

That is a good question. Do you think that the issue affects other code? If it's at least moderately common, I don't think we should publish a beta candidate with that version of the compiler.

I have not seen this issue come up in our rebuild testing, so it does not seem to be widespread. Does systemd need clang to build in Fedora, or is this failure only in upstream CI?

It seems we have a workaround that we could backport now. I could pull this workaround into f40 now if we are going to block the update on it, but if we aren't going to block the update, I'd prefer to wait for something to land upstream rather than doing something Fedora-specific.

No, systemd does not use clang. We compile it with clang for testing only, and in this particular case, I did it to test the clang update on Fedora.

OK, so let's merge this.

LLVM with the fix has been updated in rawhide and f40 since, so this should be fixed.
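
To illustrate the union-initialization point discussed above, here is a minimal sketch in C. It is not the reproducer from the paste link, which is not shown in this report. The union layout and the helper names (`union sample`, `all_bytes_zero`, `init_zeroed`, `brace_init_is_fully_zero`) are hypothetical and chosen only for illustration. The sketch assumes a compiler that accepts `= {}` in C, either as an extension or under C23.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union: the first member is much smaller than the union,
 * so "initialize only the first member" leaves most bytes untouched. */
union sample {
    char small;
    unsigned char big[16];
};

/* Returns 1 if every byte of *u is zero, 0 otherwise. */
static int all_bytes_zero(const union sample *u) {
    static const unsigned char zeros[sizeof(union sample)]; /* static storage is zeroed */
    return memcmp(u, zeros, sizeof *u) == 0;
}

/* Compiler-independent way to zero the whole object, not just the first member. */
static void init_zeroed(union sample *u) {
    memset(u, 0, sizeof *u);
}

/* Empty-brace initialization, the form discussed in this bug. Whether the
 * bytes beyond `small` are zero depends on the compiler's interpretation,
 * so the result of this function is deliberately not asserted anywhere. */
static int brace_init_is_fully_zero(void) {
    union sample u = {};
    return all_bytes_zero(&u);
}

/* Self-check for the memset path only, which is well defined everywhere. */
static void check_memset_path(void) {
    union sample u;
    init_zeroed(&u);
    assert(all_bytes_zero(&u));
    (void)brace_init_is_fully_zero; /* referenced to avoid an unused-function warning */
}
```

Calling `check_memset_path()` should pass with any conforming compiler. `brace_init_is_fully_zero()` can return different values depending on compiler version and optimization level, which is the kind of discrepancy this report describes.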