Bug 2283132

Summary: rust-proptest-derive: FTBFS on s390x due to OOM
Product: [Fedora] Fedora
Reporter: Ben Beasley <code>
Component: llvm
Assignee: Tom Stellard <tstellar>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 42
CC: amulhern, decathorpe, dmalcolm, igor.raits, jakub, jchecahi, jistone, kkleine, npopov, rust-sig, scottt.tw, sergesanspaille, siddharth.kde, suraj.ghimire7, tbaeder, TicoTimo, tstellar, tuliom
Target Milestone: ---
Flags: jistone: mirror+
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-04-18 16:18:50 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2301264
Attachments:
Description        Flags
reproducer         none
prior reference    none

Description Ben Beasley 2024-05-24 18:29:59 UTC
On s390x, rust-proptest-derive now fails to build from source because it runs out of memory.

   Compiling proptest-derive v0.4.0 (/builddir/build/BUILD/proptest-derive-0.4.0)
     Running `/usr/bin/rustc --crate-name proptest_derive --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debuginfo=2 -C metadata=d3eabb4646f54b6e -C extra-filename=-d3eabb4646f54b6e --out-dir /builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps -L dependency=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps --extern proc_macro2=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libproc_macro2-399dd7be279807fc.rlib --extern quote=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libquote-e3c4a9152680cd84.rlib --extern syn=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libsyn-7bc94960062f99ef.rlib --extern proc_macro -Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn`
error: could not compile `proptest-derive` (lib)
Caused by:
  process didn't exit successfully: `/usr/bin/rustc --crate-name proptest_derive --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debuginfo=2 -C metadata=d3eabb4646f54b6e -C extra-filename=-d3eabb4646f54b6e --out-dir /builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps -L dependency=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps --extern proc_macro2=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libproc_macro2-399dd7be279807fc.rlib --extern quote=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libquote-e3c4a9152680cd84.rlib --extern syn=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libsyn-7bc94960062f99ef.rlib --extern proc_macro -Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn` (signal: 9, SIGKILL: kill)
error: Bad exit status from /var/tmp/rpm-tmp.4ly1Hg (%build)

This is probably due to a bug, as the memory usage seems excessive compared to other platforms. I observed the crate building in a fedora-rawhide-s390x mock chroot, under qemu-user-static emulation, and saw the memory usage gradually but steadily grow to 30 GB resident and beyond. (The build is still running.) Note that a typical s390x builder has ~25GB of memory.

I can reproduce this problem on F41/Rawhide, F40, and F39, but not on EPEL9.

Reproducible: Always

Comment 1 Fabio Valentini 2024-05-26 15:07:01 UTC
This package built fine on s390x up until the F40 mass rebuild, so it must have been a change that landed later than that.
The fact that this builds fine on EPEL9 makes me think it's caused by a change in Rust 1.76+.

Comment 2 Josh Stone 2024-05-28 22:09:54 UTC
I can reproduce this on F40 with rust-1.78.0, while 1.77.0 compiles with just a little over 2GB memory used at peak.

The "good" news is that I can also reproduce this with upstream (rustup) toolchains between 1.77 and 1.78, so it's not just Fedora.

Comment 3 Josh Stone 2024-05-30 18:42:11 UTC
cargo-bisect-rustc narrowed it down to this change:
https://github.com/rust-lang/rust/pull/117206

But that doesn't necessarily mean that the MIR JumpThreading pass is buggy, especially since this only appears on s390x. AFAICT, the memory growth is all in the LLVM backend. I'll try to capture the IR and see if it's reproducible directly in LLVM tools.

Comment 4 Josh Stone 2024-06-04 21:08:03 UTC
Created attachment 2036338
reproducer

This LLVM bitcode behaves poorly with "opt -O3". On my 64GB system, it climbs to about 20GB of memory and chews CPU for a while (~1hr), then memory use spikes even further until the process is OOM-killed.

Comment 5 Josh Stone 2024-06-04 21:10:49 UTC
Created attachment 2036340
prior reference

For comparison, this is the bitcode from before rust#117206. It takes a little over a minute for "opt -O3" with about 700MB peak memory use.
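
For anyone who wants to repeat the time and peak-memory comparison, here is a minimal sketch of one way to measure a child process. It is not how the numbers above were collected; the helper name run_and_measure is hypothetical, and it assumes a Linux host, where ru_maxrss is reported in KiB.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd and return (exit_code, wall_seconds, peak_rss_kib).

    peak_rss_kib is the high-water mark over all children this process has
    waited for so far, so use a fresh interpreter for each measurement.
    """
    start = time.monotonic()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.monotonic() - start
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # On POSIX a negative exit code means the child died from a signal,
    # e.g. -9 for the SIGKILL an OOM kill delivers.
    return proc.returncode, elapsed, peak_kib


# Harmless stand-in command so the sketch runs anywhere.
code, seconds, peak_kib = run_and_measure([sys.executable, "-c", "pass"])
print(f"exit={code} wall={seconds:.2f}s peak={peak_kib / 1024:.0f} MiB")
```

To measure the attachments, pass the "opt -O3" invocation instead, for example ["opt", "-O3", "reproducer.bc", "-o", "/dev/null"], where reproducer.bc is an assumed local file name for the downloaded attachment.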

Comment 6 Nikita Popov 2024-06-06 13:12:41 UTC
Looks like a runaway inlining issue. There are probably other ways to mitigate this, but https://github.com/llvm/llvm-project/pull/94612 would certainly fix it, by removing the ridiculously high inlining thresholds on s390x. I doubt upstream will accept it, but I figured I'd at least try.

Comment 7 Josh Stone 2024-06-06 19:50:35 UTC
Ok, (approximately) reversing that multiplier with --inline-threshold=75 reduces opt to ~7s and 158MB memory.

Hacking the same IR to x86_64 runs opt in ~3s and 102MB. But going the other way with x86_64 --inline-threshold=675 doesn't blow up either, only ~5.5s and 141MB, so that's not the whole story.

There's also the SystemZTTIImpl::adjustInliningThreshold bonus 150 that gets applied for each arg memcpy, which I would guess comes up a lot with Rust. That's added *before* the multiplier too!
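
The arithmetic behind these numbers can be sketched as follows. This models only the ordering described above (bonus added first, then the multiplier) and is not LLVM's actual code. The base threshold of 225 and the s390x multiplier of 3 are inferred from the 75 and 675 values in this comment rather than quoted from LLVM, the function name is hypothetical, and a single 150 bonus per call is assumed.

```python
def effective_inline_threshold(base, multiplier=1, bonus=0):
    """Threshold after adding the per-call bonus, then applying the target multiplier."""
    return (base + bonus) * multiplier


BASE = 225             # assumption: inferred from 675 / 3 and 75 * 3
S390X_MULTIPLIER = 3   # assumption: implied by the 75 vs. 675 experiments
MEMCPY_BONUS = 150     # bonus named in the comment above

# --inline-threshold=75 on s390x lands back at the base value.
print(effective_inline_threshold(75, S390X_MULTIPLIER))                   # 225
# Default s390x behaviour, and the x86_64 --inline-threshold=675 experiment.
print(effective_inline_threshold(BASE, S390X_MULTIPLIER))                 # 675
# The bonus is scaled as well because it is added before the multiplier.
print(effective_inline_threshold(BASE, S390X_MULTIPLIER, MEMCPY_BONUS))   # 1125
# The same call site with the multiplier reduced to 2.
print(effective_inline_threshold(BASE, 2, MEMCPY_BONUS))                  # 750
```

Under these assumptions a call that earns the bonus is compared against 1125 on s390x, versus 375 with no multiplier, which may be part of why forcing 675 on x86_64 alone does not reproduce the blow-up.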

Comment 8 Josh Stone 2024-06-06 20:18:32 UTC
With your patch to remove the multiplier, opt -O3 runs in 3.4s and 96MB. The previous "good" reference speeds up a lot as well to 3.1s and also 96MB.

Reducing the multiplier to 2 is also bearable with these inputs: 6s and 126MB on the "bad" input, and 5.6s and 123MB on the "good" one.

Comment 9 Aoife Moloney 2025-02-26 13:03:05 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle.
Changing version to 42.

Comment 10 Fabio Valentini 2025-04-18 16:18:50 UTC
This should be resolved with LLVM upstream changes (can't find the link to the upstream change right now).

Comment 11 Josh Stone 2025-04-18 16:59:21 UTC
I believe it was this change, which is in LLVM 20: https://github.com/llvm/llvm-project/pull/106058