On s390x, rust-proptest-derive now fails to build from source because it runs out of memory.
    Compiling proptest-derive v0.4.0 (/builddir/build/BUILD/proptest-derive-0.4.0)
    Running `/usr/bin/rustc --crate-name proptest_derive --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debuginfo=2 -C metadata=d3eabb4646f54b6e -C extra-filename=-d3eabb4646f54b6e --out-dir /builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps -L dependency=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps --extern proc_macro2=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libproc_macro2-399dd7be279807fc.rlib --extern quote=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libquote-e3c4a9152680cd84.rlib --extern syn=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libsyn-7bc94960062f99ef.rlib --extern proc_macro -Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn`
    error: could not compile `proptest-derive` (lib)
    Caused by:
      process didn't exit successfully: `/usr/bin/rustc --crate-name proptest_derive --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C opt-level=3 -C embed-bitcode=no -C codegen-units=1 -C debuginfo=2 -C metadata=d3eabb4646f54b6e -C extra-filename=-d3eabb4646f54b6e --out-dir /builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps -L dependency=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps --extern proc_macro2=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libproc_macro2-399dd7be279807fc.rlib --extern quote=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libquote-e3c4a9152680cd84.rlib --extern syn=/builddir/build/BUILD/proptest-derive-0.4.0/target/rpm/deps/libsyn-7bc94960062f99ef.rlib --extern proc_macro -Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Clink-arg=-specs=/usr/lib/rpm/redhat/redhat-package-notes --cap-lints=warn` (signal: 9, SIGKILL: kill)
    error: Bad exit status from /var/tmp/rpm-tmp.4ly1Hg (%build)
This is probably due to a bug, as the memory usage seems excessive compared to other platforms. I observed the crate building in a fedora-rawhide-s390x mock chroot, under qemu-user-static emulation, and saw the memory usage gradually but steadily grow to 30 GB resident and beyond. (The build is still running.) Note that a typical s390x builder has ~25GB of memory. I can reproduce this problem on F41/Rawhide, F40, and F39, but not on EPEL9.
Reproducible: Always
This package built fine on s390x up until the F40 mass rebuild, so it must have been a change that landed later than that. The fact that this builds fine on EPEL9 makes me think it's caused by a change in Rust 1.76+.
I can reproduce this on F40 with rust-1.78.0, while 1.77.0 compiles with just a little over 2GB memory used at peak. The "good" news is that I can also reproduce this with upstream (rustup) toolchains between 1.77 and 1.78, so it's not just Fedora.
cargo-bisect-rustc narrowed it down to https://github.com/rust-lang/rust/pull/117206, but that doesn't necessarily mean that the MIR JumpThreading pass is buggy, especially since this only appears on s390x. AFAICT, the memory growth is all in the LLVM backend. I'll try to capture the IR and see if it's reproducible directly in LLVM tools.
Created attachment 2036338: reproducer. This LLVM bitcode behaves poorly with "opt -O3". On my 64GB system, it climbs to about 20GB of memory and chews CPU for a while (~1hr), then spikes to even higher memory use until it's OOM-killed.
Created attachment 2036340: prior reference. For comparison, this is the bitcode from before rust#117206. "opt -O3" takes a little over a minute on it, with about 700MB peak memory use.
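For anyone who wants to repeat the time and peak-memory comparison on the two attachments, a minimal sketch of a measurement helper follows. It is not the method used above, just one way to get comparable numbers on Linux. The function name measure and the file name reproducer.bc are hypothetical.

```python
import resource
import subprocess
import time


def measure(cmd):
    """Run cmd to completion and return (wall_seconds, peak_rss_kib)."""
    start = time.monotonic()
    # Discard stdout so a tool that prints its result does not flood the terminal.
    subprocess.run(cmd, check=False, stdout=subprocess.DEVNULL)
    elapsed = time.monotonic() - start
    # On Linux, ru_maxrss is reported in KiB. RUSAGE_CHILDREN gives the largest
    # peak among all children waited for so far, so measure each input from a
    # fresh Python process to keep the numbers independent.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_kib
```

Assuming the attachment was saved as reproducer.bc, a call such as measure(["opt", "-O3", "reproducer.bc", "-o", "/dev/null"]) would report the numbers for the bad input. Given the figures above, it is worth running this on a machine with plenty of RAM or under a memory limit.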
Looks like a runaway inlining issue. There are probably other ways to mitigate this, but https://github.com/llvm/llvm-project/pull/94612 would certainly fix it by removing the ridiculously high inlining thresholds on s390x. I doubt upstream will accept it, but I figured I'd at least try.
Ok, (approximately) reversing that multiplier with --inline-threshold=75 reduces opt to ~7s and 158MB memory. Hacking the same IR to x86_64 runs opt in ~3s and 102MB. But going the other way with x86_64 --inline-threshold=675 doesn't blow up either, only ~5.5s and 141MB, so that's not the whole story. There's also the SystemZTTIImpl::adjustInliningThreshold bonus of 150 that gets applied for each arg memcpy, which I would guess comes up a lot with Rust. That's added *before* the multiplier too!
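To make the arithmetic in the comment above concrete, here is a rough model of how the bonus and the multiplier compound. The base threshold of 225 and the multiplier of 3 are assumptions inferred from the 75 and 675 figures (75 * 3 = 225, 225 * 3 = 675); the real cost model in LLVM has more inputs than this, so treat it as an illustration only.

```python
DEFAULT_THRESHOLD = 225  # assumed: inferred from 75 * 3 and 675 / 3
S390X_MULTIPLIER = 3     # assumed: inferred from the same pair of numbers
MEMCPY_ARG_BONUS = 150   # bonus per memcpy arg, as described in the comment


def effective_threshold(base, multiplier, memcpy_args=0, bonus=MEMCPY_ARG_BONUS):
    """Simplified model: bonuses are added first, then the sum is multiplied."""
    return (base + bonus * memcpy_args) * multiplier


# Effective s390x threshold for call sites with 0, 1 and 2 memcpy args.
for n in range(3):
    print(n, effective_threshold(DEFAULT_THRESHOLD, S390X_MULTIPLIER, n))
# prints:
# 0 675
# 1 1125
# 2 1575
```

Under these assumptions, a call site with even one qualifying argument would get a much higher threshold than the 675 tried on x86_64, which would fit the observation that the multiplier alone is not the whole story.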
With your patch to remove the multiplier, opt -O3 runs in 3.4s and 96MB. The previous "good" reference speeds up a lot as well to 3.1s and also 96MB. Reducing the multiplier to 2 is also bearable with these inputs, 6s 126MB on the "bad" input and 5.6s 123MB on the "good".
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle. Changing version to 42.
This should be resolved with LLVM upstream changes (can't find the link to the upstream change right now).
I believe it was this change, which is in LLVM 20: https://github.com/llvm/llvm-project/pull/106058