Hey,

Since ~yesterday I started noticing hanging i386 Rawhide Copr jobs in our upstream systemd CI. After closer inspection there seems to be a failed assert somewhere in glibc which the jobs get stuck on:

+ /usr/lib/rpm/find-lang.sh /builddir/build/BUILDROOT/systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 systemd
+ python3 /builddir/build/SOURCES/split-files.py /builddir/build/BUILDROOT/systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386
+ /usr/bin/find-debuginfo -j2 --strict-build-id -m -i --build-id-seed 254-1.20230905052623510728.pr29071.899.g0d239b6a0a --unique-debug-suffix -254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 --unique-debug-src-base systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-254
find-debuginfo: starting
Extracting debug info from 457 files
Fatal glibc error: malloc.c:2594 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)

A couple of affected jobs:
- https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29071/build/6372309/
- https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29074/build/6372581/
- https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29051/build/6372504/

Reproducible: Always
I can reproduce it in mock:

+ /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 254.1-5.fc40 --unique-debug-suffix -254.1-5.fc40.i386 --unique-debug-src-base systemd-254.1-5.fc40.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-stable-254.1
find-debuginfo: starting
Extracting debug info from 451 files
Fatal glibc error: malloc.c:2594 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)
Fatal signal:
This works as a reproducer without a full rebuild:

cd /builddir/build/BUILD/systemd-stable-254.1
RPM_PACKAGE_NAME=systemd RPM_BUILD_DIR=`pwd` RPM_BUILD_ROOT=/builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386 /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 254.1-5.fc40 --unique-debug-suffix -254.1-5.fc40.i386 --unique-debug-src-base systemd-254.1-5.fc40.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-stable-254.1

It likely hangs because the crash handler calls into malloc.
Reproduces with glibc-2.38.9000-5.fc40.i686 as well.
And even glibc-2.38-1.fc39.i686. Not sure if this is actually a glibc bug.
Running under bash -x produces:

+ gdb-add-index /builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386/usr/lib/systemd/tests/unit-tests/test-tpm2
Fatal glibc error: malloc.c:2589 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)
Fatal signal:
^C
/usr/bin/find-debuginfo: line 457: 221 Killed gdb-add-index "$f"

So I suspect this is a gdb issue.
It only happens in rawhide builds. The same package built in F39 is fine.

https://koji.fedoraproject.org/koji/taskinfo?taskID=105853466 → bad
https://koji.fedoraproject.org/koji/taskinfo?taskID=105853461 → no problem
Would it be possible to attach to this bug the binary on which gdb-add-index fails (/builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386/usr/lib/systemd/tests/unit-tests/test-tpm2)?
Created attachment 1988548: reproducer executable
I've spent the day chasing this down a bit...

First off, you *must* use gdb.i686 to reproduce this with the (attached) binary. Either use mock to build it, creating a rawhide/i386 env, or grab the RPMs from koji and install the necessary 32-bit dependencies on your workstation. [I did this successfully on f38.]

While playing around in my mock environment, I noticed that upstream origin/master worked. This is the commit that fixes it:

commit d06730bc0205f7c35bfccf057ef0ef83a12206d6
Author: Tom de Vries <tdevries>
Date: Sat Aug 5 17:57:13 2023 +0200

    [gdb/symtab] Find main language without symtab expansion

However, simply grabbing this patch is insufficient. AFAICT, it requires at least a dozen other patches -- gdb/dwarf2/cooked_index.[ch] have changed a LOT since gdb-13-branch.

And I still don't know why this only fails on rawhide/i386...
*** Bug 2238843 has been marked as a duplicate of this bug. ***
I believe I have a fix for this issue. I'm running the GDB regression tests and will post the patch upstream later today. Hopefully we can get the fix merged ASAP and then back-ported.

Just for the record, here's the GDB patch to fix this (yes, it's a 2 character change):

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 98bedbc5d49..0f4d99109fb 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -10548,7 +10548,7 @@ read_call_site_scope (struct die_info *die, struct dwarf2_cu *cu)
 	      std::vector<unrelocated_addr> addresses;
 	      dwarf2_ranges_read_low_addrs (ranges_offset, target_cu,
 					    target_die->tag, addresses);
-	      unrelocated_addr *saved = XOBNEWVAR (&objfile->objfile_obstack,
+	      unrelocated_addr *saved = XOBNEWVEC (&objfile->objfile_obstack,
 						   unrelocated_addr,
 						   addresses.size ());
 	      std::copy (addresses.begin (), addresses.end (), saved);
Created pull requests https://src.fedoraproject.org/rpms/gdb/pull-request/96 and https://src.fedoraproject.org/rpms/gdb/pull-request/97 to back-port the fix to rawhide and f38 respectively.
Out of curiosity, is there a reason why this patch was back-ported to Fedora 40/rawhide and Fedora 38, but not to the upcoming Fedora 39 release?
Someone missed that rawhide had moved to 40. There is a PR working its way through now.
FEDORA-2023-e55ab8d0a7 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-e55ab8d0a7
FEDORA-2023-15aed01c68 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-15aed01c68
FEDORA-2023-e55ab8d0a7 has been pushed to the Fedora 39 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-e55ab8d0a7` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-e55ab8d0a7 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-15aed01c68 has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-15aed01c68` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-15aed01c68 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
*** Bug 2238268 has been marked as a duplicate of this bug. ***
FEDORA-2023-15aed01c68 has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-e55ab8d0a7 has been pushed to the Fedora 39 stable repository. If problem still persists, please make note of it in this bug report.