Bug 2237392 - Failed assertion on i386 when extracting debuginfo
Summary: Failed assertion on i386 when extracting debuginfo
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gdb
Version: rawhide
Hardware: i386
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kevin Buettner
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 2238268 2238843
Depends On:
Blocks:
 
Reported: 2023-09-05 09:00 UTC by Frantisek Sumsal
Modified: 2023-09-20 00:19 UTC (History)

Fixed In Version: gdb-13.2-4.fc38 gdb-13.2-8.fc39
Clone Of:
Environment:
Last Closed: 2023-09-18 18:07:32 UTC
Type: ---
Embargoed:


Attachments
reproducer executable (37.76 KB, application/x-xz)
2023-09-13 01:10 UTC, Keith Seitz

Description Frantisek Sumsal 2023-09-05 09:00:11 UTC
Hey,

Since roughly yesterday, I have been seeing i386 Rawhide Copr jobs hang in our upstream systemd CI. On closer inspection, an assertion fails somewhere in glibc and the jobs get stuck on it:

+ /usr/lib/rpm/find-lang.sh /builddir/build/BUILDROOT/systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 systemd
+ python3 /builddir/build/SOURCES/split-files.py /builddir/build/BUILDROOT/systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386
+ /usr/bin/find-debuginfo -j2 --strict-build-id -m -i --build-id-seed 254-1.20230905052623510728.pr29071.899.g0d239b6a0a --unique-debug-suffix -254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 --unique-debug-src-base systemd-254-1.20230905052623510728.pr29071.899.g0d239b6a0a.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-254
find-debuginfo: starting
Extracting debug info from 457 files
Fatal glibc error: malloc.c:2594 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)
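
For readers unfamiliar with the message, the asserted condition can be read as a predicate on the allocator's top chunk: it accepts either an untouched initial top chunk of size 0, or a top chunk whose size is at least MINSIZE, whose prev_inuse bit is set, and whose end address is page-aligned. The sketch below restates that condition in standalone C++. The struct, the function name, and the sample values are hypothetical, and MINSIZE and the page size are passed in as parameters because their real values depend on the platform. It is not glibc code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of the values the quoted assertion inspects.
struct top_chunk_state
{
  bool is_initial_top;     // old_top == initial_top (av)
  std::size_t old_size;    // size recorded for the top chunk
  bool prev_inuse;         // prev_inuse (old_top)
  std::uintptr_t old_end;  // address just past the end of the top chunk
};

// Restates the condition from the assertion message.  min_size and
// page_size are parameters because their real values are platform-specific.
constexpr bool
top_chunk_passes (const top_chunk_state &s, std::size_t min_size,
                  std::uintptr_t page_size)
{
  return (s.is_initial_top && s.old_size == 0)
         || (s.old_size >= min_size && s.prev_inuse
             && (s.old_end & (page_size - 1)) == 0);
}

// A never-used heap passes via the first alternative.
static_assert (top_chunk_passes (top_chunk_state{true, 0, false, 0},
                                 16, 4096),
               "initial top chunk is accepted");

// A top chunk that ends on a page boundary passes via the second.
static_assert (top_chunk_passes (top_chunk_state{false, 0x21000, true,
                                                 0x0806d000},
                                 16, 4096),
               "page-aligned top chunk is accepted");

// Made-up values whose end is not page-aligned fail the check.
static_assert (!top_chunk_passes (top_chunk_state{false, 0x20ff1, true,
                                                  0x0804c123},
                                  16, 4096),
               "misaligned top chunk is rejected");
```

One way such a check can start failing is a write past the end of a neighbouring allocation that clobbers the top chunk's size field, which makes the computed end address look misaligned.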

A couple of affected jobs:
 - https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29071/build/6372309/
 - https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29074/build/6372581/
 - https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-29051/build/6372504/ 

Reproducible: Always

Comment 1 Florian Weimer 2023-09-05 09:15:11 UTC
I can reproduce it in mock:

“
+ /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 254.1-5.fc40 --unique-debug-suffix -254.1-5.fc40.i386 --unique-debug-src-base systemd-254.1-5.fc40.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-stable-254.1
find-debuginfo: starting
Extracting debug info from 451 files
Fatal glibc error: malloc.c:2594 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)


Fatal signal: 
”

Comment 2 Florian Weimer 2023-09-05 09:17:37 UTC
This works as a reproducer without a full rebuild:

cd /builddir/build/BUILD/systemd-stable-254.1
RPM_PACKAGE_NAME=systemd RPM_BUILD_DIR=`pwd` RPM_BUILD_ROOT=/builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386 /usr/bin/find-debuginfo -j4 --strict-build-id -m -i --build-id-seed 254.1-5.fc40 --unique-debug-suffix -254.1-5.fc40.i386 --unique-debug-src-base systemd-254.1-5.fc40.i386 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 50000000 -S debugsourcefiles.list /builddir/build/BUILD/systemd-stable-254.1

It likely hangs because the crash handler calls into malloc.

Comment 3 Florian Weimer 2023-09-05 09:26:16 UTC
Reproduces with glibc-2.38.9000-5.fc40.i686 as well.

Comment 4 Florian Weimer 2023-09-05 09:31:48 UTC
It even reproduces with glibc-2.38-1.fc39.i686, so I'm not sure this is actually a glibc bug.

Comment 5 Florian Weimer 2023-09-05 09:33:14 UTC
Running under bash -x produces:

“
+ gdb-add-index /builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386/usr/lib/systemd/tests/unit-tests/test-tpm2
Fatal glibc error: malloc.c:2589 (sysmalloc): assertion failed: (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)


Fatal signal: ^C
/usr/bin/find-debuginfo: line 457:   221 Killed                  gdb-add-index "$f"
”

So I suspect this is a gdb issue.

Comment 6 Zbigniew Jędrzejewski-Szmek 2023-09-07 19:05:33 UTC
It only happens in rawhide builds. The same package built in F39 is fine.
https://koji.fedoraproject.org/koji/taskinfo?taskID=105853466 → bad
https://koji.fedoraproject.org/koji/taskinfo?taskID=105853461 → no problem

Comment 7 Andrew Burgess 2023-09-12 15:47:42 UTC
Would it be possible to attach to this bug the binary for which gdb-add-index fails, /builddir/build/BUILDROOT/systemd-254.1-5.fc40.i386/usr/lib/systemd/tests/unit-tests/test-tpm2?

Comment 8 Keith Seitz 2023-09-13 01:10:50 UTC
Created attachment 1988548
reproducer executable

Comment 9 Keith Seitz 2023-09-13 01:17:06 UTC
I've spent the day chasing this down a bit...

First off, you *must* use gdb.i686 to reproduce this with the (attached)
binary. Either use mock to build it, creating a rawhide/i386 env,
or grab the RPMs from koji and install the necessary 32-bit dependencies
on your workstation.  [I did this successfully on f38.]

While playing around in my mock environment, I noticed that upstream
origin/master worked. This is the commit that fixes it:

commit d06730bc0205f7c35bfccf057ef0ef83a12206d6
Author: Tom de Vries <tdevries>
Date:   Sat Aug 5 17:57:13 2023 +0200

    [gdb/symtab] Find main language without symtab expansion

However, simply grabbing this patch is insufficient. AFAICT, it requires
at least a dozen other patches -- gdb/dwarf2/cooked_index.[ch] have
changed a LOT since gdb-13-branch.

And I still don't know why this only fails on rawhide/i386...

Comment 10 Carlos O'Donell 2023-09-14 10:55:03 UTC
*** Bug 2238843 has been marked as a duplicate of this bug. ***

Comment 11 Andrew Burgess 2023-09-14 12:24:34 UTC
I believe I have a fix for this issue.  I'm running the GDB regression tests and will post the patch upstream later today.  Hopefully we can get the fix merged ASAP and then back-ported.  For the record, here's the GDB patch that fixes it (yes, it's a two-character change):

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 98bedbc5d49..0f4d99109fb 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -10548,7 +10548,7 @@ read_call_site_scope (struct die_info *die, struct dwarf2_cu *cu)
 	  std::vector<unrelocated_addr> addresses;
 	  dwarf2_ranges_read_low_addrs (ranges_offset, target_cu,
 					target_die->tag, addresses);
-	  unrelocated_addr *saved = XOBNEWVAR (&objfile->objfile_obstack,
+	  unrelocated_addr *saved = XOBNEWVEC (&objfile->objfile_obstack,
 					       unrelocated_addr,
 					       addresses.size ());
 	  std::copy (addresses.begin (), addresses.end (), saved);
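
The two macros differ only in how the last argument is interpreted. As defined in libiberty.h, XOBNEWVAR (O, T, S) allocates S bytes, while XOBNEWVEC (O, T, N) allocates sizeof (T) * N bytes. With XOBNEWVAR the old code therefore reserved only addresses.size () bytes, and the std::copy that follows would write past the end of that allocation. A minimal sketch of the size arithmetic is below. fake_addr and the two helper functions are hypothetical stand-ins rather than GDB code, and the 8-byte element width is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for GDB's unrelocated_addr (assumed 8 bytes wide).
enum class fake_addr : std::uint64_t {};

// Bytes requested by XOBNEWVAR (obstack, T, size): the last argument is
// already a byte count, so the element type does not affect the size.
constexpr std::size_t
bytes_for_xobnewvar (std::size_t size_in_bytes)
{
  return size_in_bytes;
}

// Bytes requested by XOBNEWVEC (obstack, T, count): the last argument is
// an element count and is scaled by sizeof (T).
template <typename T>
constexpr std::size_t
bytes_for_xobnewvec (std::size_t count)
{
  return sizeof (T) * count;
}

// For three addresses the old call reserves 3 bytes, the fixed call 24.
static_assert (bytes_for_xobnewvar (3) == 3, "byte count is passed through");
static_assert (bytes_for_xobnewvec<fake_addr> (3) == 24, "count is scaled");
```

Under these assumptions, copying three 8-byte addresses into the 3-byte buffer overruns it by 21 bytes, the kind of heap corruption that can later trip malloc's internal consistency checks.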

Comment 12 Andrew Burgess 2023-09-14 21:31:24 UTC
Created pull requests https://src.fedoraproject.org/rpms/gdb/pull-request/96 and https://src.fedoraproject.org/rpms/gdb/pull-request/97 to back-port the fix to rawhide and f38 respectively.

Comment 13 Fabio Valentini 2023-09-15 13:49:35 UTC
Out of curiosity, is there a reason why this patch was back-ported to Fedora 40/rawhide and Fedora 38, but not to the upcoming Fedora 39 release?

Comment 14 Keith Seitz 2023-09-15 15:10:23 UTC
Someone missed that rawhide had moved to 40. There is a PR working its way through now.

Comment 15 Fedora Update System 2023-09-15 20:44:03 UTC
FEDORA-2023-e55ab8d0a7 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-e55ab8d0a7

Comment 16 Fedora Update System 2023-09-15 20:44:54 UTC
FEDORA-2023-15aed01c68 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-15aed01c68

Comment 17 Fedora Update System 2023-09-16 01:48:42 UTC
FEDORA-2023-e55ab8d0a7 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-e55ab8d0a7`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-e55ab8d0a7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2023-09-16 03:17:57 UTC
FEDORA-2023-15aed01c68 has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-15aed01c68`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-15aed01c68

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 19 Mark Wielaard 2023-09-18 13:42:50 UTC
*** Bug 2238268 has been marked as a duplicate of this bug. ***

Comment 20 Fedora Update System 2023-09-18 18:07:32 UTC
FEDORA-2023-15aed01c68 has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 21 Fedora Update System 2023-09-20 00:19:44 UTC
FEDORA-2023-e55ab8d0a7 has been pushed to the Fedora 39 stable repository.
If the problem still persists, please make a note of it in this bug report.

