Not sure it's gdb but AFAIK we fail to extract debuginfo from libxul.so which leads to build failure on Fedora 39/40. Fedora 37/38 is okay.

See https://kojipkgs.fedoraproject.org//work/tasks/959/106910959/build.log

debugedit: Warning, not replacing comp_dir '/builddir/build/BUILD/firefox-118.0.1/media/ffvpx/libavutil/x86/' prefix ('/builddir/build/BUILD/firefox-118.0.1' -> '/usr/src/debug/firefox-118.0.1-2.fc39.x86_64') encoded as DW_FORM_string. Replacement too large.
/usr/bin/gdb-add-index: line 159: 60906 Killed $GDB --batch -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex "file $file" -ex "save gdb-index $dwarf5 $dir"
gdb-add-index: gdb error generating index for /builddir/build/BUILDROOT/firefox-118.0.1-2.fc39.x86_64/usr/lib64/firefox/libxul.so
DWARF-compressing 17 files
dwz: ./usr/lib64/firefox/libmozavutil.so-118.0.1-2.fc39.x86_64.debug: Unknown DWARF DW_OP_0 referenced from DIE at [3e2eb]
dwz: ./usr/lib64/firefox/libxul.so-118.0.1-2.fc39.x86_64.debug: Too many DIEs, not optimizing
dwz: ./usr/lib64/firefox/libmozavutil.so-118.0.1-2.fc39.x86_64.debug: Unknown DWARF DW_OP_0 referenced from DIE at [3e2eb]
sepdebugcrcfix: Updated 16 CRC32s, 1 CRC32s did match.
Creating .debug symlinks for symlinks to ELF files
Copying sources found by 'debugedit -l' to /usr/src/debug/firefox-118.0.1-2.fc39.x86_64

Reproducible: Always
This bug looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1773651. I have a fix (of sorts) which is now in upstream GDB, but I had not backported it to Fedora. For this bug though, it looks like the fix is urgently needed for Fedora 39 and 40. Regarding the fix: Due to the design of the .gdb-index section format, it's not always possible to make an index section for large shared libraries. At the moment, GDB will crash due to an internal error caused by a failed assert. My fix causes GDB to not crash, but to instead throw an error. In both cases (i.e. both with and without the fix), the .gdb-index section is NOT created, but my fix causes GDB to exit gracefully. It will also not leave behind the large temporary file, which appears to be the actual cause for the failed firefox build. I will do a backport of the upstream commit for Fedora 39 and 40.
Right now I see the failure on Fedora 38 too: https://kojipkgs.fedoraproject.org//work/tasks/7943/106977943/build.log
After some more investigation, I can report the following:

1) I'm not able to reproduce the problem with my own (local) mock build. I did:

fedpkg co firefox
cd firefox
fedpkg mockbuild

2) The build.log files from the Description and Comment 2 differ from what I saw when investigating Bug 1773651. When I did builds for Bug 1773651, I would see a gdb internal error along with a backtrace. I don't see that for this bug. Modulo the PID, both logs contain the line:

/usr/bin/gdb-add-index: line 159: 60906 Killed $GDB --batch -nx -iex 'set auto-load no' -iex 'set debuginfod enabled off' -ex "file $file" -ex "save gdb-index $dwarf5 $dir"

I don't see any indication about why the process in question was killed.

Do we know the amount of RAM allocated to the build machines? (I allocated 48 GiB to my build machine.)

(I'm still going to do a backport of the fix for Bug 1773651, but am no longer convinced that it'll help.)
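The "Killed" text in the log line above is what a shell prints when a child process is terminated by SIGKILL. That is consistent with the kernel's OOM killer, but does not prove it. As a minimal sketch, assuming a hypothetical wrapper script around the gdb invocation, a child's return code could be turned into a clearer message like this. The name `describe_exit` is illustrative and is not part of gdb-add-index.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Explain a return code as reported by subprocess.run()."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # subprocess reports death by signal N as returncode == -N.
        name = signal.Signals(-returncode).name
        hint = ""
        if -returncode == signal.SIGKILL:
            # SIGKILL can come from the OOM killer, but also from any
            # other sender; the kernel log is needed to confirm.
            hint = " (possibly the kernel OOM killer; check the kernel log)"
        return f"killed by {name}{hint}"
    return f"failed with exit status {returncode}"
```

It could be called as `describe_exit(subprocess.run(cmd).returncode)`, where `cmd` is whatever command the wrapper runs.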
I've backported upstream GDB commit 98f6baad7c3 for rawhide. Here's a link to the koji build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=107006337

Bodhi link:

https://bodhi.fedoraproject.org/updates/FEDORA-2023-3ba3a5202c

Let me know if it works for rawhide - if it does, I'll do a backport for Fedora 38 and Fedora 39.
(In reply to Kevin Buettner from comment #4)
> I've backported upstream GDB commit 98f6baad7c3 for rawhide. Here's a link
> to the koji build:
>
> https://koji.fedoraproject.org/koji/taskinfo?taskID=107006337
>
> Bodhi link:
>
> https://bodhi.fedoraproject.org/updates/FEDORA-2023-3ba3a5202c
>
> Let me know if it works for rawhide - if it does, I'll do a backport for
> Fedora 38 and Fedora 39.

Seems to be ok, Firefox build for f40 is done:
https://koji.fedoraproject.org/koji/buildinfo?buildID=2299639
Thanks!
(In reply to Kevin Buettner from comment #3)
> I don't see any indication about why the process in question was killed.
>
> Do we know the amount of RAM allocated to the build machines? (I allocated
> 48 GiB to my build machine.)

That's a good point. If the backport doesn't help (though it looks like it's working), I'll try to fiddle with memreqs for koji builds. We've seen linker failures due to OOM recently, so it may be related.
Thanks.
(In reply to Martin Stransky from comment #6)
> (In reply to Kevin Buettner from comment #3)
> > I don't see any indication about why the process in question was killed.
> >
> > Do we know the amount of RAM allocated to the build machines? (I allocated
> > 48 GiB to my build machine.)
>
> That's a good point. If the backport doesn't help (but it looks working)
> I'll try to fiddle with memreqs for koji builds.
> We've seen linker failures due OOM recently so it may be related.
> Thanks.

I've attempted a number of firefox builds in the past day. Some were successful, some weren't. Each time, I varied the amount of RAM allocated to the VM doing the build. For this testing, I disabled swapping via "swapoff -a". 32 cores were allocated to the VM. Here's what I found:

8 GiB: failed
10 GiB: failed
12 GiB: failed - died while attempting to link libxul.so
16 GiB: failed - "error: could not compile `gkrust` (lib)"
24 GiB: failed - gdb-add-index was killed while attempting to generate index for libxul.so
32 GiB: success
48 GiB: success

Given these results, I don't think the backport helped with this problem. I think it's simply a resource (RAM) problem with the build machines.
After talking it over with nirik, given the urgency of the CVE fix in 118.0.1, we've gone ahead and merged Kalev Lember's suggestion to try and limit the number of CPU cores used by find-debuginfo to one per 32G of memory: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/7CX676QRC2QVZATX34WTCF2GL26AAWCL/ and I'm running new attempts at F38 and F39 builds now.
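The idea of capping parallelism by available memory can be sketched as follows. This is only an illustration under stated assumptions: one job per 32 GiB of RAM, never fewer than one job, and never more jobs than CPUs. The function names are hypothetical and this is not the actual change that was merged into the Firefox package.

```python
import os


def debuginfo_jobs(mem_gib: float, cpus: int, gib_per_job: int = 32) -> int:
    """Cap parallel jobs at one per `gib_per_job` GiB, never below one."""
    by_memory = int(mem_gib // gib_per_job)
    return max(1, min(cpus, by_memory))


def total_memory_gib() -> float:
    """Total physical RAM in GiB, from sysconf values available on Linux."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
```

On a build host this might be used as `debuginfo_jobs(total_memory_gib(), os.cpu_count() or 1)`. A cap of this kind lowers peak memory use from concurrent jobs, but it cannot help when a single process on its own needs more RAM than the machine has.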
Unfortunately, even with that change I've had a couple of failed builds on x86_64. The most recent was a case of "gdb-add-index was killed while attempting to generate index for libxul.so", on buildvm-x86-11.iad2.fedoraproject.org.
Is this still a problem? From reading the comments it sounds like this might be a machine resource issue, but as nothing has been posted in the last year, does this mean the issue has been resolved or worked around? Is there still any GDB work needed? Or should we close this bug? A new bug can always be opened in the future if the issue reappears.
Fedora Linux 39 entered end-of-life (EOL) status on 2024-11-26. Fedora Linux 39 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.