
Bug 2188064

Summary: elfutils: eu-elfcompress now breaks hard links
Product: Red Hat Enterprise Linux 9
Component: elfutils
elfutils sub component: system-version
Version: 9.3
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: Regression, Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Reporter: Jan Grulich <jgrulich>
Assignee: Mark Wielaard <mjw>
QA Contact: Martin Cermak <mcermak>
CC: fweimer, mcermak, mjw, mprchlik, ohudlick, sipoyare, tpelka
Fixed In Version: elfutils-0.189-2.el9
Doc Type: No Doc Update
Type: Bug
Last Closed: 2023-11-07 08:51:58 UTC
Bug Blocks: 2190006

Description Jan Grulich 2023-04-19 16:08:11 UTC
I just rebased the Qt modules in RHEL 9.3 and rpminspect reported issues like:

> /usr/lib64/qt5/bin/xmlpatterns in qt5-qtxmlpatterns-devel-5.15.9-1.el9 on aarch64 contains debugging symbols
> Contains: .symtab

This is in build https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2469566.

It happens for files we create hard links to: debug info ends up not being extracted from the original file. It looks like hard links are not preserved in this case. When I do a build with debuginfo extraction disabled, the hard links are kept, see:

>[jgrulich@toolbox qt5-qtxmlpatterns-devel-5.15.9-1.el9.x86_64-nodebug]$ find . -type f -links +1
>./usr/lib64/qt5/bin/xmlpatternsvalidator
>./usr/lib64/qt5/bin/xmlpatterns
>./usr/bin/xmlpatterns-qt5
>./usr/bin/xmlpatternsvalidator-qt5

This was not an issue in older builds. The same issue happens with some other Qt modules, so it's not specific to this package.
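
The `find . -type f -links +1` check above lists only the names that have more than one link. As a minimal sketch, assuming GNU find (for `-printf`) and using hypothetical function names, the inode can be printed as well so that names belonging to the same hard link group can be matched up:

```shell
#!/bin/sh
# Print "inode path" for every regular file under "$1" that has more than
# one hard link, sorted so that names sharing an inode end up adjacent.
# Assumes GNU find, which provides -printf.
list_hardlinks() {
    find "$1" -type f -links +1 -printf '%i %p\n' | sort -n
}

# Count how many distinct inodes those names map to.
count_hardlink_groups() {
    list_hardlinks "$1" | cut -d' ' -f1 | sort -u | wc -l | tr -d ' '
}
```

Running `list_hardlinks .` in an unpacked package tree from both a good and a bad build should make it visible which names no longer share an inode.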

Comment 1 Jan Grulich 2023-04-19 16:12:55 UTC
Note: I'm not sure the issue is in binutils, but we discussed it with @fweimer and binutils is just one of the potential packages where the issue might be. I tried downgrading "file", "rpm" and "glibc" earlier to see whether the issue is there, but I could still reproduce it in local builds.

Comment 2 Nick Clifton 2023-04-20 12:06:22 UTC
I am clutching at straws here, but I think that the culprit might be the elfutils package, or even the redhat-rpm-macros package, rather than the binutils.  The reason for saying this is that the extraction of debug information from a binary (including its .symtab section) is handled by the find-debuginfo.sh script which is part of redhat-rpm-macros, and this script uses the eu-strip program from the elfutils package to strip binaries, rather than the strip program from the binutils package.

Looking at the build.log files for the 5.15.3 and 5.15.9 builds I see some differences in the command lines used to invoke find-debuginfo.sh:

/usr/lib/rpm/find-debuginfo.sh -j16 --strict-build-id -m -i --build-id-seed 5.15.3-1.el9 --unique-debug-suffix -5.15.3-1.el9.x86_64 --unique-debug-src-base qt5-qtxmlpatterns-5.15.3-1.el9.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/qtxmlpatterns-everywhere-src-5.15.3

/usr/lib/rpm/find-debuginfo.sh -j64 --strict-build-id -m -i --build-id-seed 5.15.9-1.el9 --unique-debug-suffix -5.15.9-1.el9.x86_64 --unique-debug-src-base qt5-qtxmlpatterns-5.15.9-1.el9.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 --remove-section .gnu.build.attributes -S debugsourcefiles.list /builddir/build/BUILD/qtxmlpatterns-everywhere-src-5.15.9

The -j option makes me wonder - could this be a race condition issue ?  If we used to run 16 debug extraction jobs in parallel and we are now running 64, could there be a greater chance for the script to attempt to extract debuginfo from the same file at the same time, via the fact that the two files are hard linked ?  ie could this be a latent bug that has always been there, but is now triggering more reliably because we are running more jobs in parallel ?

Jan - quick (ish) question: Does adding:

 %define _find_debuginfo_opts "-q1"

to the spec file result in binaries that are properly stripped ?

If not, then I will have to ponder some more.

Comment 3 Nick Clifton 2023-04-20 12:23:38 UTC
Sorry make that:

%define _find_debuginfo_opts "-j1"

ie changing the number of parallel jobs to 1.

Comment 4 Jan Grulich 2023-04-20 16:13:42 UTC
(In reply to Nick Clifton from comment #3)
> Sorry make that:
> 
> %define _find_debuginfo_opts "-j1"
> 
> ie changing the number of parallel jobs to 1.

I tried a scratch build, but it doesn't make any difference. Still same issue :-/.

Comment 5 Jan Grulich 2023-04-20 16:16:10 UTC
Actually, I can see in the build log that it still uses "/usr/lib/rpm/find-debuginfo.sh -j64 ..." so it didn't change anything.

Comment 6 Jan Grulich 2023-04-21 08:52:18 UTC
Anyway, I also tried it locally; by default it uses "-j16" and it still happens. I don't know where this is invoked from, but I changed the "find-debuginfo.sh" script to use "-j=1" and it also didn't make any difference. I also think that if it were a random concurrency issue, it would not have happened in other packages, but I can see the same issue in two other Qt modules.

Comment 8 Florian Weimer 2023-04-21 11:39:34 UTC
Installing elfutils-0.189-1.el9 into the 9.2 buildroot reproduces the issue.

Comment 9 Florian Weimer 2023-04-21 11:45:03 UTC
I looked at the upstream commit history, and I do not see yet what is causing this. 8-(

Comment 10 Mark Wielaard 2023-04-21 12:02:17 UTC
(In reply to Florian Weimer from comment #9)
> I looked at the upstream commit history, and I do not see yet what is
> causing this. 8-(

Hard links are tricky, there is some special code in find-debuginfo for it.

One issue here might be that find-debuginfo and friends were moved into their own upstream debugedit, which is packaged and shipped with rhel9. rpmbuild in fedora uses that, but not in rhel9. See https://bugzilla.redhat.com/show_bug.cgi?id=2166383

So does this happen only with 9.3? Does it happen in Fedora?

Comment 11 Florian Weimer 2023-04-21 12:03:35 UTC
<mock-chroot> sh-5.1# cp ./src/xz/.libs/xz xz-1
<mock-chroot> sh-5.1# ln xz-1 xz-2
<mock-chroot> sh-5.1# ls -li xz-1 xz-2
183413389 -rwxr-xr-x. 2 root root 349440 Apr 21 13:59 xz-1
183413389 -rwxr-xr-x. 2 root root 349440 Apr 21 13:59 xz-2
<mock-chroot> sh-5.1# eu-elfcompress -q -p -t none xz-2
<mock-chroot> sh-5.1# ls -li xz-1 xz-2
183413389 -rwxr-xr-x. 1 root root 349440 Apr 21 13:59 xz-1
183413392 -rwxr-xr-x. 1 root root 349440 Apr 21 13:59 xz-2
<mock-chroot> sh-5.1# rpm -q elfutils
elfutils-0.189-1.el9.x86_64

I suspect it's either

commit 6bb3e0b5c2124d51c604ec0cf145419c6856f5c0
Author: Martin Liska <mliska>
Date:   Mon Nov 28 14:10:36 2022 +0100

    Refactor elf_compare

or:

commit a5b07cdf9c491fb7a4a16598c482c68b718f59b9
Author: Martin Liska <mliska>
Date:   Tue Nov 29 10:59:30 2022 +0100

    support ZSTD compression algorithm

The first one should say “elfcompress”, not “elfcompare”.
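
The manual reproducer in this comment can be wrapped in a small check. This is only a sketch: the function name `check_hardlink_preserved` is hypothetical, it assumes GNU `stat -c`, and it works on copies in a temporary directory rather than on the original file. It runs an arbitrary command against one name of a hard-linked pair and reports whether both names still share an inode afterwards:

```shell
#!/bin/sh
# Usage: check_hardlink_preserved FILE COMMAND [ARGS...]
# Copies FILE, hard-links the copy, runs COMMAND [ARGS...] with the path of
# the second link appended, then compares inode numbers.
# Returns 0 if the link survived, 1 if it was broken, 2 on setup failure.
check_hardlink_preserved() {
    src=$1; shift
    work=$(mktemp -d) || return 2
    cp "$src" "$work/link-1" || { rm -rf "$work"; return 2; }
    ln "$work/link-1" "$work/link-2" || { rm -rf "$work"; return 2; }
    "$@" "$work/link-2"
    if [ "$(stat -c %i "$work/link-1")" = "$(stat -c %i "$work/link-2")" ]; then
        result=0
    else
        result=1
    fi
    rm -rf "$work"
    return "$result"
}
```

With the binary and options from the transcript, the call would be `check_hardlink_preserved ./src/xz/.libs/xz eu-elfcompress -q -p -t none`. Based on the transcript it would be expected to return 1 with elfutils-0.189-1.el9 installed.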

Comment 12 Mark Wielaard 2023-04-21 12:19:13 UTC
Sorry our comments crossed. And I now see Florian already tracked it down to an elfutils 0.188 -> 0.189 change with eu-elfcompress.

This is slightly unfortunate because the eu-elfcompress -t none invocation is kind of unnecessary and a local fedora/rhel tweak.

I don't fully understand yet how it happened though, will investigate/bisect.

Comment 14 Florian Weimer 2023-04-22 18:29:07 UTC
(In reply to Mark Wielaard from comment #13)
> Found it:
> https://patchwork.sourceware.org/project/elfutils/patch/20230421234543.1052146-1-mark/

Aha, so eu-elfcompress breaks hard links because it makes a spurious change to the file that is not really needed? Does this mean we still break hard links after the fix is in if we actually need to uncompress something?

Comment 15 Mark Wielaard 2023-04-22 19:26:43 UTC
(In reply to Florian Weimer from comment #14)
> (In reply to Mark Wielaard from comment #13)
> > Found it:
> > https://patchwork.sourceware.org/project/elfutils/patch/20230421234543.1052146-1-mark/
> 
> Aha, so eu-elfcompress breaks hard links because it makes a spurious change
> to the file that is not really needed? Does this mean we still break hard
> links after the fix is in if we actually need to uncompress something?

eu-elfcompress doesn't change the file in-place. It first writes any changes to a new file, then when done (and no errors) moves it back. So yes, if a file actually contained compressed debug ELF sections, then it would become a new (unlinked) file.
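
The effect of the write-then-rename approach described above can be shown without elfutils at all. The following is a minimal sketch, assuming GNU coreutils (`stat -c`); the helper names and the `cp`/`mv` pair are stand-ins for illustration and are not the actual eu-elfcompress code:

```shell
#!/bin/sh
# Sketch: replacing a file via rename gives that name a new inode, so any
# other hard link keeps pointing at the old, unmodified content.

# Print the hard link count of a file (GNU stat).
link_count() {
    stat -c %h "$1"
}

# Print the inode number of a file (GNU stat).
inode_of() {
    stat -c %i "$1"
}

# Rewrite "$1" the way a write-to-temp-then-rename tool would.
replace_via_rename() {
    tmp="$1.tmp.$$"
    cp -p "$1" "$tmp"   # new file, hence a new inode
    mv -f "$tmp" "$1"   # rename over the original name
}

demo_dir=$(mktemp -d)
printf 'data\n' > "$demo_dir/a"
ln "$demo_dir/a" "$demo_dir/b"
echo "before: links=$(link_count "$demo_dir/a")"
replace_via_rename "$demo_dir/b"
echo "after:  links=$(link_count "$demo_dir/a")"
rm -rf "$demo_dir"
```

The script prints `links=2` before the replacement and `links=1` after it, which is the same change in link count seen in the `ls -li` output in comment 11.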

Note that in practice this never happens. Calling eu-elfcompress before find-debuginfo seems to be a Fedora/RHEL specific thing because of a bug in the golang toolchain which did create compressed debug sections (which breaks other tools like debugedit and dwz).

Comment 16 Martin Cermak 2023-04-25 08:46:33 UTC
Verified by rebuilding qt5-qtxmlpatterns-5.15.3-1.el9.src.rpm locally with old (elfutils-0.189-1.el9) and new (elfutils-0.189-2.el9).  After the rpmbuild --rebuild, I've unpacked qt5-qtxmlpatterns-devel with rpm2cpio and checked with file:

9 x86_64 # find . -type f | grep  'usr/lib64/qt5/bin/xmlpatterns$' | xargs file
./rpmbuild_old/RPMS/x86_64/usr/lib64/qt5/bin/xmlpatterns: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=73a24c426185f0161fb611c84564a86b6253e4cb, for GNU/Linux 3.2.0, with debug_info, not stripped, too many notes (256)
./rpmbuild_new/RPMS/x86_64/usr/lib64/qt5/bin/xmlpatterns: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=6f402917827d9ee9da16ae2aebc0f13b5ff5287a, for GNU/Linux 3.2.0, stripped

Comment 22 errata-xmlrpc 2023-11-07 08:51:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (elfutils bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6609