Description of problem: Today I tried to build cabal-rpm-1.0 on Rawhide in Koji and it fails on s390x when stripping the binary executable... Version-Release number of selected component (if applicable): binutils-2.33.1-3.fc32 ghc-8.6.5 cabal-rpm-1.0.1 How reproducible: 100% Steps to Reproduce: $ fedpkg clone cabal-rpm $ cd cabal-rpm $ fedpkg build --scratch --arch s390x Actual results: https://koji.fedoraproject.org/koji/taskinfo?taskID=38711104 /usr/bin/strip:/builddir/build/BUILDROOT/cabal-rpm-1.0.1-1.fc32.s390x/usr/bin/cabal-rpm[.gnu.build.attributes]: corrupt GNU build attribute note: wrong note type: bad value Expected results: No error as for F31. Additional info: I haven't actually checked carefully that this is due to binutils, but given it doesn't happen on F31 and only on F32 s390x, it seems suspicious at least. Any ideas what the problem could be?
(In reply to Jens Petersen from comment #0) > /usr/bin/strip:/builddir/build/BUILDROOT/cabal-rpm-1.0.1-1.fc32.s390x/usr/ > bin/cabal-rpm[.gnu.build.attributes]: corrupt GNU build attribute note: > wrong note type: bad value > I haven't actually checked carefully that this is due to binutils, It is... > but given > it doesn't happen on F31 and only on F32 s390x, it seems suspicious at least. > Any ideas what the problem could be? I recently updated the note merging code in the objcopy program which is used by rpmbuild when it is creating separate debug info files. This error is coming from that code, so I am investigating... Cheers Nick
Hi Jens, Are you able to capture a copy of the usr/bin/cabal-rpm binary that is triggering this problem ? I have been trying to get an s390 machine out of beaker all day and have had no luck. Without a copy of that file I cannot reproduce the problem, and that means that I will not be able to fix the bug. Cheers Nick
Hi Nick Finally I realised not just the rpmbuild macros but Haskell Cabal also strips the executable apparently when installing. Can you try extracting from https://koji.fedoraproject.org/koji/taskinfo?taskID=38763072. /usr/bin/cabal-rpm should be an unstripped pristine built executable. Thanks!
(I also noticed on F31 that: $ file /usr/bin/cabal-rpm /usr/bin/cabal-rpm: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=77417ce856fb91ee3bd8b4644837c3b6abb2687d, stripped, too many notes (256) That's just for the record - I don't think it relates to this bug, but it made me raise my eyebrows.)
(In reply to Jens Petersen from comment #3) > Finally I realised not just the rpmbuild macros but Haskell Cabal also > strips the executable apparently when installing. Ah - I think that this might be the core of the problem. > Can you try extracting from > https://koji.fedoraproject.org/koji/taskinfo?taskID=38763072. > /usr/bin/cabal-rpm should be an unstripped pristine built executable. Thanks! When you say "unstripped" do you mean "not stripped by the strip program, but stripped by Haskell" ? I ask because as you note, the file program says that it is stripped. > $ file /usr/bin/cabal-rpm > /usr/bin/cabal-rpm: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, > for GNU/Linux 3.2.0, BuildID[sha1]=77417ce856fb91ee3bd8b4644837c3b6abb2687d, stripped, too many notes (256) (The complaint about there being too many notes is actually a bug in the file program. It is not expecting to find a binary with so many notes in it, but these days this happens quite a lot). If you examine that binary with readelf however, a problem emerges: % readelf --wide --notes usr/bin/cabal-rpm > /dev/null [...] readelf: usr/bin/cabal-rpm: Warning: note with invalid namesz and/or descsz found at offset 0xb6cf0 readelf: usr/bin/cabal-rpm: Warning: type: 0x0, namesize: 0x00000000, descsize: 0x801997e8, alignment: 4 This is why the objcopy run during the %install phase is failing - the notes in the cabal-rpm binary are corrupt *before* objcopy runs. My guess as to the cause is this strip step that you mention Haskell is running. But that is just a guess at the moment. I need to investigate some more. Another question then - which object files are linked together to make the cabal-rpm binary ? Judging by the build log it is all the .o files in the dist/build/cabal-rpm/cabal-rpm-tmp directory and its sub-directories. (I am going to have to check each of them to see if one or more contain broken notes). Oh - and one other thing. 
Are any parts of cabal-rpm not compiled with gcc ? For example are there any assembler pieces, or parts that are extracted from a static library ? Thanks for the help in debugging this problem. :-) Cheers Nick
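For anyone following along who wants to see what kind of check sits behind the readelf warning quoted above, here is a minimal Python sketch. It walks the records of a raw note section (12-byte header of namesz, descsz and type, followed by the padded name and descriptor) and reports a record whose sizes run past the end of the section. This is only an illustration under stated assumptions, not readelf's actual logic: the name `check_notes`, the 4-byte alignment default and the sample bytes are made up for the example, and pulling the section out of the ELF file is left out.

```python
import struct


def _pad(n, align):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def check_notes(data, big_endian=False, align=4):
    """Walk the note records in a raw note section.

    Returns a list of (offset, namesz, descsz) tuples, one for the first
    record that does not fit inside the section. An empty list means every
    record fitted. (Hypothetical helper, not part of binutils.)
    """
    fmt = ">III" if big_endian else "<III"
    problems = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 12:
            # Not even room for a note header.
            problems.append((offset, None, None))
            break
        namesz, descsz, _note_type = struct.unpack_from(fmt, data, offset)
        body = _pad(namesz, align) + _pad(descsz, align)
        if offset + 12 + body > len(data):
            # The sizes claim more bytes than the section holds.
            problems.append((offset, namesz, descsz))
            break
        offset += 12 + body
    return problems


if __name__ == "__main__":
    # One well-formed big-endian note, then a header using the sizes from
    # the warning quoted in this comment (namesize 0, descsize 0x801997e8).
    good = struct.pack(">III", 4, 4, 1) + b"GNU\0" + bytes(4)
    bad = struct.pack(">III", 0, 0x801997E8, 0)
    print(check_notes(good + bad, big_endian=True))
```

Note that a section consisting only of zero bytes passes this particular size check, so it is not a complete validity test.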
Hi Jens, How is cabal-rpm linked ? The build log just says "Linking dist/build/cabal-rpm/cabal-rpm ..." but it does not give any clues as to the exact command(s) run. (It is a real shame that this bug is s390x specific. If it happened on any other architecture I could run a build natively and capture the commands myself). I have looked at the individual object files and they appear to be fine - no corrupt notes. So my guess is that either the link process itself is flawed, or else the code is being linked with a library which contains corrupt notes. Cheers Nick
Hi Jens, Well this Haskell build system is a bit of a pain. At least to me, since I am not familiar with it. I built an x86_64 targeted version of cabal-rpm and I was able to capture some of the linker command line. (It is still not the actual real linker command line, but better than the single sentence in the build log). It appears that there is a package database involved with all kinds of binary objects being extracted from it. So my current theory is that one or more of the objects inside the package database contain corrupt annobin notes, but only in the s390x version. I have no idea how to go about rebuilding the database so instead I am going to make a change to the binutils' objcopy program, so that it will no longer generate a failure exit code when it encounters corrupt notes. (The notes will not be merged, so effectively the objcopy will be a no-op). This will allow your build of cabal-rpm to succeed. It is not an ideal solution, but without a way to track down where/how the notes are being corrupted it is the best that I can do for now. Plus it will allow you to continue with your work. I will update this BZ again once the new binutils are ready. Cheers Nick
Hi Jens, Right, binutils-2.33.1-4.fc32 has the fix. It is built for rawhide, although it may take a day or so to make it into the buildroot. Cheers Nick
Okay, sorry I was out last night and missed your comments... With binutils-2.33.1-4.fc32 it built fine on s390x again, thanks: https://koji.fedoraproject.org/koji/buildinfo?buildID=1406764
(In reply to Nick Clifton from comment #5) > (In reply to Jens Petersen from comment #3) > > > Finally I realised not just the rpmbuild macros but Haskell Cabal also > > strips the executable apparently when installing. > > Ah - I think that this might be the core of the problem. Well Cabal is just using /usr/bin/strip to do that. (But yeah I should probably disable that and leave it to rpmbuild.) > > Can you try extracting from > > https://koji.fedoraproject.org/koji/taskinfo?taskID=38763072. > > /usr/bin/cabal-rpm should be an unstripped pristine built executable. Thanks! > > When you say "unstripped" do you mean "not stripped by the strip program, > but stripped by Haskell" ? In that s390x scratch build I copy the built executable to /usr/bin avoiding Cabal doing any installation. So it should be completely unstripped - I didn't verify this. If that is not the case I can try harder.. > I ask because as you note, the file program says that it is stripped. No, no that side note was for the x86_64 binary just built normally, ie stripped by Cabal and rpmbuild. Sorry for the confusion. > > $ file /usr/bin/cabal-rpm > > /usr/bin/cabal-rpm: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, > > for GNU/Linux 3.2.0, BuildID[sha1]=77417ce856fb91ee3bd8b4644837c3b6abb2687d, stripped, too many notes (256) > > (The complaint about there being too many notes is actually a bug in the > file program. It is not expecting to find a binary with so many notes in > it, but these days this happens quite a lot). Okay > If you examine that binary with readelf however, a problem emerges: > > % readelf --wide --notes usr/bin/cabal-rpm > /dev/null > [...] 
> readelf: usr/bin/cabal-rpm: Warning: note with invalid namesz and/or > descsz found at offset 0xb6cf0 > readelf: usr/bin/cabal-rpm: Warning: type: 0x0, namesize: 0x00000000, > descsize: 0x801997e8, alignment: 4 > > This is why the objcopy run during the %install phase is failing - the notes > in the cabal-rpm binary are corrupt *before* objcopy runs. Hmm that is worrying > Another question then - which object files are linked together to make the > cabal-rpm binary ? Judging by the build log it is all the .o files in the > dist/build/cabal-rpm/cabal-rpm-tmp directory and its sub-directories. I think that is correct - along with some static Haskell libraries. > Oh - and one other thing. Are any parts of cabal-rpm not compiled with gcc > ? For example are there any assembler pieces, or parts that are extracted > from a static library ? On s390x I believe only gcc is used - no assembly. On Intel and Power archs ghc has a Native Code Generator: gcc is only used for linking. (On ARM llvm is used to generate objects.) https://gitlab.haskell.org/ghc/ghc/wikis/platforms
Hi Nick, (In reply to Nick Clifton from comment #6) > How is cabal-rpm linked ? > > The build log just says "Linking dist/build/cabal-rpm/cabal-rpm ..." but > it does not give any clues as to the exact command(s) run. I can get you the verbose output if it helps. [..] > So my guess is that either the link process itself is flawed, > or else the code is being linked with a library which contains corrupt notes. Yes that is my fear. How to track that down? I guess some script could be run over ghc-*-devel.s390x to check them?
(In reply to Nick Clifton from comment #7) > I built an x86_64 targeted version of cabal-rpm and I was able to capture > some of the linker command line. (It is still not the actual real linker > command line, but better than the single sentence in the build log). It > appears that there is a package database involved with all kinds of binary > objects being extracted from it. Well the package database just relates to Haskell libraries. There is a cli through the ghc-pkg command. > So my current theory is that one or more of the objects inside the package > database contain corrupt annobin notes, but only in the s390x version. I > have no idea how to go about rebuilding the database so instead I am going > to make a change to the binutils' objcopy program, so that it will no longer > generate a failure exit code when it encounters corrupt notes. (The notes > will not be merged, so effectively the objcopy will be a no-op). This will > allow your build of cabal-rpm to succeed. It is not an ideal solution, but > without a way to track down where/how the notes are being corrupted it is > the best that I can do for now. Plus it will allow you to continue with > your work. Thanks What is the impact of this change, then? Will it affect building of Haskell static libraries?
(In reply to Jens Petersen from comment #12) Hi Jens, > What is the impact of this change, then? > Will it affect building of Haskell static libraries? No. It will mean that you can build everything without worrying about the note-merging step that was recently added to rpmbuild. The note-merging will fail in the case of cabal-rpm for s390x, but this will not actually break anything. It will just mean that the cabal-rpm binary will be larger than it could be because it contains lots of redundant notes. (The notes do not affect performance - they are not loaded at run time - they only affect on-disk image size). At some point in the future the problem may go away - eg if a mass rebuild replaces the corrupt notes with good ones - or we may come across another way to reproduce the problem which will lead to a proper bug fix. But in the meantime you can continue your work without worrying about annobin, notes, objcopy or rpmbuild ... :-) There is one caveat to the above statement however: I have only applied my "fix" to objcopy to Fedora (rawhide and 31). I have not applied it to F30 or to RHEL. So if cabal-rpm is going to be built for either of those releases then I will need to fix objcopy there as well. Do you know if this is likely to be needed ? Cheers Nick PS: >> or else the code is being linked with a library which contains corrupt notes. > Yes that is my fear. > How to track that down? I guess some script could be run over ghc-*-devel.s390x to check them? Yes that should be possible. If you have access to where the rpms are stored then a command like this ought to work: find . -name "*.s390x.rpm" -fprintf /dev/stderr "%f\n" -exec readelf --notes {} \; > /dev/null If readelf reports an error about a malformed note then you have located the affected rpm(s).
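The scan suggested above could also be scripted along the following lines. This is a rough Python sketch, not a tested tool. It assumes the packages have already been unpacked into a directory (for example with rpm2cpio and cpio), so that readelf is pointed at the object files and static libraries themselves. The function names are invented for the example, and treating any stderr output from `readelf --wide --notes` as a reason to look closer is a deliberately crude heuristic.

```python
import os
import subprocess

ELF_MAGIC = b"\x7fELF"
AR_MAGIC = b"!<arch>\n"  # static libraries (.a); readelf can walk their members


def looks_like_object(head):
    """True if the first bytes of a file look like an ELF file or an ar archive."""
    return head.startswith(ELF_MAGIC) or head.startswith(AR_MAGIC)


def object_files(root):
    """Yield paths under root whose first bytes pass looks_like_object()."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                with open(path, "rb") as f:
                    head = f.read(8)
            except OSError:
                continue
            if looks_like_object(head):
                yield path


def readelf_stderr(path):
    """Run readelf on one file and return whatever it printed on stderr."""
    result = subprocess.run(
        ["readelf", "--wide", "--notes", path],
        stdout=subprocess.DEVNULL,  # the notes themselves are not needed
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",
    )
    return result.stderr


def suspects(paths, run=readelf_stderr):
    """Return (path, first stderr line) for every file that produced a diagnostic."""
    found = []
    for path in paths:
        err = run(path).strip()
        if err:
            found.append((path, err.splitlines()[0]))
    return found
```

A call such as `suspects(object_files("unpacked"))`, where `unpacked` is whatever directory the rpm contents were extracted into, would then list the files worth a closer look. The `run` parameter exists only so the reporting logic can be exercised without readelf installed.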
Hi Jens, Update - it turns out that this was not an s390x specific bug, it affects all architectures. Plus I managed to break building everything. :-( What was happening was that if strip is run on a file containing annobin data and the merge algorithm decided that it was not going to do anything (because say there were relocations that needed to be resolved first, or the merge would not save any space), then it would just drop the note section from the list of sections to be copied into the output. Which meant that the output note section was then filled with zeroes. Which is bad. This was not noticed at first because almost all of the time the merge algorithm does do something. But there is one special case - the crti.o and crtn.o files from glibc. These are linked into almost every compiled executable, but they are object files, so their notes cannot be merged. So once a new version of glibc had been installed into the build root, the process of installing crti.o and crtn.o would have corrupted their notes. Then when another program is linked, eg cabal-rpm, it is linked with the corrupted crti.o and crtn.o and the results are ... bad. So I have fixed the bug in strip and updated the binutils in Fedora (Rawhide, F31 and F30). But now we need to rebuild glibc in these environments. Actually we only need to update Rawhide as the broken versions of binutils have not yet been moved into the build roots of F31 and F30. Cheers Nick
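The failure mode described in this comment can be pictured with a toy model in Python. To be clear, this is not binutils code and the real data structures are quite different: the output is modelled as a dictionary of pre-sized, zero-filled sections, `merge` stands in for the note merging step and returns None when it decides not to merge, and the "fixed" branch (copy the notes through unchanged) is an assumption about what the repaired behaviour amounts to, since the actual patch is not shown in this report.

```python
NOTE_SECTION = ".gnu.build.attributes"


def strip_copy(sections, merge, buggy):
    """Toy model of copying sections from an input file to an output file.

    sections: dict mapping section name to its bytes.
    merge:    callable taking the note bytes and returning merged bytes,
              or None when it decides not to merge.
    buggy:    if True, mimic the behaviour described above, where a skipped
              merge also removes the note section from the copy list.
    """
    # The output is laid out first: every section gets its space, zero filled.
    out = {name: bytes(len(data)) for name, data in sections.items()}
    for name, data in sections.items():
        if name != NOTE_SECTION:
            out[name] = data
            continue
        merged = merge(data)
        if merged is None:
            if buggy:
                # Section dropped from the copy list: the zeroes stay.
                continue
            # Assumed fix: fall back to copying the notes unchanged.
            merged = data
        out[name] = merged
    return out


if __name__ == "__main__":
    # An object file such as crti.o: its notes cannot be merged.
    def cannot_merge(_notes):
        return None

    obj = {".text": b"\x07\xfe\x07\x07", NOTE_SECTION: b"\x00\x00\x00\x08note"}
    print(strip_copy(obj, cannot_merge, buggy=True)[NOTE_SECTION])
    print(strip_copy(obj, cannot_merge, buggy=False)[NOTE_SECTION])
```

The model also shows why the problem stayed hidden at first: whenever `merge` returns something, both branches behave the same.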
This problem is still affecting me on f31 on x86_64 and is not limited to building Haskell binaries. For example, $ echo "int main() { return 0; }" > test.c $ gcc -o test test.c $ strip test strip:test[.gnu.build.attributes]: corrupt GNU build attribute note: wrong note type: bad value
(In reply to Truls Asheim from comment #15) > This problem is still affecting me on f31 on x86_64 and is not limited to > building Haskell binaries. For example, > > $ echo "int main() { return 0; }" > test.c > $ gcc -o test test.c > $ strip test > strip:test[.gnu.build.attributes]: corrupt GNU build attribute note: wrong > note type: bad value Reproduced with glibc-2.30-7.fc31.x86_64 and binutils-2.32-26.fc31.x86_64.
Hi Jens, Hi Truls, This is a known problem with the -26.fc31 binutils release. Please try using a later release, eg -29.fc31. Cheers Nick
Thanks, Nick binutils-2.32-29.fc31.x86_64 looks good to me. Will you push the builds to Bodhi?
However with binutils-gold-2.32-29.fc31.x86_64 I can't link Haskell executables now: eg $ cabal unpack yesod-core-1.6.14 $ cabal build : Building library for yesod-core-1.6.14.. /usr/bin/ld.gold: fatal error: /lib64/libm.so.6: ELF section name out of range collect2: error: ld returned 1 exit status `gcc' failed in phase `Linker'. (Exit code: 1) Maybe it is better I open a separate report for that?
This appears to be breaking octave package builds as well: + /usr/lib/rpm/brp-strip-shared /usr/bin/strip /usr/bin/strip:/builddir/build/BUILDROOT/octave-netcdf-1.0.12-7.fc31.x86_64/usr/lib64/octave/packages/netcdf-1.0.12/x86_64-redhat-linux-gnu-api-v53/__netcdf__.oct[.gnu.build.attributes]: corrupt GNU build attribute note: wrong note type: bad value other *possibly* affected packages: https://koschei.fedoraproject.org/affected-by/binutils?epoch1=0&version1=2.32&release1=24.fc31&epoch2=0&version2=2.32&release2=26.fc31&collection=f31
(In reply to Jens Petersen from comment #19) Hi Jens, > However with binutils-gold-2.32-29.fc31.x86_64 > I can't link Haskell executables now: > > eg > $ cabal unpack yesod-core-1.6.14 > $ cabal build > : > Building library for yesod-core-1.6.14.. > /usr/bin/ld.gold: fatal error: /lib64/libm.so.6: ELF section name out of > range > collect2: error: ld returned 1 exit status > `gcc' failed in phase `Linker'. (Exit code: 1) > > Maybe it is better I open a separate report for that? Yes please. I think that this must be a separate problem. Did the link work with earlier versions of the 2.32 series binutils ? eg 2.32-21 ? Releases -22 and -23 contain fixes for GOLD bugs, so maybe these are causing the problem. Also, is yesod-core-1.6.14 easy to obtain ? I will need to run tests locally...
FEDORA-2019-8aa040b253 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-8aa040b253
(In reply to Jens Petersen from comment #18) Hi Jens, > binutils-2.32-29.fc31.x86_64 looks good to me. > Will you push the builds to Bodhi? I thought that I already had, but I must have been dreaming. Bodhi update now requested.
(In reply to Orion Poplawski from comment #20) Hi Orion, > This appears to be breaking octave package builds as well: > > + /usr/lib/rpm/brp-strip-shared /usr/bin/strip > /usr/bin/strip:/builddir/build/BUILDROOT/octave-netcdf-1.0.12-7.fc31.x86_64/ > usr/lib64/octave/packages/netcdf-1.0.12/x86_64-redhat-linux-gnu-api-v53/ > __netcdf__.oct[.gnu.build.attributes]: corrupt GNU build attribute note: > wrong note type: bad value Yup. It is the same problem and it is the same version of the binutils package (-26) that is causing the problem. > other *possibly* affected packages: > https://koschei.fedoraproject.org/affected-by/binutils?epoch1=0&version1=2. > 32&release1=24.fc31&epoch2=0&version2=2.32&release2=26.fc31&collection=f31 Possibly. I am hoping that the bodhi update will kick in soon and fix this. Cheers Nick
I get this error in Fedora 31 (with all updates installed) while trying to run `make install` for ffmpeg-4.2.1: STRIP ffmpeg strip:ffmpeg_g[.gnu.build.attributes]: corrupt GNU build attribute note: wrong note type: bad value make: *** [Makefile:104: ffmpeg] Error 1 INSTALL libavdevice/libavdevice.so STRIP install-libavdevice-shared strip:/usr/local/lib64/libavdevice.so.58.8.100[.gnu.build.attributes]: corrupt GNU build attribute note: wrong note type: bad value make: *** [ffbuild/library.mak:104: install-libavdevice-shared] Error 1 Is it the same error? Should I file a new bug report?
Well, you left out the critical piece of information: what version of binutils do you have?
(In reply to Orion Poplawski from comment #26) # rpm -q binutils binutils-2.32-26.fc31.x86_64
Right, that's the broken one. The update that is meant to fix this is -29. See comment 22.
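Since the same question (which binutils release is installed?) keeps coming up, here is a small Python sketch that classifies the output of `rpm -q binutils` using only what has been said in this report for Fedora 31: release -26 is the broken one and -29 carries the fix. Everything else is reported as unknown rather than guessed. The function name is made up for the example, and treating releases later than -29 as fixed is an assumption.

```python
import re

# Releases of binutils 2.32 for Fedora 31 named in this report.
KNOWN_BROKEN_RELEASE = 26
FIRST_FIXED_RELEASE = 29

_NVR = re.compile(r"binutils-2\.32-(\d+)\.fc31(?:\.\w+)?")


def classify_f31_binutils(nvr):
    """Classify a string such as 'binutils-2.32-26.fc31.x86_64'.

    Returns 'broken', 'fixed' or 'unknown'. Only Fedora 31 builds of
    binutils 2.32 are recognised; anything else is 'unknown'.
    """
    match = _NVR.fullmatch(nvr.strip())
    if match is None:
        return "unknown"
    release = int(match.group(1))
    if release >= FIRST_FIXED_RELEASE:
        # Assumption: later releases keep the fix.
        return "fixed"
    if release == KNOWN_BROKEN_RELEASE:
        return "broken"
    # The report does not say anything definite about other releases.
    return "unknown"


if __name__ == "__main__":
    for nvr in ("binutils-2.32-26.fc31.x86_64", "binutils-2.32-29.fc31.x86_64"):
        print(nvr, classify_f31_binutils(nvr))
```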
Strangely the fixed package is not even in testing yet. Thanks for the info! Will be waiting for it eagerly.
binutils-2.31.1-36.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-9a102a5fa8
binutils-2.32-29.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-8aa040b253
(In reply to Nick Clifton from comment #21) > (In reply to Jens Petersen from comment #19) > > However with binutils-gold-2.32-29.fc31.x86_64 > > I can't link Haskell executables now: > > > > eg > > $ cabal unpack yesod-core-1.6.14 > > $ cabal build > > : > > Building library for yesod-core-1.6.14.. > > /usr/bin/ld.gold: fatal error: /lib64/libm.so.6: ELF section name out of > > range > > collect2: error: ld returned 1 exit status > > `gcc' failed in phase `Linker'. (Exit code: 1) After reinstalling -29.fc31 I can't reproduce anymore. Maybe my toolbox was in a bad state or something, shrug. > Also, is yesod-core-1.6.14 easy to obtain ? I will need to run tests > locally... (That would be implicit via a "step 0": `dnf install cabal-rpm; cabal-rpm builddep yesod-core` or longer `cabal update`. I just picked the version in Fedora ghc-yesod-core. :)
binutils-2.32-29.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.
binutils-2.31.1-36.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.
I'm seeing something extremely similar under Centos 8 while rebuilding the Xymon package from https://repo.terabithia.org/rpms/xymon/el8/SRPMS/ binutils.x86_64 2.30-73.el8 There doesn't appear to be a more recent binutils for Centos 8. /usr/bin/strip: osdefs.o[.gnu.build.attributes.hot]: Warning: version note missing - assuming version 3 /usr/bin/strip: osdefs.o[.gnu.build.attributes.startup]: Warning: version note missing - assuming version 3 /usr/bin/strip: osdefs.o[.gnu.build.attributes.exit]: Warning: version note missing - assuming version 3 /usr/bin/strip:/root/rpmbuild/BUILDROOT/xymon-4.3.30-1.el8.x86_64/usr/lib64/stx0dujn/osdefs.o[.gnu.build.attributes.hot]: error: failed to copy merged notes into output: Bad value