Latest qt5-qtwebkit build has sizes:

769M qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64.rpm
1.7M qt5-qtwebkit-debuginfo-5.212.0-0.26.alpha2.fc29.x86_64.rpm
link: https://koji.fedoraproject.org/koji/taskinfo?taskID=28571109

Compared to previous build:

14M qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64.rpm
753M qt5-qtwebkit-debuginfo-5.212.0-0.25.alpha2.fc29.x86_64.rpm
link: https://koji.fedoraproject.org/koji/buildinfo?buildID=1123617

find-debuginfo.sh in the latest build failed to process lib /usr/lib64/libQt5WebKit.so.5.212.0 for some reason.
Sounds funny. Except it isn't. Mark, might this be some problem of calling eu-uncompress in find-debuginfo?
Some scratch builds of both releases (links from lupinix), to verify whether it's the qt5-qtwebkit changes or something in the build environment that contributed to the change in output:

https://koji.fedoraproject.org/koji/taskinfo?taskID=28602086
https://koji.fedoraproject.org/koji/taskinfo?taskID=28602067
(In reply to Igor Gnatenko from comment #1)
> Sounds funny. Except it isn't.
>
> Mark, might this be some problem of calling eu-uncompress in find-debuginfo?

No, I don't think so. This build looks like it was done before find-debuginfo.sh started calling eu-elfcompress. At least the build.log doesn't show any calls.
The scratch builds are both failing the same way, so something else in the buildroot environment besides qt5-qtwebkit is (likely?) contributing to the issue. Re-assigning to rpm (owner of rpm-build and find-debuginfo.sh) for now (and cc'ing kdudka, 'file' maintainer).
The difference between the build.logs is that the old one has:

+ /usr/lib/rpm/find-debuginfo.sh -j6 --strict-build-id -m -i --build-id-seed 5.212.0-0.25.alpha2.fc29 --unique-debug-suffix -5.212.0-0.25.alpha2.fc29.x86_64 --unique-debug-src-base qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/qtwebkit-5.212.0-alpha2
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/libQt5WebKit.so.5.212.0
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/libQt5WebKitWidgets.so.5.212.0
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebPluginProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/qml/QtWebKit/experimental/libqmlwebkitexperimentalplugin.so
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/qml/QtWebKit/libqmlwebkitplugin.so
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebDatabaseProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.25.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebNetworkProcess
dwz: ./usr/lib64/libQt5WebKit.so.5.212.0-5.212.0-0.25.alpha2.fc29.x86_64.debug: Too many DIEs, not optimizing
/usr/lib/rpm/sepdebugcrcfix: Updated 7 CRC32s, 1 CRC32s did match.
While the new one has:

+ /usr/lib/rpm/find-debuginfo.sh -j6 --strict-build-id -m -i --build-id-seed 5.212.0-0.26.alpha2.fc29 --unique-debug-suffix -5.212.0-0.26.alpha2.fc29.x86_64 --unique-debug-src-base qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64 --run-dwz --dwz-low-mem-die-limit 10000000 --dwz-max-die-limit 110000000 -S debugsourcefiles.list /builddir/build/BUILD/qtwebkit-5.212.0-alpha2
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/qml/QtWebKit/experimental/libqmlwebkitexperimentalplugin.so
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebNetworkProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebDatabaseProcess
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/qml/QtWebKit/libqmlwebkitplugin.so
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/libQt5WebKitWidgets.so.5.212.0
extracting debug info from /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/qt5/libexec/QtWebPluginProcess
/usr/lib/rpm/sepdebugcrcfix: Updated 7 CRC32s, 0 CRC32s did match.

So, for some reason I don't know, the new one doesn't find libQt5WebKit.so. So it doesn't even try to extract the debuginfo.
%undefine _annotated_build apparently fixes things, so bouncing this over to annobin next
Well, there's some...contra-indications there too. I happened to come across a very similar case in llvm6.0 recently (it also broke a compose, in fact). In llvm6.0-6.0.1-3.fc29, stripping of libLLVM-6.0.so was missed. The packager noticed this and sent a straight rebuild (no other change) as llvm6.0-6.0.1-4.fc29; in that build, stripping of libLLVM-6.0.so worked.

Note, in both this and the qt5-qtwebkit cases, when this went wrong, it went wrong on *all arches* in the affected build; in builds where it worked OK, it worked OK on *all arches* in the affected build.

So, we can compare two bad builds - llvm6.0-6.0.1-3.fc29 and qt5-qtwebkit-5.212.0-0.26.alpha2.fc29 - with two good builds - qt5-qtwebkit-5.212.0-0.25.alpha2.fc29 and llvm6.0-6.0.1-4.fc29, and we get this:

BUILD                                  DATE        GOOD/BAD  ANNOBIN      RPM
qt5-qtwebkit-5.212.0-0.25.alpha2.fc29  2018-07-15  GOOD      8.10-1.fc29  4.14.2-0.rc1.1.fc29.1
llvm6.0-6.0.1-3.fc29                   2018-07-19  BAD       8.14-1.fc29  4.14.2-0.rc1.1.fc29.2
llvm6.0-6.0.1-4.fc29                   2018-07-23  GOOD      8.18-1.fc29  4.14.2-0.rc1.1.fc29.2
qt5-qtwebkit-5.212.0-0.26.alpha2.fc29  2018-07-24  BAD       8.18-1.fc29  4.14.2-0.rc1.1.fc29.2

Notably, we had both a good and a bad build done with *the same* annobin and rpm versions...
Looking at find-debuginfo.sh, we should at least see the "extracting debug info from <file>" message so long as we get into `do_file()` for the file, and this test fails (i.e. does not hit 'return' on the second line):

  get_debugfn "$f"
  [ -f "${debugfn}" ] && return

so, we wind up looking at this chunk:

==
# 16^6 - 1 or about 16 million files
FILENUM_DIGITS=6
run_job()
{
  local jobid=$1 filenum
  local SOURCEFILE=$temp/debugsources.$jobid ELFBINSFILE=$temp/elfbins.$jobid

  >"$SOURCEFILE"
  >"$ELFBINSFILE"
  # can't use read -n <n>, because it reads bytes one by one, allowing for
  # races
  while :; do
    filenum=$(dd bs=$(( FILENUM_DIGITS + 1 )) count=1 status=none)
    if test -z "$filenum"; then
      break
    fi
    do_file $(sed -n "$(( 0x$filenum )) p" "$temp/primary")
  done
  echo 0 >"$temp/res.$jobid"
}

n_files=$(wc -l <"$temp/primary")
if [ $n_jobs -gt $n_files ]; then
  n_jobs=$n_files
fi
if [ $n_jobs -le 1 ]; then
  while read nlinks inum f; do
    do_file "$nlinks" "$inum" "$f"
  done <"$temp/primary"
else
  for ((i = 1; i <= n_files; i++)); do
    printf "%0${FILENUM_DIGITS}x\\n" $i
  done | (
    exec 3<&0
    for ((i = 0; i < n_jobs; i++)); do
      # The shell redirects stdin to /dev/null for background jobs. Work
      # around this by duplicating fd 0
      run_job $i <&3 &
    done
    wait
  )
  for f in "$temp"/res.*; do
    res=$(< "$f")
    if [ "$res" != "0" ]; then
      exit 1
    fi
  done
  cat "$temp"/debugsources.* >"$SOURCEFILE"
  cat "$temp"/elfbins.* >"$ELFBINSFILE"
fi
==

the *lower* chunk there, depending on the `if [ $n_jobs -le 1 ]; then`, either calls `do_file` directly, or calls `run_job` (the upper chunk), which *then* calls `do_file` after doing some...other stuff. This is all pretty hard to read for me at least and I can't pretend to know what it's all doing, but it seems like it's breaking down in here *somewhere*. Note that this chunk is using "$temp/primary" as a source, which seems to be built by this chunk earlier:

==
# Build a list of unstripped ELF files and their hardlinks
touch "$temp/primary"
find "$RPM_BUILD_ROOT" ! -path "${debugdir}/*.debug" -type f \
     \( -perm -0100 -or -perm -0010 -or -perm -0001 \) \
     -print |
file -N -f - | sed -n -e 's/^\(.*\):[ ]*.*ELF.*, not stripped.*/\1/p' |
xargs --no-run-if-empty stat -c '%h %D_%i %n' |
while read nlinks inum f; do
  if [ $nlinks -gt 1 ]; then
    var=seen_$inum
    if test -n "${!var}"; then
      echo "$inum $f" >>"$temp/linked"
      continue
    else
      read "$var" < <(echo 1)
    fi
  fi
  echo "$nlinks $inum $f" >>"$temp/primary"
done
==

so I think it's somewhere in these bits that stuff's breaking down. It might be useful, I guess, to run a build with find-debuginfo.sh patched to output a bit more debugging info in these chunks.
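For readers who find the parallel path hard to follow, here is a rough Python model of what the quoted shell appears to do: the parent writes one fixed-width, 1-based hex line number per entry in "primary", and each worker reads exactly one record (FILENUM_DIGITS + 1 bytes) at a time from the shared stream, then looks up that line. The names `make_queue` and `next_file` are invented for this sketch and do not exist in find-debuginfo.sh; the sample "primary" lines are made up.

```python
import io
from typing import List, Optional

FILENUM_DIGITS = 6  # same record width as the quoted script


def make_queue(n_files: int) -> io.BytesIO:
    """Model of the printf loop: one zero-padded hex line number per file."""
    records = "".join(
        "{:0{width}x}\n".format(i, width=FILENUM_DIGITS)
        for i in range(1, n_files + 1)
    )
    return io.BytesIO(records.encode("ascii"))


def next_file(queue: io.BytesIO, primary: List[str]) -> Optional[str]:
    """Model of one run_job iteration: read a whole record, map it to a line."""
    record = queue.read(FILENUM_DIGITS + 1)  # digits plus the newline
    if not record:
        return None  # queue drained, the worker loop ends
    line_number = int(record.decode("ascii").strip(), 16)
    return primary[line_number - 1]  # sed -n "<N> p" is 1-based
```

Either way, sequential or parallel, only lines present in "primary" are ever handed to `do_file`, so a file that is filtered out while "primary" is built produces no log output at all.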
It doesn't really explain why find-debuginfo.sh doesn't even seem to find the shared library (might it somehow have lost its executable bit?), but looking at libQt5WebKit.so.5.212.0 I see it has an impressive number of sections. There are 84562 .gnu.build.attributes NOTE sections. That means the ELF file needs to use some extensions to represent the number of sections (the ELF header fields e_shnum and e_shstrndx can only contain up to 65280 entries). I wouldn't be surprised if some tools get confused by this (or haven't been tested with so many sections).
We don't know whether it finds it or not, really, because it's not very expressive. From the messages we get, we can't be *definitively* sure what's going on, because the script just doesn't log much: I think we can only conclude what I concluded in my comment, that the file doesn't make it into the list of files to be processed at all *or* that we get lost somewhere between that list and `do_file()` *or* do_file runs but decides to return because it thinks this `debugfn` file already exists. It's only *after* that point that we can be sure we would've seen some output. The last chunk I quoted is the part which "finds" files to process, but it doesn't log anything and its output file is temporary and not logged or stored anywhere after the build concludes. If we want to be sure whether it 'found' the library at all we need to patch the script to log or dump the contents of this "primary" file somewhere, once it's constructed it.
Made several new scratch builds of qt5-qtwebkit. With disabled annobin binary size is always fine, with enabled annobin always huge as libQt5WebKit.so.5.212.0 has not been stripped. All of these with annobin-8.19-1.fc29.

For now I disabled annobin in qt5-qtwebkit and rebuilt: https://koji.fedoraproject.org/koji/buildinfo?buildID=1130736

This is required as otherwise the size of all KDE/Plasma based spins increases by ~750 MB, which causes compose failures because the filesystem runs out of space. I'm wondering whether other packages are affected too.
(In reply to Christian Dersch from comment #11)
> Made several new scratch builds of qt5-qtwebkit. With disabled annobin
> binary size is always fine, with enabled annobin always huge as
> libQt5WebKit.so.5.212.0 has not been stripped. All of these with
> annobin-8.19-1.fc29.

Can I ask, if disk space is such an issue, why do you not strip the file ? I found that the stripped file went from 3.1Gb to 82Mb. Even if you only strip out the debug information (into a separate file maybe ?) the size was reduced to 102Mb. And this was with annobin enabled.
That is the question ;) When annobin is enabled, it does not get stripped for *some* reason, not by intention.
(In reply to Nick Clifton from comment #12)
> Can I ask, if disk space is such an issue, why do you not strip the file ?
> I found that the stripped file went from 3.1Gb to 82Mb. Even if you only
> strip out the debug information (into a separate file maybe ?) the size
> was reduced to 102Mb. And this was with annobin enabled.

I think there are at least 2 issues at play.

1. For some reason the file doesn't get stripped. As Adam pointed out in comment #8 and comment #10 we need more logging in find-debuginfo.sh to figure out if it really isn't being processed or it is something else.

2. As comment #10 says, if you do strip the file manually with eu-strip, you'll notice that it gets slightly upset about the 80000+ sections in the file, and the stripped and debug files have corrupted section string names (the e_shstrndx and section zero sh_link fields aren't set up correctly).

I am fixing issue 2. But it might be that we also need to solve issue 1 (which we don't fully understand yet).
FYI - I have made a change to the binutils so that the multiple sections generated by annobin will be combined into just one section in the final executable. This only works with the bfd based linker however (ld.bfd) and not gold. I am investigating how a similar fix can be applied to gold.
Some interaction between annobin and 'file' on how it identifies files.

Note find-debuginfo.sh only runs on objects that match this snippet:

...
file -N -f - | sed -n -e 's/^\(.*\):[ ]*.*ELF.*, not stripped.*/\1/p' |
...

I added some debugging in a scratch build. On a run where annobin is enabled:

+ file -N /builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.27.alpha2.fc29.x86_64/usr/lib64/libQt5WebKit.so.5.212.0
/builddir/build/BUILDROOT/qt5-qtwebkit-5.212.0-0.27.alpha2.fc29.x86_64/usr/lib64/libQt5WebKit.so.5.212.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),

So for some reason, when annobin is enabled, it makes 'file' think libQt5WebKit is no longer 'not stripped'
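To make the effect of that filter concrete, here is a small Python sketch of the same regular expression that the sed command applies to each line of `file` output. The function name `unstripped_elf_path` is invented for this illustration; the sample lines in the usage note are shortened, illustrative stand-ins for real `file` output.

```python
import re
from typing import Optional

# Python spelling of the sed expression quoted above:
#   s/^\(.*\):[ ]*.*ELF.*, not stripped.*/\1/p
NOT_STRIPPED = re.compile(r"^(.*):[ ]*.*ELF.*, not stripped.*")


def unstripped_elf_path(file_output_line: str) -> Optional[str]:
    """Return the path if the line would survive the filter, else None."""
    match = NOT_STRIPPED.match(file_output_line)
    return match.group(1) if match else None
```

A line ending in ", not stripped" yields its path, while a line ending in ", stripped" (or one that is cut off before any such suffix) yields None. In the script that means the file never reaches "primary", and nothing downstream ever sees it.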
(In reply to Rex Dieter from comment #16)
> Some interaction between annobin and 'file' on how it identifies files.
>
> Note find-debuginfo.sh only runs on objects that satisfy per this snippet:
> ...
> file -N -f - | sed -n -e 's/^\(.*\):[ ]*.*ELF.*, not stripped.*/\1/p' |
> ...
> So for some reason, when annobin is enabled, it makes 'file' think
> libQt5WebKit is no longer 'not stripped'

That is a nice find. Indeed file thinks libQt5WebKit.so.5.212.0 is already stripped. I think this is because it is reading the ELF file and just using e_shnum, which is 0 because this file has too many sections (it should fetch the section zero sh_size field as the extension mechanism), and so might not find the SHT_SYMTAB at all.
yep, just read the file sources. It has:

	shnum = elf_getu16(swap, elfhdr.e_shnum);
	if (shnum > ms->elf_shnum_max)
		return toomany(ms, "section headers", shnum);

	if (doshn(ms, clazz, swap, fd,
	    (off_t)elf_getu(swap, elfhdr.e_shoff), shnum,
	    (size_t)elf_getu16(swap, elfhdr.e_shentsize),
	    fsize, elf_getu16(swap, elfhdr.e_machine),
	    (int)elf_getu16(swap, elfhdr.e_shstrndx),
	    &flags, &notecount) == -1)
		return -1;

Note that it just reads elfhdr.e_shnum and elfhdr.e_shstrndx directly. But if either number is 0xff00 (65280) or bigger then it should use the extension mechanism (reading section zero sh_size and sh_link) to get the real number of sections or the index of the section that holds the section names. Also it seems to have a default elf_shnum_max of 32768.
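As a rough illustration of that extension mechanism, the following Python sketch reads the two counts the way the ELF gABI describes: when e_shnum is 0 the real section count is taken from sh_size of section header zero, and when e_shstrndx is SHN_XINDEX (0xffff) the real string-table index is taken from sh_link of section header zero. This is not the code in file(1) or elfutils; `real_section_counts` is a name made up here, and the sketch assumes a little-endian ELF64 image only.

```python
import struct
from typing import Tuple

# Little-endian ELF64 layouts; both structures are 64 bytes long.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF file header
SHDR = struct.Struct("<IIQQQQIIQQ")        # one section header
SHN_XINDEX = 0xFFFF


def real_section_counts(image: bytes) -> Tuple[int, int]:
    """Return (section count, section-name string table index)."""
    fields = EHDR.unpack_from(image, 0)
    ident, e_shoff = fields[0], fields[6]
    e_shnum, e_shstrndx = fields[12], fields[13]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")

    shnum, shstrndx = e_shnum, e_shstrndx
    if e_shoff != 0 and (e_shnum == 0 or e_shstrndx == SHN_XINDEX):
        section_zero = SHDR.unpack_from(image, e_shoff)
        if e_shnum == 0:
            shnum = section_zero[5]     # sh_size holds the real count
        if e_shstrndx == SHN_XINDEX:
            shstrndx = section_zero[6]  # sh_link holds the real index
    return shnum, shstrndx
```

A reader that skips this step sees a section count of 0 for a file like the one discussed here, which would be consistent with it never finding a symbol table.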
So, should we move this to file ? Or is annobin also doing something wrong here?
(In reply to Adam Williamson from comment #19)
> So, should we move this to file ? Or is annobin also doing something wrong
> here?

I think it is now multiple bugs across multiple components :)

annobin is technically not doing anything wrong, but it does trigger limits some of the other tools cannot handle. To work around this the linker can help reduce the number of NOTES/sections by combining what annobin produces. I believe nickc already has a fix for binutils to do this and is working on one for gold.

Then there is file, which keeps giving us trouble when used as an "ELF file identifier". file could be fixed, but I think rpm should stop relying on file and just have its own small specific ELF file id utility. I could write that and update the rpm scripts to use that instead of trying to regex match file output.

Then there is elfutils eu-strip not dealing correctly with such a large number of sections/having the shstrtab section number > 65280. I am working on a fix for that.
Thanks! I'll try and split all those out separately, then. Just a note for the "small specific ELF file id utility" - try and write it with as few dependencies as possible, ideally stuff that's only in the build root already, as it'll need to be in the buildroot itself.
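Purely to illustrate how small such an identifier could be, here is a sketch that answers "does this ELF image still have a symbol table?" using only the Python standard library. It is not the utility proposed for rpm, which would presumably be written against whatever is already in the buildroot; the name `has_symtab` is invented, treating "has an SHT_SYMTAB section" as "not stripped" is an assumption based on the discussion above, and only little-endian ELF64 is handled.

```python
import struct

# Little-endian ELF64 layouts; both structures are 64 bytes long.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF file header
SHDR = struct.Struct("<IIQQQQIIQQ")        # one section header
SHT_SYMTAB = 2


def has_symtab(image: bytes) -> bool:
    """True if any section header has type SHT_SYMTAB."""
    if len(image) < EHDR.size or image[:4] != b"\x7fELF":
        return False  # not an ELF file at all
    if image[4] != 2 or image[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")

    fields = EHDR.unpack_from(image, 0)
    e_shoff, e_shentsize, shnum = fields[6], fields[11], fields[12]
    if e_shoff == 0:
        return False  # no section header table
    if shnum == 0:
        # Too many sections for e_shnum: the real count is in sh_size
        # of section header zero.
        shnum = SHDR.unpack_from(image, e_shoff)[5]

    for index in range(shnum):
        sh_type = SHDR.unpack_from(image, e_shoff + index * e_shentsize)[1]
        if sh_type == SHT_SYMTAB:
            return True
    return False
```

A truncated image makes `unpack_from` raise `struct.error`; a real tool would want to turn that into a clear diagnostic rather than silently classifying the file.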
Yes, using the file(1) utility to query details about ELF binaries is fragile. This functionality was badly broken in the last two upstream releases of file: bug #1570246 bug #1581343 bug #1608373 comment #2
So I've changed this to be the bug for 'file', and filed https://bugzilla.redhat.com/show_bug.cgi?id=1609013 for the 'write an ELF file id tool' suggestion. Do we also need separate bugs for: 1) making annobin reduce the number of sections (it sounds like this is already underway anyway) 2) The elfutils eu-strip thing (or is that the same as https://bugzilla.redhat.com/show_bug.cgi?id=1609013 , or another bug)? thanks!
Where can I get the file in question? Does it say that the binary is stripped? If not, then file's output is technically correct I think. There has never been any guarantee that file(1) will list all properties of all possible ELF files.
You can get a copy of it from one of the problematic builds, I think: https://kojipkgs.fedoraproject.org//packages/qt5-qtwebkit/5.212.0/0.26.alpha2.fc29/x86_64/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64.rpm , file /usr/lib64/libQt5WebKit.so.5.212.0 . Note that's *huge*, like 3.2GB. You could get the i686 one if you want one somewhat smaller: https://kojipkgs.fedoraproject.org//packages/qt5-qtwebkit/5.212.0/0.26.alpha2.fc29/i686/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.i686.rpm , file /usr/lib/libQt5WebKit.so.5.212.0 .

Whether it's "technically correct" or not, per Mark's comment, there are apparently clear bugs in the relevant code that result in the output not being as expected. To paraphrase that comment: it just reads elfhdr.e_shnum and elfhdr.e_shstrndx directly, but if either number is 0xff00 (65280) or bigger then it should use the extension mechanism (reading section zero sh_size and sh_link) to get the real number of sections or the index of the section that holds the section names.
Sounds like a task for elfutils. file(1) is a utility that determines file type. You need to use suitable tools to get expected results. In any case, patches are welcome.
If you read up, you'll see that we've already talked about that and filed a bug for it. But if 'file' is doing something and doing it wrong, that's clearly also a bug in file.
FYI - both the bfd linker (ld.bfd) and the gold linker (ld.gold) are now fixed so that they will combine these thousands of .gnu.build.attribute.* sections into just one section in the output file. This is in binutils-2.31.1-5.fc29, which is building now and should be finished soon. (I would wait for the build to complete, but it is late and I am very tired).
(In reply to Adam Williamson from comment #23)
> So I've changed this to be the bug for 'file', and filed
> https://bugzilla.redhat.com/show_bug.cgi?id=1609013 for the 'write an ELF
> file id tool' suggestion.

Thanks.

> Do we also need separate bugs for:
>
> 1) making annobin reduce the number of sections (it sounds like this is
> already underway anyway)

I am afraid this has caused: https://bugzilla.redhat.com/show_bug.cgi?id=1609069

> 2) The elfutils eu-strip thing (or is that the same as
> https://bugzilla.redhat.com/show_bug.cgi?id=1609013 , or another bug)?

I saw you already found https://bugzilla.redhat.com/show_bug.cgi?id=1608390 which is the elfutils/eu-strip bug now.
(In reply to Kamil Dudka from comment #24)
> Does it say that the binary is stripped?

Yes, it does:

$ rpm -q file{,-libs}
file-5.34-1.fc29.x86_64
file-libs-5.34-1.fc29.x86_64

$ curl -O https://kojipkgs.fedoraproject.org//packages/qt5-qtwebkit/5.212.0/0.26.alpha2.fc29/x86_64/qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64.rpm
$ rpmdev-extract qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64.rpm
$ file qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/libQt5WebKit.so.5.212.0
qt5-qtwebkit-5.212.0-0.26.alpha2.fc29.x86_64/usr/lib64/libQt5WebKit.so.5.212.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), BuildID[sha1]=0f1936787780c9bd816b55b9f0cfc7a6d30bc7a9, dynamically linked, stripped
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle. Changing version to '29'.
Appears fixed (does not print "stripped" any more) in rawhide:

$ rpm -q file{,-libs}
file-5.35-2.fc30.x86_64
file-libs-5.35-2.fc30.x86_64

$ file libQt5WebKit.so.5.212.0
libQt5WebKit.so.5.212.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), BuildID[sha1]=0f1936787780c9bd816b55b9f0cfc7a6d30bc7a9, dynamically linked, no section header