Description of problem:
The output Section for .gnu.build.attributes is just the relocated concatenation of all the corresponding input Sections, without any merging of the same attributes for adjacent address ranges. So if there are N separate compilation units all compiled by the same compiler configuration, then the output Section will be N copies of the same .gnu.build.attributes. That wastes space as N grows, such as in a large project or when using larger .a archive libraries.

Also, the static binder "ld" is the program that handles alignment, and is the only program that knows when two address ranges become contiguous because of alignment. For example (reported by "readelf --wide --notes"):
  GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to region from 0x10580 to 0x1178a
  GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to region from 0x11790 to 0x119a9
No program other than ld knows that the range from 0x1178a to 0x11790 has been "bridged" by alignment. This means that all the annobin checkers (built-by.sh, check-abi.sh, etc.) are working with incomplete information: the range 0x1178a to 0x11790 could be occupied by code that was not produced by gcc 8.0.1 20180131. readelf (binutils-2.29.1-19.fc28.x86_64) does not diagnose such a case.

Version-Release number of selected component (if applicable):
binutils-2.29.1-19.fc28.x86_64

How reproducible:
every time

Steps to Reproduce:
1. readelf --wide --notes /bin/date ## /bin/date from coreutils-8.29-3.fc28.x86_64
Actual results:
31 copies of the 12 lines
=====
  GA$<version>3p4 0x00000010 OPEN Applies to region from 0x3760 to 0x3ef7
  GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*GOW:0x452a 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*<stack prot>strong 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA+stack_clash:true 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*cf_protection:0x8 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA+GLIBCXX_ASSERTIONS:true 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*FORTIFY:0x2 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*<PIC>pic 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA!<short enum>false 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*<ABI>0x7001100000012 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
  GA*cet status:0x2020102 0x00000000 OPEN Applies to region from 0x3760 to 0x3ef7
=====
with a different address range per copy.

Expected results:
Merge contiguous ranges (after alignment) of the same attribute. In this case there would be only 12 lines, each with the range from the minimum to the maximum address.

Additional info:
There are also empty ranges, such as:
  GA$<version>3p4 0x00000010 OPEN Applies to region from 0x3ef7 to 0x3ef7
and unbounded ranges, such as:
  GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to region from 0x3ef7
but I suppose those are the fault of -fplugin=annobin at compilation, or of expansion by readelf.
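The merge requested under "Expected results" can be sketched in a few lines of Python. This is only an illustration of the idea, not how ld or objcopy implements anything: the `Note` type, the `merge_notes` function and the `max_gap` parameter are hypothetical names, and `max_gap` stands in for the alignment knowledge that, as the report says, only the linker really has.

```python
from typing import NamedTuple


class Note(NamedTuple):
    name: str   # attribute text as printed by readelf, e.g. "GA$<tool>gcc 8.0.1 20180131"
    start: int  # first address of the region
    end: int    # end address of the region


def merge_notes(notes, max_gap=0):
    """Merge regions of identical attributes that overlap, touch,
    or are separated by at most max_gap bytes (assumed to be padding)."""
    runs_by_name = {}
    for note in sorted(notes, key=lambda n: (n.name, n.start)):
        runs = runs_by_name.setdefault(note.name, [])
        if runs and note.start - runs[-1][1] <= max_gap:
            # Close enough to the previous region: extend it.
            runs[-1][1] = max(runs[-1][1], note.end)
        else:
            runs.append([note.start, note.end])
    return [Note(name, s, e) for name, runs in runs_by_name.items() for s, e in runs]


tool = "GA$<tool>gcc 8.0.1 20180131"
notes = [Note(tool, 0x10580, 0x1178A), Note(tool, 0x11790, 0x119A9)]
for merged in merge_notes(notes, max_gap=15):
    print(f"{merged.name}: {merged.start:#x} to {merged.end:#x}")
```

With `max_gap=0` the two regions from the report stay separate, because a tool that does not know about the 6 alignment bytes cannot safely join them; that is the point being made about ld.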
Hi John,

The notes can be merged by the objcopy program. It has a special option, --merge-notes, to do exactly this. The ability was not put into the linker in order to keep things simple, i.e. less chance of introducing bugs. It also means that older versions of the linker can still correctly process files containing the annobin notes.

Cheers
Nick
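For anyone who wants to try the objcopy route from a script, a minimal sketch follows. It assumes an objcopy on PATH that supports --merge-notes; the function names and the file names "date" and "date.merged" are made up for the example, and the thread does not say which invocation, if any, a build system should use.

```python
import shutil
import subprocess


def merge_notes_command(src, dst, objcopy="objcopy"):
    """Build the argv for: objcopy --merge-notes SRC DST."""
    return [objcopy, "--merge-notes", src, dst]


def run_merge_notes(src, dst):
    """Run objcopy to write a copy of src with merged notes to dst."""
    cmd = merge_notes_command(src, dst)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("objcopy not found on PATH")
    subprocess.run(cmd, check=True)


# Only prints the command; call run_merge_notes() on a real ELF file to execute it.
print(" ".join(merge_notes_command("date", "date.merged")))
```

Comparing "readelf --wide --notes" output for the input and output files should show whether the repeated entries were collapsed.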
Is objcopy --merge-notes invoked during rpm post-processing?
(In reply to Jakub Jelinek from comment #2)
> Is objcopy --merge-notes invoked during rpm post-processing?

I do not think so. Maybe it should be, but I think that there are enough new things in the build process at the moment, so leaving it out (for now) is not such a bad thing.

Note: the annobin notes are in a non-loadable section, so they do not take up space in the run-time image, only in the on-disk image.

Cheers
Nick
Hi John,

(In reply to John Reiser from comment #0)
> Also, the static binder "ld" is the program that handles alignment, and is
> the only program that knows when two address ranges become contiguous
> because of alignment. For example (reported by "readelf --wide --notes"):
> GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to
> region from 0x10580 to 0x1178a
> GA$<tool>gcc 8.0.1 20180131 0x00000000 OPEN Applies to
> region from 0x11790 to 0x119a9
> No program other than ld knows that the range from 0x1178a to 0x11790 has
> been "bridged" by alignment. This means that all the annobin checkers
> (built-by.sh, check-abi.sh, etc.) are working with incomplete information:
> the range 0x1178a to 0x11790 could be occupied by code that was not produced
> by gcc 8.0.1 20180131.

Seriously? You are worried about the case where 6 bytes of unannotated code might have been included in an executable's code space? Theoretically possible, true, but is it really worth worrying about?

Currently readelf will detect and warn about gaps of 16 bytes or larger in the code coverage of annobin notes. I could reduce that threshold, but in order to prevent false positives when there is a real linker-inserted alignment adjustment, the code would then have to check whether there were any symbols in the adjustment area. That would make readelf slower and would not help if the unannotated code did not contain any symbols.

I should also note that I am working on an enhancement to the assembler, such that any time it creates an object file which does not contain any annobin notes, it automatically adds a note of its own, saying basically: "this unannotated region came from assembler input 'foo'". It may also include the assembler command-line options. I am not sure if that is needed or not at the moment.

Cheers
Nick
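The 16-byte threshold described above can be illustrated with a short Python sketch. This is not readelf's code; `find_gaps` and `min_gap` are hypothetical names, and the sketch ignores the symbol check that a lower threshold would need.

```python
def find_gaps(ranges, min_gap=16):
    """Return (start, end) holes of at least min_gap bytes
    between the covered (start, end) address ranges."""
    gaps = []
    covered_end = None
    for start, end in sorted(ranges):
        if covered_end is not None and start - covered_end >= min_gap:
            gaps.append((covered_end, start))
        covered_end = end if covered_end is None else max(covered_end, end)
    return gaps


ranges = [(0x10580, 0x1178A), (0x11790, 0x119A9)]
for threshold in (16, 1):
    holes = [(hex(s), hex(e)) for s, e in find_gaps(ranges, min_gap=threshold)]
    print(f"min_gap={threshold}: {holes}")
```

With the default threshold the 6-byte hole from comment #0 is not reported; lowering `min_gap` reports it, but then every genuine alignment hole would be reported too, which is the false-positive problem mentioned above.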
Hi Nick,

(In reply to Nick Clifton from comment #4)
> Seriously? You are worried about the case where 6 bytes of unannotated
> code might have been included in an executable's code space? Theoretically
> possible, but is it really worth worrying about?

6 to 15 bytes is enough to allow damage, particularly when there is such a region adjacent to the code for most compilation units.

Here's an idea: When ld generates bytes because of alignment, then ld could extend the previous region in .gnu.build.attributes so as to cover those bytes (as long as the new bytes are the only extra bytes.) That allows making any subsequent adjacency/contiguity checking stronger. It does "paper over the alignment holes", although this is controlled by the designation of filler bytes in the [usually defaulted] linker script.
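A rough Python model of this proposal, assuming the linker has a list of the filler spans it generated: `extend_over_padding` and the `padding` list are hypothetical, and nothing here reflects actual ld internals.

```python
def extend_over_padding(ranges, padding):
    """ranges: sorted list of (start, end) note regions.
    padding: list of (start, end) filler spans generated for alignment.
    A region is extended only when filler begins exactly at its end
    and does not run past the start of the next region."""
    filler_end_at = {start: end for start, end in padding}
    extended = []
    for i, (start, end) in enumerate(ranges):
        filler_end = filler_end_at.get(end)
        next_start = ranges[i + 1][0] if i + 1 < len(ranges) else None
        if filler_end is not None and (next_start is None or filler_end <= next_start):
            end = filler_end
        extended.append((start, end))
    return extended


ranges = [(0x10580, 0x1178A), (0x11790, 0x119A9)]
padding = [(0x1178A, 0x11790)]
for start, end in extend_over_padding(ranges, padding):
    print(f"{start:#x} to {end:#x}")
```

After the extension the first region ends exactly where the second begins, so a checker can demand strict contiguity with no tolerance for gaps.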
This bug appears to have been reported against 'rawhide' during the Fedora 28 development cycle. Changing version to '28'.
binutils-2.31.1-17.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-ba3cbcfd20
binutils-2.31.1-17.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-ba3cbcfd20
binutils-2.31.1-17.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.