Red Hat Bugzilla – Bug 1251698
elfutils testsuite fails because ld places NOBITS .plt in the middle of a PT_LOAD segment on ppc64
Last modified: 2015-11-19 05:19:50 EST
Description of problem:

==========================================
elfutils 0.163: tests/test-suite.log
==========================================

# TOTAL: 139
# PASS: 135
# SKIP: 3
# XFAIL: 0
# FAIL: 1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: run-elflint-self.sh
=========================

section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/elfcmp
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/elflint
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/nm
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/objdump
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/readelf
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 1
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/libelf/libelf.so
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 1
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/libdw/libdw.so
FAIL run-elflint-self.sh (exit status: 1)

Version-Release number of selected component (if applicable):
elfutils-0.163-1.el7.ppc64
gcc-4.8.5-4.el7.ppc64
binutils-2.23.52.0.1-54.el7.ppc64

How reproducible:

Steps to Reproduce:
1. build the SRPM
2. run the testsuite
3.

Actual results:
It fails.

Expected results:
The testsuite passes.

Additional info:

# /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
invalid machine flags: 0x1
section [ 9] '.rela.dyn': relocation 0: invalid type
section [ 9] '.rela.dyn': relocation 1: invalid type
section [10] '.rela.plt': relocation 0: invalid type
section [10] '.rela.plt': relocation 1: invalid type
section [10] '.rela.plt': relocation 2: invalid type
section [10] '.rela.plt': relocation 3: invalid type
section [10] '.rela.plt': relocation 4: invalid type
section [10] '.rela.plt': relocation 5: invalid type
section [10] '.rela.plt': relocation 6: invalid type
section [10] '.rela.plt': relocation 7: invalid type
section [10] '.rela.plt': relocation 8: invalid type
section [10] '.rela.plt': relocation 9: invalid type
section [10] '.rela.plt': relocation 10: invalid type
section [10] '.rela.plt': relocation 11: invalid type
section [10] '.rela.plt': relocation 12: invalid type
section [10] '.rela.plt': relocation 13: invalid type
section [10] '.rela.plt': relocation 14: invalid type
section [10] '.rela.plt': relocation 15: invalid type
section [10] '.rela.plt': relocation 16: invalid type
section [10] '.rela.plt': relocation 17: invalid type
section [10] '.rela.plt': relocation 18: invalid type
section [10] '.rela.plt': relocation 19: invalid type
section [10] '.rela.plt': relocation 20: invalid type
section [10] '.rela.plt': relocation 21: invalid type
section [10] '.rela.plt': relocation 22: invalid type
section [10] '.rela.plt': relocation 23: invalid type
section [10] '.rela.plt': relocation 24: invalid type
section [10] '.rela.plt': relocation 25: invalid type
section [10] '.rela.plt': relocation 26: invalid type
section [10] '.rela.plt': relocation 27: invalid type
section [10] '.rela.plt': relocation 28: invalid type
section [10] '.rela.plt': relocation 29: invalid type
section [10] '.rela.plt': relocation 30: invalid type
section [10] '.rela.plt': relocation 31: invalid type
section [10] '.rela.plt': relocation 32: invalid type
section [10] '.rela.plt': relocation 33: invalid type
section [10] '.rela.plt': relocation 34: invalid type
section [10] '.rela.plt': relocation 35: invalid type
section [10] '.rela.plt': relocation 36: invalid type
section [10] '.rela.plt': relocation 37: invalid type
section [10] '.rela.plt': relocation 38: invalid type
section [10] '.rela.plt': relocation 39: invalid type
section [10] '.rela.plt': relocation 40: invalid type
section [10] '.rela.plt': relocation 41: invalid type
section [10] '.rela.plt': relocation 42: invalid type
section [10] '.rela.plt': relocation 43: invalid type
section [10] '.rela.plt': relocation 44: invalid type
section [10] '.rela.plt': relocation 45: invalid type
section [10] '.rela.plt': relocation 46: invalid type
section [10] '.rela.plt': relocation 47: invalid type
section [10] '.rela.plt': relocation 48: invalid type
section [10] '.rela.plt': relocation 49: invalid type
section [10] '.rela.plt': relocation 50: invalid type
section [10] '.rela.plt': relocation 51: invalid type
section [10] '.rela.plt': relocation 52: invalid type
section [10] '.rela.plt': relocation 53: invalid type
section [10] '.rela.plt': relocation 54: invalid type
section [10] '.rela.plt': relocation 55: invalid type
section [10] '.rela.plt': relocation 56: invalid type
section [10] '.rela.plt': relocation 57: invalid type
section [10] '.rela.plt': relocation 58: invalid type
section [10] '.rela.plt': relocation 59: invalid type
section [10] '.rela.plt': relocation 60: invalid type
section [10] '.rela.plt': relocation 61: invalid type
section [10] '.rela.plt': relocation 62: invalid type
section [10] '.rela.plt': relocation 63: invalid type
section [10] '.rela.plt': relocation 64: invalid type
section [10] '.rela.plt': relocation 65: invalid type
section [20] '.dynamic': entry 20: unknown tag
section [23] '.plt' has wrong type: expected PROGBITS, is NOBITS
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
section [36] '.symtab': symbol 36 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 38 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 39 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 40 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 41 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 42 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 43 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 44 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 186 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 231 does not fit completely in referenced section [21] '.opd'

It might be related to the recent binutils bug bz1247126.
I think you are right that this might be related to bug #1247126. But the testsuite is also run when the package is built, and it must pass then (zero failures) or the build will abort. So it would be good to know whether the package was built against a newer or older binutils.

The additional information shows something different, I think. elflint does not seem to know about the relocation types, which indicates that it couldn't find the backend for this architecture. Maybe you could run it with:

LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
Thanks for pointing to it. But even with the LD_LIBRARY_PATH I get an error:

# LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
# echo $?
1
I got this failure when I rebuilt elfutils with these installed: elfutils-0.163-1.el7.ppc64 gcc-4.8.5-4.el7.ppc64 binutils-2.23.52.0.1-54.el7.ppc64
(In reply to Michael Petlan from comment #3)
> Thanks for pointing to it. But even with the LD_LIBRARY_PATH I get an error:
>
> #
> LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/
> BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw
> /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld
> /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
>
> section [23] '.plt' has type NOBITS but is read from the file in segment of
> program header entry 3
>
> # echo $?
>
> 1

Thanks. Yes, it makes sense that this actual bug is flagged. Good to see the other issues are gone. For reference, could you print the program headers of the file with:

eu-readelf -l /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line

Just to verify eu-elflint is right.
(In reply to Michael Petlan from comment #4)
> I got this failure when I rebuilt elfutils with these installed:
>
> elfutils-0.163-1.el7.ppc64
> gcc-4.8.5-4.el7.ppc64
> binutils-2.23.52.0.1-54.el7.ppc64

That is somewhat odd. The original binutils bug #1247126 was fixed in 2.23.52.0.1-51, and elfutils was built against 2.23.52.0.1-41. I am not sure when exactly the bug was introduced, but I would have expected this issue to show up with an older binutils version rather than a newer one. Let's see how the ELF program headers look (eu-readelf -l) and think about what the real root cause is.
Oh, please also show the section headers, so: eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
Created attachment 1060736 [details] headers log

Attached `eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line` output. I'll try to rebuild with various other binutils versions and post the results here.
I did some rebuilds with different binutils versions installed. It seems that the problem was introduced by the fix in the 51.el7 release.

binutils-2.23.52.0.1-50.el7.ppc64.rpm  PASS
binutils-2.23.52.0.1-51.el7.ppc64.rpm  FAIL
binutils-2.23.52.0.1-54.el7.ppc64.rpm  FAIL
I am adding Jeff Law to the CC, who might know whether this is a "real issue" where binutils generates something buggy, or a "sanity issue" which eu-elflint detects but which is in fact harmless.

(In reply to Michael Petlan from comment #8)
> Created attachment 1060736 [details]
> headers log
>
> Attached `eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line`
> output.

Thanks. So we have the following sections:

[17] .init_array  INIT_ARRAY  000000001001fc60 0000fc60 00000008  0 WA 0 0 8
[18] .fini_array  FINI_ARRAY  000000001001fc68 0000fc68 00000008  0 WA 0 0 8
[19] .jcr         PROGBITS    000000001001fc70 0000fc70 00000008  0 WA 0 0 8
[20] .dynamic     DYNAMIC     000000001001fc78 0000fc78 00000210 16 WA 6 0 8
[21] .opd         PROGBITS    000000001001fe88 0000fe88 00000138  0 WA 0 0 8
[22] .got         PROGBITS    000000001001ffc0 0000ffc0 00000040  8 WA 0 0 8
[23] .plt         NOBITS      0000000010020000 00010000 00000648 24 WA 0 0 8
[24] .data        PROGBITS    0000000010020648 00010648 00000004  0 WA 0 0 1
[25] .bss         NOBITS      0000000010020650 0001064c 00000078  0 WA 0 0 8

(After the .bss there is more data in the file [in fact the next section starts at the same file offset as .bss], but that isn't allocated, so it has an address of zero.)

And we have the following PT_LOAD covering those sections (data in the file):

LOAD 0x00fc60 0x000000001001fc60 0x000000001001fc60 0x0009ec 0x000a68 RW 0x10000

What eu-elflint is complaining about is that the .plt section is NOBITS, and so doesn't have any file contents, but the PT_LOAD segment is pulling in bytes from the file at that section's offset anyway. Note that this is different from the other NOBITS section covered, the .bss section. That one isn't actually covered by the p_filesz, only by the p_memsz, so no content is actually read from the file for that section.
I think eu-elflint is correct to warn about this. Because the section at the .plt file location is NOBITS, there is no guarantee that there is anything at that file location, and if there is, there is no guarantee that the file contents are actually all zero (which is what NOBITS implies should be loaded at the sh_addr in memory).

This might not be a bug in practice, since you can see there is a hole in the ELF file after the .plt section. The next section, .data, seems to start precisely after that hole. If the hole is filled with zeros then everything will probably work out. But there is no guarantee that the hole between the sections is filled with zeros.

Jeff, do you know whether this is a real bug, or whether the expectation is that such a NOBITS section is always followed by a zero-filled hole big enough to make sure the PT_LOAD maps in zeros for it? If so, then why isn't the .plt section just a normal PROGBITS section (that happens to be filled with zeros)?

If this isn't considered a bug in binutils and ld.so handles it fine, then I could add extra checks to eu-elflint: when a NOBITS section is covered by the file-backed part of a PT_LOAD, check that there is a hole big enough before the next section and that the file contents of that hole are actually zero. But I think the more correct solution would be to mark the .plt section as PROGBITS if it is actually backed by file contents.
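The range arithmetic behind the complaint in the comment above can be checked with a short Python sketch. This is only an illustration using the header values quoted in that comment; the `Section` and `Segment` classes and the function `nobits_read_from_file` are hypothetical names, not eu-elflint's actual code.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    sh_type: str
    addr: int
    offset: int
    size: int


@dataclass
class Segment:
    offset: int
    vaddr: int
    filesz: int
    memsz: int


def nobits_read_from_file(sec: Section, seg: Segment) -> bool:
    """True if a NOBITS section overlaps the file-backed part of the segment."""
    if sec.sh_type != "NOBITS" or sec.size == 0:
        return False
    # Addresses in [vaddr, vaddr + filesz) are filled from the file;
    # addresses in [vaddr + filesz, vaddr + memsz) are zero-initialized.
    file_backed_end = seg.vaddr + seg.filesz
    return sec.addr < file_backed_end and sec.addr + sec.size > seg.vaddr


# Values copied from the section and program headers quoted above.
load = Segment(offset=0x00FC60, vaddr=0x1001FC60, filesz=0x9EC, memsz=0xA68)
plt = Section(".plt", "NOBITS", 0x10020000, 0x10000, 0x648)
bss = Section(".bss", "NOBITS", 0x10020650, 0x1064C, 0x78)

for sec in (plt, bss):
    print(sec.name, nobits_read_from_file(sec, load))
```

With these numbers the file-backed part of the segment ends at address 0x1002064c, so .plt (0x10020000 to 0x10020648) lies inside it while .bss (starting at 0x10020650) lies entirely in the p_memsz-only tail.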
(In reply to Mark Wielaard from comment #10) > But I think the more > correct solution would be to mark the .plt section as PROGBITS if it is > actually backed up with actual file contents. Or, to group the .plt and .bss sections together at the end of the segment. (I cannot really tell whether a NOBITS section makes any sense unless it is at the end of a segment.)
We've been 'round and 'round on the NOBITS .plt section appearing in the middle of a PT_LOAD segment, followed by PROGBITS sections. Unfortunately, it appears that this behaviour is here to stay due to a requirement of the PPC ABI.

We have carefully reviewed the behaviour of the static linker to ensure that it zero-fills that NOBITS hunk in the generated file, which in turn allows the dynamic linker to continue to mmap the covering PT_LOAD segment as a single unit.

In essence the static linker treats it as a PROGBITS section under the hood, but even that wasn't enough to convince the powers that be to change the wording in the ABI.

We also discussed potential issues with low-level applications such as elfutils, gdb, valgrind and prelink, which do not expect a NOBITS section in the middle of a PT_LOAD segment.

So I think this oddity is here to stay.
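Why the zero-fill matters can be seen from a toy model of how a PT_LOAD segment becomes memory. The following Python sketch is an assumption-laden illustration: `load_segment` is a hypothetical helper that models the mapping (it is not ld.so code), and the 24-byte "file" with an 8-byte .plt hole is invented for the example.

```python
def load_segment(file_bytes: bytes, offset: int, filesz: int, memsz: int) -> bytes:
    """Model a PT_LOAD mapping: filesz bytes come from the file, the rest is zero."""
    image = bytearray(file_bytes[offset:offset + filesz])
    image.extend(bytes(memsz - filesz))  # the p_memsz-only tail, e.g. .bss
    return bytes(image)


# Hypothetical layout: 8 bytes of .got, an 8-byte NOBITS .plt hole,
# 8 bytes of .data in the file, then 8 bytes of .bss in memory only.
got = b"\x11" * 8
data = b"\x22" * 8
zero_filled = got + bytes(8) + data      # what the static linker is said to emit
stale = got + b"\xff" * 8 + data         # what a careless post-processor might leave

PLT = slice(8, 16)
good = load_segment(zero_filled, offset=0, filesz=24, memsz=32)
bad = load_segment(stale, offset=0, filesz=24, memsz=32)

print(good[PLT] == bytes(8))
print(bad[PLT] == bytes(8))
```

In this model the .plt bytes in memory are whatever the file holds at that offset, so the section only behaves like NOBITS as long as the file really contains zeros there.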
(In reply to Jeff Law from comment #12)
> We've been 'round and 'round on the NOBITS .plt section appearing in the
> middle of a PT_LOAD segment, followed by PROGBITS sections. Unfortunately,
> it appears like that behaviour is here to stay due to requirement of the PPC
> ABI.

Does it really require that? That seems completely broken and I cannot see how that can be allowed by the ELF/gabi spec.

> We have carefully reviewed the behaviour of the static linker to ensure that
> it zero-fills that NOBITS hunk in the generated file, which in turn allows
> the dynamic linker to continue to mmap the covering PT_LOAD segment as a
> single unit.
>
> In essence the static linker treats it as a PROGBITS section under the hood,
> even that wasn't enough to convince the powers that be to change the wording
> in the ABI.

The problem is that it isn't actually a PROGBITS section. It is a NOBITS section. So there is no guarantee that there are actually any bits in the file, or, if there are any bits, that all those bits will be zero. Anything that postprocesses such a file (objcopy, prelink, eu-strip, etc.) might put something else there, because there is just a gap in the file that looks like it is unused.

IMHO the static linker is doing this wrong. If it is a requirement that the .plt section really is NOBITS (gabi says it should be PROGBITS), then the linker should either group it together with the .bss section at the end of the segment (so there is no gap in the file and the segment only uses defined bits), or, if the sections are not grouped together, create separate PT_LOAD segments with the .plt and the .bss at the end of those segments (again so that no undefined bits of the file are used).

I could add a workaround (guarded by --gnu) to eu-elflint to recognize such a NOBITS section in the middle of a PT_LOAD segment that really is a PROGBITS section because it has a gap in the file that happens to be filled with zeros. But I think that is just papering over the real issue.
Anything that might manipulate such an ELF file will most likely cause that gap in the ELF file to disappear. Is there really no way to either fix the ABI or the static linker to not create such a dangerous situation?
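The two layouts proposed in the comment above share one invariant: within a PT_LOAD segment, no file-backed section may follow a NOBITS section. A minimal Python sketch of that invariant follows; the function name `nobits_only_at_end` and the shortened section lists are illustrative assumptions, not linker or elflint code.

```python
from typing import List, Tuple


def nobits_only_at_end(sections: List[Tuple[str, str]]) -> bool:
    """True if no file-backed section follows a NOBITS section in the segment."""
    seen_nobits = False
    for _name, sh_type in sections:
        if sh_type == "NOBITS":
            seen_nobits = True
        elif seen_nobits:
            return False
    return True


# Tail of the segment as ld currently lays it out (from the headers in this bug).
as_linked = [(".got", "PROGBITS"), (".plt", "NOBITS"),
             (".data", "PROGBITS"), (".bss", "NOBITS")]

# Option 1: group .plt with .bss at the end of a single segment.
regrouped = [(".got", "PROGBITS"), (".data", "PROGBITS"),
             (".plt", "NOBITS"), (".bss", "NOBITS")]

# Option 2: keep the order but split into two PT_LOAD segments.
split = [as_linked[:2], as_linked[2:]]

print(nobits_only_at_end(as_linked))
print(nobits_only_at_end(regrouped))
print(all(nobits_only_at_end(seg) for seg in split))
```

Only the layout currently produced by ld violates the invariant; both proposed alternatives leave every NOBITS section in the p_memsz-only tail of its segment.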
I do think this is a bug in the binutils static linker. So let's first see whether they can fix it before adding any extra workarounds for it to elfutils eu-elflint.
Mark,

I understand your concerns and conclusions and, like Jakub, I'd really prefer to see this change, but that's certainly not going to happen in the RHEL 7.2 timeframe and likely not going to happen, ever.

The other tools, for better or worse, are going to have to cope with this lameness.

And FWIW, there is a guarantee that we'll have zeros for the NOBITS section. The static linker is responsible for zero-filling that area in the object file so that the associated PT_LOAD segment can be efficiently mmap'd.
(In reply to Jeff Law from comment #15)
> I understand your concerns and conclusions and like Jakub, I'd really prefer
> to see this change, but that's certainly not going to happen in the RHEL 7.2
> timeframe and likely not going to happen, ever.
>
> The other tools, for better or worse are going to have to cope with this
> lameness.

Horrible... I think this will come back and bite us. I don't think it is reasonable to have other tools recognize and work around this pattern. I am sure we will see some brokenness because of this in the future.

But for now I have submitted a patch for eu-elflint to recognize this as a GNU ld issue (when --gnu is given):
https://lists.fedorahosted.org/pipermail/elfutils-devel/2015-August/005080.html

Is there an upstream binutils ld bug I can refer to from our elflint --gnu workaround page?
https://fedorahosted.org/elfutils/wiki/ElflintGNU

> And FWIW, this is a guarantee that we'll have zero's for the NOBITS section.
> The static linker is responsible for zero-filling that area in the object
> file so that the associated PT_LOAD segment can be efficiently mmap'd.

I did add a check to elflint to verify that the bytes are actually zero if the NOBITS section is in the middle of a PT_LOAD segment.
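The kind of zero check described in the last paragraph can be sketched in a few lines of Python. This is not the submitted elflint patch (which is C and lives behind the link above); `nobits_gap_is_zero` and the sample byte strings are hypothetical and only show the idea.

```python
def nobits_gap_is_zero(file_bytes: bytes, sh_offset: int, sh_size: int) -> bool:
    """True if the file really holds sh_size zero bytes at sh_offset."""
    chunk = file_bytes[sh_offset:sh_offset + sh_size]
    # A short read means the file ends inside the gap, which is also a failure.
    return len(chunk) == sh_size and not any(chunk)


# Hypothetical file images: 4 bytes of data, a 6-byte gap, 4 more bytes of data.
zeroed = b"\x01\x02\x03\x04" + bytes(6) + b"\x05\x06\x07\x08"
dirty = b"\x01\x02\x03\x04" + b"\x00\x00\x7f\x00\x00\x00" + b"\x05\x06\x07\x08"
truncated = b"\x01\x02\x03\x04" + bytes(3)

print(nobits_gap_is_zero(zeroed, 4, 6))
print(nobits_gap_is_zero(dirty, 4, 6))
print(nobits_gap_is_zero(truncated, 4, 6))
```

A check along these lines accepts the layout only when the linker's zero-fill promise actually holds for the file at hand, which is what makes it usable as a guarded workaround rather than a blanket exemption.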
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-2126.html