Bug 1251698 - elfutils testsuite fails because ld places NOBITS .plt in the middle of a PT_LOAD segment on ppc64
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: elfutils
Version: 7.2
Hardware: ppc64 Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 7.2
Assigned To: Mark Wielaard
QA Contact: Martin Cermak
Depends On: 1247126
Blocks: 1118366
 
Reported: 2015-08-08 18:00 EDT by Michael Petlan
Modified: 2015-11-19 05:19 EST (History)

See Also:
Fixed In Version: elfutils-0.163-2.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 05:19:50 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
headers log (5.10 KB, text/plain)
2015-08-09 08:33 EDT, Michael Petlan


External Trackers
Tracker ID Priority Status Summary Last Updated
IBM Linux Technology Center 128903 None None None Never
Red Hat Product Errata RHEA-2015:2126 normal SHIPPED_LIVE elfutils bug fix and enhancement update 2015-11-19 04:54:56 EST

Description Michael Petlan 2015-08-08 18:00:37 EDT
Description of problem:



==========================================
   elfutils 0.163: tests/test-suite.log
==========================================

# TOTAL: 139
# PASS:  135
# SKIP:  3
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: run-elflint-self.sh
=========================

section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/elfcmp
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/elflint
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/nm
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/objdump
section [24] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/readelf
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 1
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/libelf/libelf.so
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 1
*** failure in /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/libdw/libdw.so
FAIL run-elflint-self.sh (exit status: 1)


Version-Release number of selected component (if applicable):
elfutils-0.163-1.el7.ppc64
gcc-4.8.5-4.el7.ppc64
binutils-2.23.52.0.1-54.el7.ppc64

How reproducible:


Steps to Reproduce:
1. build the SRPM
2. run the testsuite
3.

Actual results:

It fails.

Expected results:

The testsuite passes.


Additional info:

# /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
invalid machine flags: 0x1
section [ 9] '.rela.dyn': relocation 0: invalid type
section [ 9] '.rela.dyn': relocation 1: invalid type
section [10] '.rela.plt': relocation 0: invalid type
section [10] '.rela.plt': relocation 1: invalid type
section [10] '.rela.plt': relocation 2: invalid type
section [10] '.rela.plt': relocation 3: invalid type
section [10] '.rela.plt': relocation 4: invalid type
section [10] '.rela.plt': relocation 5: invalid type
section [10] '.rela.plt': relocation 6: invalid type
section [10] '.rela.plt': relocation 7: invalid type
section [10] '.rela.plt': relocation 8: invalid type
section [10] '.rela.plt': relocation 9: invalid type
section [10] '.rela.plt': relocation 10: invalid type
section [10] '.rela.plt': relocation 11: invalid type
section [10] '.rela.plt': relocation 12: invalid type
section [10] '.rela.plt': relocation 13: invalid type
section [10] '.rela.plt': relocation 14: invalid type
section [10] '.rela.plt': relocation 15: invalid type
section [10] '.rela.plt': relocation 16: invalid type
section [10] '.rela.plt': relocation 17: invalid type
section [10] '.rela.plt': relocation 18: invalid type
section [10] '.rela.plt': relocation 19: invalid type
section [10] '.rela.plt': relocation 20: invalid type
section [10] '.rela.plt': relocation 21: invalid type
section [10] '.rela.plt': relocation 22: invalid type
section [10] '.rela.plt': relocation 23: invalid type
section [10] '.rela.plt': relocation 24: invalid type
section [10] '.rela.plt': relocation 25: invalid type
section [10] '.rela.plt': relocation 26: invalid type
section [10] '.rela.plt': relocation 27: invalid type
section [10] '.rela.plt': relocation 28: invalid type
section [10] '.rela.plt': relocation 29: invalid type
section [10] '.rela.plt': relocation 30: invalid type
section [10] '.rela.plt': relocation 31: invalid type
section [10] '.rela.plt': relocation 32: invalid type
section [10] '.rela.plt': relocation 33: invalid type
section [10] '.rela.plt': relocation 34: invalid type
section [10] '.rela.plt': relocation 35: invalid type
section [10] '.rela.plt': relocation 36: invalid type
section [10] '.rela.plt': relocation 37: invalid type
section [10] '.rela.plt': relocation 38: invalid type
section [10] '.rela.plt': relocation 39: invalid type
section [10] '.rela.plt': relocation 40: invalid type
section [10] '.rela.plt': relocation 41: invalid type
section [10] '.rela.plt': relocation 42: invalid type
section [10] '.rela.plt': relocation 43: invalid type
section [10] '.rela.plt': relocation 44: invalid type
section [10] '.rela.plt': relocation 45: invalid type
section [10] '.rela.plt': relocation 46: invalid type
section [10] '.rela.plt': relocation 47: invalid type
section [10] '.rela.plt': relocation 48: invalid type
section [10] '.rela.plt': relocation 49: invalid type
section [10] '.rela.plt': relocation 50: invalid type
section [10] '.rela.plt': relocation 51: invalid type
section [10] '.rela.plt': relocation 52: invalid type
section [10] '.rela.plt': relocation 53: invalid type
section [10] '.rela.plt': relocation 54: invalid type
section [10] '.rela.plt': relocation 55: invalid type
section [10] '.rela.plt': relocation 56: invalid type
section [10] '.rela.plt': relocation 57: invalid type
section [10] '.rela.plt': relocation 58: invalid type
section [10] '.rela.plt': relocation 59: invalid type
section [10] '.rela.plt': relocation 60: invalid type
section [10] '.rela.plt': relocation 61: invalid type
section [10] '.rela.plt': relocation 62: invalid type
section [10] '.rela.plt': relocation 63: invalid type
section [10] '.rela.plt': relocation 64: invalid type
section [10] '.rela.plt': relocation 65: invalid type
section [20] '.dynamic': entry 20: unknown tag
section [23] '.plt' has wrong type: expected PROGBITS, is NOBITS
section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3
section [36] '.symtab': symbol 36 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 38 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 39 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 40 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 41 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 42 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 43 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 44 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 186 does not fit completely in referenced section [21] '.opd'
section [36] '.symtab': symbol 231 does not fit completely in referenced section [21] '.opd'


It might be related to the recent binutils bug bz1247126.
Comment 2 Mark Wielaard 2015-08-09 04:30:19 EDT
I think you are right that this might be related to bug #1247126
But the testsuite is also run when the package is built and must pass then (zero failures) or the build will abort. So it would be good to know whether the package was built against a newer or older binutils.

The additional information shows something different I think.
It seems to not know about the relocation types, which indicates that it couldn't find the backend for this architecture. Maybe you could run it with:

LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
Comment 3 Michael Petlan 2015-08-09 07:20:41 EDT
Thanks for pointing to it. But even with the LD_LIBRARY_PATH I get an error:

# LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line

section [23] '.plt' has type NOBITS but is read from the file in segment of program header entry 3

# echo $?

1
Comment 4 Michael Petlan 2015-08-09 07:22:32 EDT
I got this failure when I rebuilt elfutils with these installed:

elfutils-0.163-1.el7.ppc64
gcc-4.8.5-4.el7.ppc64
binutils-2.23.52.0.1-54.el7.ppc64
Comment 5 Mark Wielaard 2015-08-09 07:37:20 EDT
(In reply to Michael Petlan from comment #3)
> Thanks for pointing to it. But even with the LD_LIBRARY_PATH I get an error:
> 
> #
> LD_LIBRARY_PATH=/root/rpmbuild/BUILD/elfutils-0.163/backends:/root/rpmbuild/
> BUILD/elfutils-0.163/libelf:/root/rpmbuild/BUILD/elfutils-0.163/libdw
> /root/rpmbuild/BUILD/elfutils-0.163/src/elflint --quiet --gnu-ld
> /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
> 
> section [23] '.plt' has type NOBITS but is read from the file in segment of
> program header entry 3
> 
> # echo $?
> 
> 1

Thanks. Yes, makes sense this actual bug is flagged. Good to see the other issues are gone.

For reference could you print the program header of the file with:
 eu-readelf -l /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line 

Just to verify eu-elflint is right.
Comment 6 Mark Wielaard 2015-08-09 07:45:53 EDT
(In reply to Michael Petlan from comment #4)
> I got this failure when I rebuilt elfutils with these installed:
> 
> elfutils-0.163-1.el7.ppc64
> gcc-4.8.5-4.el7.ppc64
> binutils-2.23.52.0.1-54.el7.ppc64

That is somewhat odd. The original binutils bug #1247126 was fixed in 
2.23.52.0.1-51 and elfutils was built against 2.23.52.0.1-41. I am not sure when exactly the bug was introduced. But I would have expected this issue to show up with an older binutils version instead of a newer one.

Let's see how the ELF program header looks (eu-readelf -l) and think about what the real root cause is.
Comment 7 Mark Wielaard 2015-08-09 08:20:31 EDT
Oh, please also show the section headers, so
eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line
Comment 8 Michael Petlan 2015-08-09 08:33:27 EDT
Created attachment 1060736 [details]
headers log

Attached `eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line` output.

I'll try to rebuild that with various other binutils versions and put here the results.
Comment 9 Michael Petlan 2015-08-09 09:03:12 EDT
I did some rebuilds with different binutils versions installed.
It seems that the problem was introduced by the fix in the 51.el7 release.


binutils-2.23.52.0.1-50.el7.ppc64.rpm PASS
binutils-2.23.52.0.1-51.el7.ppc64.rpm FAIL
binutils-2.23.52.0.1-54.el7.ppc64.rpm FAIL
Comment 10 Mark Wielaard 2015-08-09 10:43:16 EDT
I am adding Jeff Law to the CC, who might know whether this is a "real issue" where binutils generates something buggy. Or whether it is a "sanity issue" which eu-elflint detects, but is in fact harmless.

(In reply to Michael Petlan from comment #8)
> Created attachment 1060736 [details]
> headers log
> 
> Attached `eu-readelf -Sl /root/rpmbuild/BUILD/elfutils-0.163/src/addr2line`
> output.

Thanks. So we have the following sections:

[17] .init_array          INIT_ARRAY   000000001001fc60 0000fc60 00000008  0 WA     0   0  8
[18] .fini_array          FINI_ARRAY   000000001001fc68 0000fc68 00000008  0 WA     0   0  8
[19] .jcr                 PROGBITS     000000001001fc70 0000fc70 00000008  0 WA     0   0  8
[20] .dynamic             DYNAMIC      000000001001fc78 0000fc78 00000210 16 WA     6   0  8
[21] .opd                 PROGBITS     000000001001fe88 0000fe88 00000138  0 WA     0   0  8
[22] .got                 PROGBITS     000000001001ffc0 0000ffc0 00000040  8 WA     0   0  8
[23] .plt                 NOBITS       0000000010020000 00010000 00000648 24 WA     0   0  8
[24] .data                PROGBITS     0000000010020648 00010648 00000004  0 WA     0   0  1
[25] .bss                 NOBITS       0000000010020650 0001064c 00000078  0 WA     0   0  8

(And after the .bss there is more data in the file [in fact the next section starts at the same file offset as .bss] but that isn't allocated, so has an address of zero.)

And we have the following PT_LOAD covering those sections (data in the file):

  LOAD           0x00fc60 0x000000001001fc60 0x000000001001fc60 0x0009ec 0x000a68 RW  0x10000

What eu-elflint is complaining about is the fact that the .plt section is NOBITS and so doesn't have any file contents, but the PT_LOAD segment is pulling in bytes from the file at that section's offset anyway.

Note that this is different from the other NOBITS section covered, the .bss section. That one isn't actually covered by the p_filesz, only by the p_memsz, so no contents is actually read from that section on disk.
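The arithmetic behind that distinction can be sketched in Python with the numbers from the section and program header listings above. This is only an illustration of the overlap computation, not elflint's actual code; the helper name `file_backed_overlap` and the dictionary layout are made up for this example.

```python
# Values copied from the eu-readelf -Sl output quoted above.
PT_LOAD = {"offset": 0x00FC60, "vaddr": 0x1001FC60,
           "filesz": 0x0009EC, "memsz": 0x000A68}
SECTIONS = {
    ".plt": {"type": "NOBITS", "addr": 0x10020000, "size": 0x648},
    ".bss": {"type": "NOBITS", "addr": 0x10020650, "size": 0x78},
}

def file_backed_overlap(seg, sec):
    """Return how many bytes of the section's address range are loaded
    from the file, i.e. covered by p_filesz and not only by p_memsz.
    (Hypothetical helper, for illustration only.)"""
    file_end = seg["vaddr"] + seg["filesz"]  # end of the file-backed part
    start = max(sec["addr"], seg["vaddr"])
    end = min(sec["addr"] + sec["size"], file_end)
    return max(0, end - start)

for name, sec in SECTIONS.items():
    n = file_backed_overlap(PT_LOAD, sec)
    print(f"{name}: {n:#x} of {sec['size']:#x} bytes read from the file")
```

With these inputs the whole of .plt (0x648 bytes) falls inside the file-backed part of the segment, while .bss starts past `vaddr + filesz` and is covered only by `memsz`.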

I think eu-elflint is correct to warn about this: since the section at the .plt file location is NOBITS, there is no guarantee there is anything at that file location, and if there is, there is no guarantee that the file contents are actually all zero (which NOBITS implies should be loaded at the sh_addr into memory).

This might not be a bug in practice since you can see there is a hole in the ELF file after the .plt section. The next .data section seems to start precisely after a hole in the ELF file. If the hole is filled with zeros then everything will probably work out. But there is no guarantee that the hole between the sections is filled with zeros.

Jeff, do you know whether or not this is a real bug, or that the expectation is that there always should be a hole filled with zeros after such a NOBITS section that is big enough to make sure the PT_LOAD maps in zeros for such a section? If so, then why isn't the .plt section just a normal PROGBITS section (that happens to be filled with zeros)?

If this isn't considered a bug in binutils and ld.so handles it fine, then I could add extra checks to eu-elflint to see if a NOBITS section is covered by a PT_LOAD then there is a hole big enough to the next section and that the file contents of that hole is actually zero. But I think the more correct solution would be to mark the .plt section as PROGBITS if it is actually backed up with actual file contents.
Comment 11 Mark Wielaard 2015-08-09 10:50:40 EDT
(In reply to Mark Wielaard from comment #10)
> But I think the more
> correct solution would be to mark the .plt section as PROGBITS if it is
> actually backed up with actual file contents.

Or, to group the .plt and .bss sections together at the end of the segment.

(I cannot really tell whether a NOBITS section makes any sense unless
it is at the end of a segment.)
Comment 12 Jeff Law 2015-08-10 11:41:00 EDT
We've been 'round and 'round on the NOBITS .plt section appearing in the middle of a PT_LOAD segment, followed by PROGBITS sections.  Unfortunately, it appears that this behaviour is here to stay due to a requirement of the PPC ABI.

We have carefully reviewed the behaviour of the static linker to ensure that it zero-fills that NOBITS hunk in the generated file, which in turn allows the dynamic linker to continue to mmap the covering PT_LOAD segment as a single unit.

In essence the static linker treats it as a PROGBITS section under the hood, but even that wasn't enough to convince the powers that be to change the wording in the ABI.

We also discussed potential issues with low level applications such as elfutils, gdb, valgrind, prelink which do not expect a NOBITS section in the middle of a PT_LOAD segment.

So I think this oddity is here to stay.
Comment 13 Mark Wielaard 2015-08-10 12:26:06 EDT
(In reply to Jeff Law from comment #12)
> We've been 'round and 'round on the NOBITS .plt section appearing in the
> middle of a PT_LOAD segment, followed by PROGBITS sections.  Unfortunately,
> it appears like that behaviour is here to stay due to requirement of the PPC
> ABI.

Does it really require that? That seems completely broken and I cannot see how that can be allowed by the ELF/gabi spec.

> We have carefully reviewed the behaviour of the static linker to ensure that
> it zero-fills that NOBITS hunk in the generated file, which in turn allows
> the dynamic linker to continue to mmap the covering PT_LOAD segment as a
> single unit.
> 
> In essence the static linker treats it as a PROGBITS section under the hood,
> even that wasn't enough to convince the powers that be to change the wording
> in the ABI.  

The problem is that it isn't actually a PROGBITS section. It is a NOBITS section. So there is no guarantee that there are actually any bits in the file, or if there are any bits that all those bits will be zero. Anything that postprocesses such a file (objcopy, prelink, eu-strip, etc.) might put something else there because there is just a gap in the file that looks like it is unused.

IMHO the static linker is doing this wrong. If it is a requirement that the .plt section really is NOBITS (gabi says it should be PROGBITS), then it should either group it together with the .bss section at the end of the segment (so there is no gap in the file and the segment only uses defined bits), or if the sections are not grouped together it should create separate PT_LOAD segments with the .plt and the .bss at the end of those segments (again so that no undefined bits of the file are used).
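As a rough illustration of the layout rule proposed here (NOBITS sections only at the tail of a segment), a minimal Python sketch follows. The function name `nobits_only_at_tail` and the "regrouped" layout are hypothetical; the "current" layout is the section order from the listing in comment 10.

```python
def nobits_only_at_tail(section_types):
    """Return True if, in address order, no file-backed section
    follows a NOBITS section within the same segment."""
    seen_nobits = False
    for sh_type in section_types:
        if sh_type == "NOBITS":
            seen_nobits = True
        elif seen_nobits:
            return False  # file-backed section after a NOBITS one
    return True

# Section types in address order, taken from comment 10:
# .init_array .fini_array .jcr .dynamic .opd .got .plt .data .bss
current = ["INIT_ARRAY", "FINI_ARRAY", "PROGBITS", "DYNAMIC",
           "PROGBITS", "PROGBITS", "NOBITS", "PROGBITS", "NOBITS"]

# Hypothetical regrouping with .plt placed next to .bss at the end.
regrouped = ["INIT_ARRAY", "FINI_ARRAY", "PROGBITS", "DYNAMIC",
             "PROGBITS", "PROGBITS", "PROGBITS", "NOBITS", "NOBITS"]

print(nobits_only_at_tail(current))    # False: .data follows .plt
print(nobits_only_at_tail(regrouped))  # True
```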

I could add a workaround (guarded by --gnu) to eu-elflint to recognize such a NOBITS section in the middle of a PT_LOAD segment that really is a PROGBITS section because it has a gap in the file that happens to be filled with zeros. But I think that is just papering over the real issue. Anything that might manipulate such an ELF file will most likely cause that gap in the ELF file to disappear. Is there really no way to either fix the ABI or the static linker to not create such a dangerous situation?
Comment 14 Mark Wielaard 2015-08-10 18:41:09 EDT
I do think this is a bug in the binutils static linker. So let's first see whether they can fix it before adding any extra workarounds to elfutils eu-elflint for it.
Comment 15 Jeff Law 2015-08-10 22:43:11 EDT
Mark,

I understand your concerns and conclusions and like Jakub, I'd really prefer to see this change, but that's certainly not going to happen in the RHEL 7.2 timeframe and likely not going to happen, ever.

The other tools, for better or worse are going to have to cope with this lameness.


And FWIW, there is a guarantee that we'll have zeros for the NOBITS section.  The static linker is responsible for zero-filling that area in the object file so that the associated PT_LOAD segment can be efficiently mmap'd.
Comment 16 Mark Wielaard 2015-08-11 18:25:48 EDT
(In reply to Jeff Law from comment #15)
> I understand your concerns and conclusions and like Jakub, I'd really prefer
> to see this change, but that's certainly not going to happen in the RHEL 7.2
> timeframe and likely not going to happen, ever.
> 
> The other tools, for better or worse are going to have to cope with this
> lameness.

Horrible... I think this will come back and bite us. I don't think it is reasonable to have other tools recognize and work around this pattern. I am sure we will see some brokenness because of this in the future.

But for now I have submitted a patch for eu-elflint to recognize this
as a GNU ld issue (when --gnu is given):
https://lists.fedorahosted.org/pipermail/elfutils-devel/2015-August/005080.html

Is there an upstream binutils ld bug I can refer from our elflint --gnu workaround page?
https://fedorahosted.org/elfutils/wiki/ElflintGNU

> And FWIW, this is a guarantee that we'll have zero's for the NOBITS section.
> The static linker is responsible for zero-filling that area in the object
> file so that the associated PT_LOAD segment can be efficiently mmap'd.

I did add a check to elflint to verify that the bytes are actually zero if the NOBITS section is in the middle of a PT_LOAD segment.
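The idea of such a zero check can be sketched as follows. This is not the elflint patch (which is C code in elfutils); it is a small Python illustration on a synthetic byte buffer, and the function name `nobits_gap_is_zero` plus all offsets and sizes in the example are made up.

```python
def nobits_gap_is_zero(image: bytes, sh_offset: int, sh_size: int) -> bool:
    """Check that the file bytes at a NOBITS section's offset exist
    and are all zero, as the PT_LOAD mapping would require."""
    gap = image[sh_offset:sh_offset + sh_size]
    # A short slice means the file has no bytes there at all.
    return len(gap) == sh_size and not any(gap)

# Synthetic 64-byte "file": data, then a 24-byte zero gap standing in
# for .plt at offset 16, then more data (standing in for .data).
image = bytearray(64)
image[0:16] = b"\x01" * 16
image[40:44] = b"\x2a\x00\x00\x00"

print(nobits_gap_is_zero(bytes(image), 16, 24))  # gap is zero-filled

# Simulate a post-processing tool writing into the "unused" gap.
image[20] = 0xFF
print(nobits_gap_is_zero(bytes(image), 16, 24))  # gap no longer zero
```

The second call models the concern raised earlier in the thread: a tool that treats the gap as unused space can silently break the assumption that the mapped bytes are zero.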
Comment 20 errata-xmlrpc 2015-11-19 05:19:50 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-2126.html
