Bug 129356

Summary: ppc64 -fPIE compiled, -pie linked stripped executables get corrupt stacks
Product: Red Hat Enterprise Linux 3 Reporter: Jason Vas Dias <jvdias>
Component: binutils Assignee: Jakub Jelinek <jakub>
Status: CLOSED WORKSFORME QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: powerpc   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-10-08 08:41:59 UTC Type: ---

Description Jason Vas Dias 2004-08-06 20:06:46 UTC
Description of problem:

This bug was found to be the root cause of bug 127420,
where the samba package's /usr/bin/smbmnt program
generated SIGBUS on every call. That bug was worked around by
not compiling the executable with -fPIE / linking with -pie.

This appears only to happen on ppc64 (IBM pSeries) platforms.

In essence, smbmnt was doing something very similar to this
demonstration program with which the problem can be duplicated:

test_mount.c:

#include <sys/types.h>
#include <unistd.h>
#include <sys/mount.h>

int main(int argc, char **argv, char **envp)
{
    char mnt_opts[1024]="\0";

    return  mount( "//bogus/none", ".", "smbfs",
                   MS_NOSUID|MS_NODEV|0xc0ed0000,  &mnt_opts  );
}

When the above code is compiled and run like this:
  $ gcc -fPIE -c test_mount.c
  $ gcc -pie -o test_mount test_mount.o
  $ ./test_mount

It runs with no problems. But if it is stripped:

  $ strip test_mount
  $ ./test_mount
  Bus error

The kernel generates a SIGBUS.

So compiling with -fPIE, linking with -pie, and stripping
appears to cause the kernel to treat the address of the 'mnt_opts'
string on the stack as an invalid address.

When examining the samba smbmnt corefile produced by the
SIGBUS with the /usr/lib/debug/usr/bin/smbmnt.debug
file installed (which should make symbol information
available to gdb), gdb is unable to resolve any symbols
and complains about a "Corrupt Stack".
 
Glen Johnson of IBM did some investigation of this issue
for bug 127420 and found some information that might be useful:

 ------- Additional Comment #85 From Glen Johnson (gjohnson.com) on 2004-08-06 02:26 :

I can't see anything wrong with the stripped smbmnt.  Differences are:
a) .symtab and .strtab are gone;  That's expected.  Just strip doing
   its job.
b) .shstrtab is 16 bytes smaller;  The length of the above removed
   section name strings.
c) Section headers moved 16 bytes in file, due to (b).
d) Program header 32 bytes smaller, because ld allocates some spares
   that aren't needed.

All section contents are identical.

To see whether there was something funny with the mount call, I
replaced the "bl mount" (call to glibc's mount stub) with 4 bytes of
zeros to generate a sigill, giving me a core dump showing registers
before the mount.  They look virtually identical, at least in that all
params of mount, r4 to r7, are the same.  (I checked that r7, though a
different pointer value, actually points to the same data string.)
Here's the reg dump:

Contents of section .reg/8680:  (not stripped)
 0000 00000005 ffffc2f0 40052860 ffffeb9c  ........@.(`....
 0010 40002050 4000208c c0ed0006 ffffc2f8  @. P@. .........
 0020 40002089 3fee47b0 40002088 00000000  @. .?.G.@. .....
 0030 84000482 4001a37c ffffc918 00000000  ....@..|........
 0040 3ffed140 00000000 00000000 00000000  ?..@............
 0050 00000000 00000000 00000000 00000000  ................
 0060 00000000 00000000 c0ed0006 ffffeb9c  ................
 0070 ffffc8b8 ffffc2f8 4001a1ec ffffc8b8  ........@.......
 0080 40001600 0008f932 ffffc6f8 3ff1063c  @......2....?..<
 0090 400015ec 20000000 44000482 00000000  @... ...D.......
 00a0 00000700 3ffc83c8 40000000 00000000  ....?...@.......
 00b0 00000000 00000000 00000000 00000000  ................

Contents of section .reg/8681:   (stripped)
 0000 00000005 ffffd470 40052860 ffffeb9c  .......p@.(`....
 0010 40002050 4000208c c0ed0006 ffffd478  @. P@. ........x
 0020 40002089 3fee47b0 40002088 00000000  @. .?.G.@. .....
 0030 84000482 4001a37c ffffda98 00000000  ....@..|........
 0040 3ffed140 00000000 00000000 00000000  ?..@............
 0050 00000000 00000000 00000000 00000000  ................
 0060 00000000 00000000 c0ed0006 ffffeb9c  ................
 0070 ffffda38 ffffd478 4001a1ec ffffda38  ...8...x@......8
 0080 40001600 0008f932 ffffd878 3ff1063c  @......2...x?..<
 0090 400015ec 20000000 44000482 00000000  @... ...D.......
 00a0 00000700 3ffc83c8 40000000 00000000  ....?...@.......
 00b0 00000000 00000000 00000000 00000000  ................

It's curious that r1, the stack pointer, is different.  It is
something that happens before we get to _start in smbmnt, possibly
something in ld.so.  If you follow the backchain links, you get:

(not stripped)
ffffc2f0: ffffc8b0 400015ec (smbmnt:do_mount)
ffffc8b0: ffffc910 40001874 (smbmnt:main)
ffffc910: ffffcb10 3feb6068 weird lr save here??
ffffcb10: ffffcb30 3feb5eac
ffffcb30: 00000000 00000000

(stripped)
ffffd470: ffffda30 400015ec
ffffda30: ffffda90 40001874
ffffda90: ffffdc90 3feb6068
ffffdc90: ffffdcb0 3feb5eac
ffffdcb0: 00000000 00000000

Another weird clue:  If I run the stripped version like

  /lib/ld-2.3.2.so ./smbmnt-stripped tmp -s //ltp/tmpdir

it works! 
 --------

I tried that too:

$ /lib/ld-2.3.2.so ./test_mount
(OK, no SIGBUS!)
$ ./test_mount
Bus error


Comment 1 Jay Turner 2004-08-21 01:06:48 UTC
FYI, I'm able to replicate this on pseries.lab.boston.redhat.com
running 2.4.21-20.EL and binutils-2.14.90.0.4-35.  Jason did some
really good legwork which shows that the problem is with binaries
compiled with -fPIE, linked with -pie, and then stripped.  Is there
any good way to determine whether other binaries in the U3 distro
might exhibit this issue, so that we can work around them as was done
with samba?


Comment 2 Jakub Jelinek 2004-10-08 08:41:59 UTC
I really can't reproduce this.
#!/bin/sh
cat > test_mount.c <<EOF
#include <sys/types.h>
#include <unistd.h>
#include <sys/mount.h>

int main(int argc, char **argv, char **envp)
{
    char mnt_opts[1024]="\0";

    return  mount( "//bogus/none", ".", "smbfs", MS_NOSUID|MS_NODEV|0xc0ed0000,  &mnt_opts  );
}
EOF
gcc -fPIE -c test_mount.c
for i in not_stripped stripped eu-stripped; do gcc -pie -o test_mount.$i test_mount.o; done
strip test_mount.stripped
eu-strip -f test_mount.debug test_mount.eu-stripped
for i in not_stripped stripped eu-stripped; do ./test_mount.$i; echo $?; done
md5sum test_mount.{not_stripped,stripped,eu-stripped,debug}
rpm -q gcc binutils glibc

255
255
255
7e37b7477dd72077858b1285b030e340  test_mount.not_stripped
6f184ecf01439aefb6036dc43f2a9503  test_mount.stripped
7f5720c205a3ef71b2ba9b1f735be8de  test_mount.eu-stripped
ae1fc8f5791bb68626a9d22a5b6e4228  test_mount.debug
gcc-3.2.3-46
binutils-2.14.90.0.4-35
glibc-2.3.2-95.28

Comment 3 Jason Vas Dias 2004-10-08 15:20:42 UTC
Yes, it looks like the latest versions of binutils / gcc / glibc
fixed this problem:

First, I reproduced it with these versions:
$ rpm -q gcc glibc binutils
gcc-3.2.3-20
glibc-2.3.2-95.2
glibc-2.3.2-95.2
binutils-2.14.90.0.4-25
$ gcc -fPIE -c test_mount.c
$  gcc -pie -o test_mount test_mount.o
$ strip ./test_mount
$ ./test_mount
Bus error

Then, I upgraded to these versions:
$ rpm -q gcc glibc binutils
gcc-3.2.3-46
glibc-2.3.2-95.2
glibc-2.3.2-95.28
binutils-2.14.90.0.4-35
$ gcc -fPIE -c test_mount.c
$  gcc -pie -o test_mount test_mount.o
$ strip ./test_mount
$ ./test_mount
$
NO BUS ERROR!