Bug 685087

Summary: Bad code gen in function prolog; stacked frame is smaller than ABI minimum
Product: Red Hat Enterprise Linux 5 Reporter: IBM Bug Proxy <bugproxy>
Component: gcc Assignee: Jakub Jelinek <jakub>
Status: CLOSED DUPLICATE QA Contact: qe-baseos-tools-bugs
Severity: high Docs Contact:
Priority: unspecified    
Version: 5.6   
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-03-28 12:29:27 UTC Type: ---

Description IBM Bug Proxy 2011-03-15 09:30:36 UTC
---Problem Description---
Function numa_set_strict in /usr/lib64/libnuma.so.1 (part of numactl-2.0.3-9.el6.ppc64) appears to
have bad code generated for its function prolog.  The stdu instruction that creates the stack frame
moves the stack pointer by only 48 bytes, which is not enough to conform to the ABI.  (The ABI
requires that room be left for parameters to be saved; the minimum frame size is 112 bytes.)  While
resolving a variable in thread local storage, the stack gets clobbered because the required
parameter save area is not present, and the app dies with a segfault.
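
For reference, the 112-byte minimum can be reproduced with a little arithmetic.  This is a sketch
that assumes the usual 64-bit PowerPC ELF (v1) frame layout: a six-doubleword frame header followed
by an eight-doubleword parameter save area.  The slot names are descriptive labels chosen for this
example, not identifiers from any header file.

```python
# Sketch of the minimum stack frame, assuming the 64-bit PowerPC ELF (v1) layout.
DOUBLEWORD = 8  # bytes

# Frame header slots, in order of increasing offset from r1 (labels are illustrative).
HEADER_SLOTS = [
    "back chain",
    "CR save",
    "LR save",
    "reserved (compiler)",
    "reserved (linker)",
    "TOC save",
]

# One doubleword per argument register r3..r10.
PARAM_SAVE_DOUBLEWORDS = 8

header_size = len(HEADER_SLOTS) * DOUBLEWORD
param_save_size = PARAM_SAVE_DOUBLEWORDS * DOUBLEWORD
min_frame_size = header_size + param_save_size

for index, name in enumerate(HEADER_SLOTS):
    print(f"{index * DOUBLEWORD:3d}(r1)  {name}")
print(f"{header_size:3d}(r1)  parameter save area ({param_save_size} bytes)")
print("minimum frame size:", min_frame_size)
```

A 48-byte frame therefore holds the header only, with no parameter save area at all.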

---uname output---
Linux eagledw1.austin.ibm.com 2.6.32.16 #1 SMP Fri Jul 9 13:42:06 EDT 2010 ppc64 ppc64 ppc64 GNU/Linux

Machine Type = ps702 (Power 7 blade) 

---Steps to Reproduce---

objdump -D /usr/lib64/libnuma.so.1 and look at the stdu instruction in the prolog of
numa_set_strict.  You should see this:

0000008042222dd0 <.numa_set_strict>:
  8042222dd0:   7c 08 02 a6     mflr    r0
  8042222dd4:   2f a3 00 00     cmpdi   cr7,r3,0
  8042222dd8:   38 62 80 20     addi    r3,r2,-32736
  8042222ddc:   f8 01 00 10     std     r0,16(r1)
  8042222de0:   f8 21 ff d1     stdu    r1,-48(r1)
  8042222de4:   41 9e 00 3c     beq-    cr7,8042222e20 <.numa_set_strict+0x50>
  8042222de8:   4b ff f9 75     bl      804222275c <._init+0x20c>
  8042222dec:   e8 41 00 28     ld      r2,40(r1)

That stack frame is too small.
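
The frame size can also be read straight from the instruction bytes.  The sketch below decodes the
word f8 21 ff d1 from the listing above as a DS-form stdu (primary opcode 62, low two bits 01) and
compares the displacement with the 112-byte minimum quoted in this report.  The helper name
decode_stdu is made up for this example.

```python
# Decode a 32-bit PowerPC instruction word as stdu (DS-form) and check the frame size.
MIN_FRAME_SIZE = 112  # ABI minimum, as stated in this report


def decode_stdu(word):
    """Return (rs, ra, displacement) if word encodes stdu, otherwise None."""
    primary_opcode = (word >> 26) & 0x3F
    extended_opcode = word & 0x3
    if primary_opcode != 62 or extended_opcode != 1:
        return None
    rs = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    # The DS field is the low 16 bits with the two opcode bits cleared, sign-extended.
    displacement = word & 0xFFFC
    if displacement & 0x8000:
        displacement -= 0x10000
    return rs, ra, displacement


# Instruction word at 8042222de0 in the objdump listing: stdu r1,-48(r1)
rs, ra, displacement = decode_stdu(0xF821FFD1)
frame_size = -displacement
print(f"stdu r{rs},{displacement}(r{ra})  frame size = {frame_size}")
print("undersized" if frame_size < MIN_FRAME_SIZE else "ok")
```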


---GCC - Power Component Data--- 
Userspace tool common name: libnuma 
The userspace tool has the following bit modes: 64; 32 bit not tested (yet) 
Userspace rpm: numactl-0.9.8-11.el5 

This bug report is basically the same as Case 00404759, except it is in RHEL 5.5 instead of RHEL 6.

Here is the overview.

The gcc compiler is generating some stack frames that are too small and do not comply with the
PowerPC ABI.  Libnuma (part of numactl) has a function called numa_set_strict which is generated
with one of these undersized stack frames.  When it calls out of line to resolve a piece of thread
local storage, the resolver code tries to save parameters in a save area that was not created,
corrupting the stack and leading to a segfault.  (The minimum stack frame size is 112 bytes, which
leaves room for a parameter save area.  In this case the stack frame is only 48 bytes.)
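
To illustrate why the missing save area corrupts the stack, here is a rough sketch of where the
stores would land.  It assumes the resolver saves all eight argument registers (r3..r10) into the
caller's parameter save area at offsets 48 through 104 from the caller's r1; the report does not
say exactly which registers are stored, so treat that as an assumption.  With a 48-byte frame,
those offsets fall in the frame of whoever called numa_set_strict.

```python
# Sketch: map stores into a nonexistent parameter save area onto the parent frame.
FRAME_HEADER_SIZE = 48  # six doublewords, per the 64-bit PowerPC ELF layout
ACTUAL_FRAME_SIZE = 48  # from "stdu r1,-48(r1)" in numa_set_strict

# Header slots of the parent frame, keyed by offset from the parent's stack pointer.
PARENT_HEADER = {
    0: "back chain",
    8: "CR save",
    16: "LR save",
    24: "reserved",
    32: "reserved",
    40: "TOC save",
}

clobbered = []
for i in range(8):  # assumed: r3..r10 all get stored
    register = f"r{3 + i}"
    offset = FRAME_HEADER_SIZE + 8 * i  # offset from numa_set_strict's r1
    if offset >= ACTUAL_FRAME_SIZE:
        parent_offset = offset - ACTUAL_FRAME_SIZE
        slot = PARENT_HEADER.get(parent_offset, "beyond parent header")
        clobbered.append((register, offset, slot))

for register, offset, slot in clobbered:
    print(f"{register} -> {offset}(r1) overwrites parent frame: {slot}")
```

Note that the parent's LR save slot is where the prolog's "std r0,16(r1)" put the return address
of numa_set_strict before the frame was pushed, so under these assumptions the return address is
among the values overwritten.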

"Steve's comment about me changing _dl_runtime_resolve to use the correct param save area applies to
a patch I sent internally for testing within IBM on May 2, mainly aimed at fixing automatic multiple
toc problems in static glibc.  It seems this patch made it into one of IBM's advanced toolchain
glibc releases, and so exposed people to the gcc stack frame bug.  Someone (Peter?) notified me of
the gcc problem later in May, and I sent a revised glibc patch internally on May 25.

That version is the one posted at
http://sourceware.org/ml/libc-alpha/2010-08/msg00006.html
and is now committed in upstream glibc.

The gcc bug was also fixed:
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01933.html

While I agree with Michael Brutman that not setting up a proper stack frame is a serious gcc bug, I
think you'll need that particular AT glibc to ever see a problem.  If I'm wrong about that, and
there is some other way to get segfaults due to the gcc bug, please correct me!  I know for sure
that no version of gcc currently available generates code that would make upstream glibc's _dl_fixup
function, called from _dl_runtime_resolve, trash the (incorrect) save location used by older
upstream glibc _dl_runtime_resolve."


The patch referenced above is supposed to be the fix.

Comment 1 Anton Arapov 2011-03-24 13:33:00 UTC
Why did you report this against numactl then? It's a GCC thing, no?

Comment 2 IBM Bug Proxy 2011-03-24 14:21:16 UTC
------- Comment From brutman.com 2011-03-24 10:16 EDT-------
The component in the bz header is listed as GCC - Power.  Are you seeing numactl somewhere else?

If so, it's wrong - it's clearly a GCC bug.  Numactl just was lucky that day.

Comment 3 Anton Arapov 2011-03-25 09:17:36 UTC
Moving to the right component.

Comment 4 Jakub Jelinek 2011-03-28 12:29:27 UTC

*** This bug has been marked as a duplicate of bug 624889 ***