Bug 109790

Summary: broken pthread_exit() in NPTL static
Product: Red Hat Enterprise Linux 3 Reporter: Stephane Eranian <stephane.eranian>
Component: glibc Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0   
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2004-05-11 21:28:22 EDT Type: ---
Bug Blocks: 107563    

Description Stephane Eranian 2003-11-11 15:03:37 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030314

Description of problem:

There is a serious problem with the NPTL pthread library when linking
statically on the RHEL 3.0 release for IA64. Any call to pthread_exit()
ends up killing the program with SIGILL due to infinite recursion.

We verified that your latest version (2.3.2-95.3) and the glibc
CVS tree both have the problem. Strangely enough, the problem does not
show up when using the shared version of the library. It is pretty
obvious once you look at the assembly code:

pthread_exit.o:     file format elf64-ia64-little

Disassembly of section .text:

0000000000000000 <__pthread_exit>:
   0:   [MMB]       alloc r34=ar.pfs,4,3,0
   6:               adds r18=-1680,r13
   c:               nop.b 0x0
  10:   [MII]       adds r2=-688,r13
  16:               mov r33=b0;;
  1c:               adds r17=168,r18
  20:   [MMI]       st8 [r2]=r32;;
  26:               ld4 r14=[r17]
  2c:               nop.i 0x0;;
  30:   [MII]       nop.m 0x0
  36:               zxt4 r8=r14
  3c:               mov r2=r14
  40:   [MMI]       mf;;
  46:               mov.m ar.ccv=r8
  4c:               nop.i 0x0
  50:   [MMI]       or r3=16,r2;;
  56:               cmpxchg4.acq r14=[r17],r3,ar.ccv
  5c:               nop.i 0x0;;
  60:   [MIB]       nop.m 0x0
  66:               cmp4.eq p9,p8=r2,r14
  6c:         (p08) br.cond.dpnt.few 30 <__pthread_exit+0x30>
  70:   [MII]       nop.m 0x0
  76:               adds r9=160,r18;;
  7c:               nop.i 0x0
  80:   [MFB]       ld8 r35=[r9]
                        82: PCREL21B    __pthread_unwind
  86:               nop.f 0x0
  8c:               br.call.sptk.many b0=80 <__pthread_exit+0x80>;;
  90:   [MFB]       nop.m 0x0
  96:               break.f 0x0
  9c:               nop.b 0x0;;

Somehow __pthread_unwind() is not defined in the static version of the
library. But because __pthread_unwind() is defined as a WEAK symbol,
the linker does not warn you about the undefined symbol; instead it
uses 0 for the offset in the following br.call (which is an
IP-relative branch on IA64). This generates a br.call to the same
bundle and leads to infinite recursion. Because of the br.call, we end
up in a register backing store overflow, which generates a SIGILL. The
generated code looks as follows:

4000000000042160 <__pthread_exit>:
4000000000042160:       [MMB]       alloc r34=ar.pfs,4,3,0
4000000000042166:                   adds r18=-1680,r13
400000000004216c:                   nop.b 0x0
4000000000042170:       [MII]       adds r2=-688,r13
4000000000042176:                   mov r33=b0;;
400000000004217c:                   adds r17=168,r18
4000000000042180:       [MMI]       st8 [r2]=r32;;
4000000000042186:                   ld4 r14=[r17]
400000000004218c:                   nop.i 0x0;;
4000000000042190:       [MII]       nop.m 0x0
4000000000042196:                   zxt4 r8=r14
400000000004219c:                   mov r2=r14
40000000000421a0:       [MMI]       mf;;
40000000000421a6:                   mov.m ar.ccv=r8
40000000000421ac:                   nop.i 0x0
40000000000421b0:       [MMI]       or r3=16,r2;;
40000000000421b6:                   cmpxchg4.acq r14=[r17],r3,ar.ccv
40000000000421bc:                   nop.i 0x0;;
40000000000421c0:       [MIB]       nop.m 0x0
40000000000421c6:                   cmp4.eq p9,p8=r2,r14
40000000000421cc:             (p08) br.cond.dpnt.few 4000000000042190 <__pthread_exit+0x30>
40000000000421d0:       [MII]       nop.m 0x0
40000000000421d6:                   adds r9=160,r18;;
40000000000421dc:                   nop.i 0x0
/* infinite recursion */
40000000000421e0:       [MFB]       ld8 r35=[r9]
40000000000421e6:                   nop.f 0x0
40000000000421ec:                   br.call.sptk.many b0=40000000000421e0 <__pthread_exit+0x80>

40000000000421f0:       [MFB]       nop.m 0x0
40000000000421f6:                   break.f 0x0
40000000000421fc:                   nop.b 0x0;;

What I don't understand is why __pthread_unwind() is not defined in the
static version of the library.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Build any program that uses pthread_exit() and link it statically
against NPTL:
   cc -O2 -static test.c -I/usr/include/nptl -o test -L/usr/lib/nptl

Actual Results:  program dies with SIGILL

Expected Results:  normal termination of the program

Additional info:
Comment 1 Jakub Jelinek 2004-02-29 12:53:01 EST
Should be fixed in nptl-devel-2.3.2-95.7 and above (for U2 the current
build is nptl-devel-2.3.2-95.13).
Comment 2 John Flanagan 2004-05-11 21:28:22 EDT
An erratum has been issued which should help the problem described in
this bug report. This report is therefore being closed with a resolution
of ERRATA. For more information on the solution and/or where to find the
updated files, please follow the link below. You may reopen this bug
report if the solution does not work for you.