Bug 1467518

Summary: glibc: Invalid IFUNC resolver from libgcc calls getauxval, leading to ppc64le relocation crash
Product: [Fedora] Fedora
Reporter: Florian Weimer <fweimer>
Component: glibc
Assignee: Florian Weimer <fweimer>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: arjun.is, codonell, dj, fweimer, law, mat.booth, mfabian, pfrankli, siddhesh, tulioqm
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glibc-2.25.90-26.fc27
Clone Of:
: 1467526 (view as bug list) Environment:
Last Closed: 2017-08-04 20:56:11 UTC
Type: Bug
Bug Depends On: 1467526
Bug Blocks: 1466116, 1469022, 1470013

Description Florian Weimer 2017-07-04 06:22:37 UTC
Upstream glibc master started linking in have_ieee_hw_p from libgcc on ppc64le.  This leads to a crash during relocation because getauxval uses data which has not been initialized yet at that point.  The crash is at the last line of the disassembly below.

00000000001c3380 <have_ieee_hw_p>:
  1c3380:       08 00 4c 3c     addis   r2,r12,8
  1c3384:       80 3d 42 38     addi    r2,r2,15744
  1c3388:       f8 ff e1 fb     std     r31,-8(r1)
  1c338c:       a0 8c e2 eb     ld      r31,-29536(r2)
  1c3390:       d1 ff 21 f8     stdu    r1,-48(r1)
  1c3394:       02 00 3f e9     lwa     r9,0(r31)
  1c3398:       00 00 89 2f     cmpwi   cr7,r9,0
  1c339c:       14 00 9c 41     blt     cr7,1c33b0 <have_ieee_hw_p+0x30>
  1c33a0:       30 00 21 38     addi    r1,r1,48
  1c33a4:       78 4b 23 7d     mr      r3,r9
  1c33a8:       f8 ff e1 eb     ld      r31,-8(r1)
  1c33ac:       20 00 80 4e     blr
  1c33b0:       a6 02 08 7c     mflr    r0
  1c33b4:       0f 00 60 38     li      r3,15
  1c33b8:       40 00 01 f8     std     r0,64(r1)
  1c33bc:       15 fc e5 4b     bl      22fd0 <00000036.plt_call.__getauxval>
  1c33c0:       18 00 41 e8     ld      r2,24(r1)

So far, this happens only with --enable-bind-now builds.  I'll disable that on ppc64le as an immediate workaround, but we'll need an upstream fix for this (in glibc or GCC).
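
For context, the call at 1c33bc loads 15 into r3 before branching to getauxval.  Assuming the usual Linux <elf.h> numbering, 15 is AT_PLATFORM.  A minimal sketch of the same query made from ordinary code, after the dynamic loader has finished its setup, might look like this (platform_name is a hypothetical helper, not something from libgcc or glibc):

```c
#include <stddef.h>
#include <sys/auxv.h>

/* Hypothetical helper: return the AT_PLATFORM string from the auxiliary
   vector, or "unknown" if the kernel did not supply one.  This is safe in
   normal code because glibc has already set up its auxv data by then;
   it is NOT safe inside an IFUNC resolver, which is the problem here. */
const char *platform_name(void)
{
    unsigned long value = getauxval(AT_PLATFORM);

    /* getauxval returns 0 when the entry is not present. */
    if (value == 0)
        return "unknown";
    return (const char *)value;
}
```

Calling platform_name() from main would typically give a string such as the CPU platform name; the exact value depends on the kernel and hardware.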

Comment 1 Florian Weimer 2017-07-07 10:06:12 UTC
*** Bug 1467833 has been marked as a duplicate of this bug. ***

Comment 2 Florian Weimer 2017-07-07 10:07:11 UTC
The --enable-bind-now workaround turned out to be insufficient.  We need the upstream fix.

Comment 3 Carlos O'Donell 2017-07-07 20:28:28 UTC
Status so far: the fix will be in GCC, removing the getauxval call from libgcc.  That call violates the IFUNC resolver rules, which say a resolver must not call functions from other translation units.

We are working with the GCC team to sort the issue out before the mass rebuild.
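
To illustrate the rule, here is a minimal sketch of a resolver that stays within its own translation unit.  It assumes GCC (or Clang) on an ELF target with IFUNC support; all names (scale, resolve_scale, use_shift) are hypothetical and not taken from libgcc or glibc, and the file-local constant merely stands in for a real hardware check:

```c
/* Sketch only: every name here is made up for illustration. */

/* Two interchangeable implementations, both local to this file. */
static int scale_generic(int x) { return x * 2; }
static int scale_shift(int x) { return (int)((unsigned int)x << 1); }

/* File-local selector; stands in for a real hardware capability check. */
static const int use_shift = 1;

/* Resolver: it may run before the process is fully relocated, so it only
   touches code and data defined in this file and makes no calls into
   other translation units (no getauxval, no libc functions). */
static int (*resolve_scale(void))(int)
{
    return use_shift ? scale_shift : scale_generic;
}

/* The public symbol is bound to whatever the resolver returns. */
int scale(int x) __attribute__((ifunc("resolve_scale")));
```

A caller simply uses scale(21) like any other function; which implementation runs was decided once, by the resolver.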

Comment 4 Carlos O'Donell 2017-07-13 14:18:10 UTC
There is a temporary fix in place for Rawhide.

Upstream fix v2 posted by IBM and reviewed last night:
https://sourceware.org/ml/libc-alpha/2017-07/msg00526.html

Comment 5 Florian Weimer 2017-07-17 21:34:53 UTC
Upstream fix incorporated into glibc-2.25.90-26.fc27.

Comment 6 Carlos O'Donell 2017-07-17 23:27:30 UTC
I've just reviewed v4 from IBM today:
https://sourceware.org/ml/libc-alpha/2017-07/msg00599.html

Looks done.  We'll pull that in via a sync and then close this bug out.