1790475 – glibc: BIND_NOW is incompatible with copy relocations on ppc64

Summary: glibc: BIND_NOW is incompatible with copy relocations on ppc64

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	glibc
Sub Component:
Version:	7.8
Hardware:	ppc64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	glibc team
QA Contact:	qe-baseos-tools-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1793853

Reported:	2020-01-13 12:58 UTC by Kamil Dudka
Modified:	2020-10-05 08:28 UTC
CC List:	14 users
Fixed In Version:	glibc-2.17-307.el7.1
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Clones:	1793853
Environment:
Last Closed:	2020-03-31 19:08:32 UTC
Target Upstream Version:
Embargoed:

Attachments
rhbz1790475-reproducer.cc (246 bytes, text/plain) 2020-01-13 13:00 UTC, Kamil Dudka
Adjust-security-hardening-changes-for-64-bit-POWER-BE.patch (1.76 KB, patch) 2020-01-14 03:32 UTC, Carlos O'Donell

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1791321	unspecified	CLOSED	gcc: weakref attribute introduces the need for copy or text relocations on ppc64	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHBA-2020:0989	None	None	None	2020-03-31 19:08:51 UTC
Sourceware	25384	P2	RESOLVED	Copy relocations and BIND_NOW on POWER ELFv1 results in crashes	2020-04-28 19:22:28 UTC

Internal Links: 1791321

Description Kamil Dudka 2020-01-13 12:58:20 UTC

Description of problem:
As originally reported by Daniel Rusek, C++ applications using libcurl crash at startup on ppc64.  The underlying cause is that PR_Init() crashes on ppc64 when it is invoked from a program that uses the C++ run-time.  The attached reproducer crashes only if its source code also uses std::string, even though the std::string instantiation is not reachable from main().


Version-Release number of selected component (if applicable):
nspr-4.21.0-1.el7.ppc64


Steps to Reproduce:
1. run the attached reproducer


Actual results:
The program terminates abnormally on SIGSEGV.


Expected results:
The program terminates successfully.


Additional info:
See the output of valgrind.

Comment 3 Kamil Dudka 2020-01-13 13:00:47 UTC

Created attachment 1651859
rhbz1790475-reproducer.cc

Comment 4 Kamil Dudka 2020-01-13 13:02:06 UTC

# curl -JO 'https://bugzilla.redhat.com/attachment.cgi?id=1651859'
# bash -x rhbz1790475-reproducer.cc
++ pkg-config nspr --cflags --libs
+ g++ rhbz1790475-reproducer.cc -I/usr/include/nspr4 -lplds4 -lplc4 -lnspr4 -lpthread -ldl -O0 -ggdb
+ ./a.out
rhbz1790475-reproducer.cc: line 3: 22238 Segmentation fault      ./a.out
+ exit 139

# valgrind ./a.out 
==22239== Memcheck, a memory error detector
==22239== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==22239== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==22239== Command: ./a.out
==22239== 
==22239== Jump to the invalid address stated on the next line
==22239==    at 0x0: ???
==22239==    by 0x41B9E47: __pthread_once_slow (pthread_once.c:117)
==22239==    by 0x42021BB: _dlerror_run (dlerror.c:129)
==22239==    by 0x42018AF: dlopen@@GLIBC_2.3 (dlopen.c:87)
==22239==    by 0x415289B: pr_FindSymbolInProg (prmem.c:98)
==22239==    by 0x415289B: _PR_InitZones (prmem.c:154)
==22239==    by 0x41595A7: _PR_InitStuff (prinit.c:144)
==22239==    by 0x10000987: main (rhbz1790475-reproducer.cc:17)
==22239==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==22239== 
==22239== 
==22239== Process terminating with default action of signal 11 (SIGSEGV)
==22239==  Bad permissions for mapped region at address 0x0
==22239==    at 0x0: ???
==22239==    by 0x41B9E47: __pthread_once_slow (pthread_once.c:117)
==22239==    by 0x42021BB: _dlerror_run (dlerror.c:129)
==22239==    by 0x42018AF: dlopen@@GLIBC_2.3 (dlopen.c:87)
==22239==    by 0x415289B: pr_FindSymbolInProg (prmem.c:98)
==22239==    by 0x415289B: _PR_InitZones (prmem.c:154)
==22239==    by 0x41595A7: _PR_InitStuff (prinit.c:144)
==22239==    by 0x10000987: main (rhbz1790475-reproducer.cc:17)
==22239== 
==22239== HEAP SUMMARY:
==22239==     in use at exit: 0 bytes in 0 blocks
==22239==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==22239== 
==22239== All heap blocks were freed -- no leaks are possible
==22239== 
==22239== For lists of detected and suppressed errors, rerun with: -s
==22239== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

Comment 5 Kamil Dudka 2020-01-13 13:47:10 UTC

This seems to be caused by:

    glibc-2.17-307.el7.ppc64

After downgrade to:

    glibc-2.17-292.el7.ppc64

... the reproducer no longer crashes.  Even the binary compiled against -307 works fine if -292 is used at run time.  I am switching the component to glibc.

Comment 6 Kamil Dudka 2020-01-13 14:17:55 UTC

The reproducer does not crash with -293 but it crashes with -294.  It is likely caused by the fix for bug #1406732.

Comment 8 Carlos O'Donell 2020-01-14 03:32:04 UTC

Created attachment 1652045
Adjust-security-hardening-changes-for-64-bit-POWER-BE.patch

Comment 11 Alan Modra 2020-01-14 08:09:26 UTC

You shouldn't be getting .dynbss copies of function symbols.  The R_PPC64_COPY relocation on __pthread_key_create is the reason for this crash.  With BIND_NOW, PLT entries in shared libraries will be initialized from the .dynbss copy *before* the copy itself is initialized, i.e. you'll get a PLT entry of zeros.

Comment 12 Kamil Dudka 2020-01-14 08:31:02 UTC

Thank you for taking quick action on this!

Comment 14 Florian Weimer 2020-01-14 12:58:51 UTC

Reproducer without NSPR dependency:

cat >shared.c <<EOF
#include <dlfcn.h>
#include <pthread.h>

static void *force_linking = pthread_create;

void
call_dlopen (void)
{
  dlopen ("", 0);
}
EOF
cat >main.cc <<EOF
#include <string>

void
use_string ()
{
  std::string unused;
}

extern "C" void call_dlopen ();

int
main ()
{
  call_dlopen ();
  return 0;
}
EOF

gcc -fPIC -shared -o libshared.so shared.c -ldl -lpthread
g++ -Wl,-rpath,. -L. -lshared -o main main.cc
./main

Expected outcome: No output.
Actual outcome: Segmentation fault.

Comment 15 Florian Weimer 2020-01-14 13:52:59 UTC

I've filed a binutils bug with a reproducer which does not depend on how glibc was built:

  https://sourceware.org/bugzilla/show_bug.cgi?id=25384

(Note: This does not mean that we will fix this regression with a binutils update.)

Comment 18 Florian Weimer 2020-01-15 13:56:44 UTC

The upstream binutils fix introduces a text relocation. This is required because GCC mistakenly puts the reference to __pthread_key_create into .rodata, and not .data.relro as it should.

I'm trying to verify whether this has already been fixed in GCC.

Comment 19 Florian Weimer 2020-01-15 14:14:25 UTC

I filed bug 1791321 for the underlying GCC bug (which I think is the ultimate problem here, but fixing GCC is definitely not the way to address the regression).

Comment 20 Carlos O'Donell 2020-01-15 15:48:55 UTC

The plan right now for glibc is to back out some of the 64-bit POWER big-endian (BE) hardening to remove the problematic mixture of copy relocations and BIND_NOW. Overall we are still moving in the right direction of enabling more hardening as time goes on; we just can't enable _all_ the hardening we wanted. This is not a problem for RHEL 8.

Comment 23 Sergey Kolosov 2020-01-24 08:48:53 UTC

Verified with the reproducer from https://bugzilla.redhat.com/show_bug.cgi?id=1790475#c14

Comment 26 errata-xmlrpc 2020-03-31 19:08:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0989
