Bug 1468807

Summary: glibc: Support broken applications which call __tls_get_addr with an unaligned stack (GCC bug workaround)
Product: Red Hat Enterprise Linux 7
Reporter: Florian Weimer <fweimer>
Component: glibc
Assignee: Florian Weimer <fweimer>
Status: CLOSED ERRATA
QA Contact: Sergey Kolosov <skolosov>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.4
CC: ashankar, codonell, fweimer, mnewsome, pfrankli, skolosov
Target Milestone: rc
Keywords: Patch
Target Release: ---
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version: glibc-2.17-210.el7
Doc Type: Enhancement
Doc Text:
Feature: In the slow path of the implementation of the __tls_get_addr function in glibc, the stack is automatically aligned as needed.
Reason: A bug in the GCC compiler for the x86-64 architecture could sometimes result in a call to the __tls_get_addr function with a misaligned stack, violating ABI requirements. This could result in crashes during TLS access, particularly if an interposed custom malloc is used.
Result: Binaries compiled with GCC which suffer from this ABI compliance issue work as expected, even with an interposed custom malloc.
Story Points: ---
Clone Of: 1440287
Environment:
Last Closed: 2018-04-10 14:00:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1440287
Bug Blocks: 1473718

Description Florian Weimer 2017-07-08 08:10:42 UTC
We should consider fixing this in Red Hat Enterprise Linux as well because it is visible with an interposed malloc (which could use vector instructions; the main malloc code in glibc does not when compiled with the system compiler).

+++ This bug was initially created as a clone of Bug #1440287 +++

Description of problem:
Every 64-bit game that uses Unity (the game engine) fails to start on Fedora 26 (it deadlocks on a black screen). 32-bit games work fine.

[…]

--- Additional comment from Nicholas Miell on 2017-06-17 20:42:52 CEST ---

I'm seeing similar Unity hangs, except the sequence of events is:

1. Thread attempts to lazy init a TLS variable.
2. There is a SIGSEGV in _int_malloc() on a movaps %xmm0,(%rsp) instruction because RSP is misaligned due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
3. The Mono SIGSEGV handler gets called.
4. The SIGSEGV handler attempts to lazy init a TLS variable.
5. malloc deadlocks.
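
Step 2 depends on an alignment rule: movaps faults unless its memory operand is 16-byte aligned, and the x86-64 psABI requires %rsp to be 16-byte aligned at the point of a call. The following is a minimal sketch of that check in C. The helper name is_aligned and the sample addresses are illustrative assumptions and do not come from glibc or from this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc function): returns true if addr is a
   multiple of alignment.  alignment must be a power of two. */
static bool is_aligned(uintptr_t addr, uintptr_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	/* For a power of two, the low bits hold the remainder. */
	return (addr & (alignment - 1)) == 0;
}
```

For example, is_aligned(0x7ffc1008, 16) is false: a stack pointer that is off by 8 bytes is enough to make movaps %xmm0,(%rsp) raise SIGSEGV.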

LD_PRELOADing the following simple stub fixes the affected Unity games:

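#define _GNU_SOURCE	/* required for RTLD_NEXT */
#include <dlfcn.h>

/* force_align_arg_pointer makes GCC realign the stack on entry, so the
   real __tls_get_addr is always called with an aligned stack. */
__attribute__((force_align_arg_pointer)) void *__tls_get_addr (void *ti)
{
	/* Look up the next definition in search order (glibc's). */
	void *(*tga)(void*) = dlsym(RTLD_NEXT, "__tls_get_addr");

	return tga(ti);
}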

Why doesn't glibc's __tls_get_addr() have this attribute to deal with this gcc bug?

--- Additional comment from Nicholas Miell on 2017-06-17 20:46:21 CEST ---

Actually, the stack traces I'm seeing are literally identical to comment #0, except the signal is definitely SIGSEGV (from a MOVAPS instruction), not SIGPWR (from Mono's garbage collector).

--- Additional comment from Nicholas Miell on 2017-06-17 21:05 CEST ---

unzip align-tls-get-addr.zip
cd align-tls-get-addr
meson BUILD
ninja -C BUILD
LD_PRELOAD=$(pwd)/BUILD/tls_get_addr.so /path/to/bug1440287/repro.x86_64

For Steam, set Launch Options to: LD_PRELOAD=/full/path/to/align-tls-get-addr/BUILD/tls_get_addr.so %command%

--- Additional comment from Nicholas Miell on 2017-06-18 06:44 CEST ---

A proposed patch for glibc.

--- Additional comment from Florian Weimer on 2017-06-18 11:14:28 CEST ---

Thanks for tracking this down.  We will likely use a different approach upstream, involving a compatibility symbol for future glibc versions, and use a sysdeps override for backports.

--- Additional comment from Florian Weimer on 2017-06-29 15:55:49 CEST ---

Upstream patch posted for review: https://sourceware.org/ml/libc-alpha/2017-06/msg00922.html

[…]

Comment 1 Florian Weimer 2017-07-08 08:11:55 UTC
Final upstream commit:

commit 031e519c95c069abe4e4c7c59e2b4b67efccdee5
Author: H.J. Lu <hjl.tools>
Date:   Thu Jul 6 04:43:06 2017 -0700

    x86-64: Align the stack in __tls_get_addr [BZ #21609]
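
The commit title refers to aligning the stack. As a rough illustration of the arithmetic only, and assuming the alignment is done in the usual way by clearing the low four bits of the stack pointer, the computation can be sketched in C. The helper name align_down and the sample addresses are hypothetical; this is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the glibc commit): rounds addr down to
   the nearest multiple of alignment, which must be a power of two.
   The stack grows downward on x86-64, so rounding the stack pointer
   down only reserves a few extra bytes. */
static uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return addr & ~(alignment - 1);
}
```

With a 16-byte alignment, align_down(0x7ffc1008, 16) yields 0x7ffc1000, and an address that is already aligned is returned unchanged.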

Comment 8 errata-xmlrpc 2018-04-10 14:00:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0805