Bug 2028296

Summary: glibc: TLS performance degradation when loading two or more threads
Product: Red Hat Enterprise Linux 8
Reporter: Andrew Mike <amike>
Component: glibc
Assignee: glibc team <glibc-bugzilla>
Status: NEW
QA Contact: qe-baseos-tools-bugs
Severity: medium
Docs Contact:
Priority: medium
Version: 8.5
CC: alanm, ashankar, brclark, casantos, codonell, dj, fweimer, jwright, mkielian, mkolbas, pfrankli, sipoyare
Target Milestone: rc
Keywords: Bugfix, Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  TLS test case (flags: none)

Description Andrew Mike 2021-12-01 22:58:24 UTC
Created attachment 1844409
TLS test case

Description of problem: When two separate threads load TLS library functions sequentially, one thread becomes very slow because of a generation counter mismatch, which leads glibc to conclude that it needs to reallocate memory for that thread.
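The generation counter mismatch described above can be pictured with a toy model. This is a deliberately simplified sketch, not glibc's code: every name in it (`toy_thread`, `toy_global_generation`, `toy_load_library`, `toy_tls_access`) is hypothetical. It also assumes that the thread's view is only brought up to date when a resync actually happens, which is what the timings in this report suggest but which this report does not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not glibc internals. */
struct toy_thread {
    size_t seen_generation;   /* generation this thread last synced with */
    unsigned long slow_hits;  /* number of accesses that took the slow path */
};

/* Global counter, advanced whenever a TLS-using library is loaded. */
size_t toy_global_generation = 1;

/* Models loading another library that carries TLS. */
void toy_load_library(void)
{
    toy_global_generation++;
}

/* Models one TLS access. Returns 1 if the slow path was taken, else 0.
 * If resync is nonzero, the thread catches up with the global counter;
 * if it is zero, the mismatch persists and every later access is slow. */
int toy_tls_access(struct toy_thread *t, int resync)
{
    assert(t != NULL);
    if (t->seen_generation == toy_global_generation)
        return 0;                       /* fast path: counters agree */
    t->slow_hits++;                     /* slow path: counters differ */
    if (resync)
        t->seen_generation = toy_global_generation;
    return 1;
}
```

In this model, a thread that never resyncs pays the slow-path cost on every access after a library load, while a single resync restores the fast path.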

Version-Release number of selected component (if applicable):
2.28-164.el8.x86_64

How reproducible: 100%

Steps to Reproduce:
1. yum install gcc gcc-c++ make glibc-devel openssl-devel
2. Unzip shell archive with test case.
3. Run "make".
4. Execute program "tls-test".

Actual results:
One thread is slower than the other at accessing TLS variables:

none loaded
  main normal variable          : 554.770ms
  main thread-local variable    : 578.829ms
lib1 loaded
  main normal variable          : 536.941ms
  main thread-local variable    : 504.300ms
  lib1 variable                 : 2079.362ms
lib2 loaded
  main normal variable          : 451.575ms
  main thread-local variable    : 434.603ms
  lib1 variable                 : 5567.543ms
lib2 accessed
  main normal variable          : 424.644ms
  main thread-local variable    : 429.140ms
  lib1 variable                 : 1911.933ms

Expected results: lib1 variable access time is consistent.
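
The timings above come from the attached test case, which is not reproduced here. As a rough sketch of how such per-variable timings might be collected, the helpers below time a loop of increments on a thread-local counter. The names `tls_counter`, `elapsed_ms`, and `time_tls_increments` are hypothetical and are not taken from the attachment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical thread-local counter. volatile keeps the compiler from
 * optimizing the timed loop away. */
static __thread volatile uint64_t tls_counter;

/* Convert a pair of timestamps into elapsed milliseconds. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    assert(start != NULL && end != NULL);
    double sec = (double)(end->tv_sec - start->tv_sec);
    double nsec = (double)(end->tv_nsec - start->tv_nsec);
    return sec * 1000.0 + nsec / 1e6;
}

/* Time `iterations` increments of the thread-local counter.
 * Returns elapsed milliseconds, or -1.0 if the clock cannot be read. */
double time_tls_increments(uint64_t iterations)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (uint64_t i = 0; i < iterations; i++)
        tls_counter++;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;
    return elapsed_ms(&start, &end);
}
```

Note that a thread-local variable defined in the main executable is normally accessed without a call into the dynamic loader, so this sketch on its own would correspond to the "main thread-local variable" rows. To observe the "lib1 variable" behavior, the counter would have to live in a shared object that is loaded with dlopen.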

Additional info:

- Issue was first noted in 2016 (https://patchwork.ozlabs.org/project/glibc/patch/1465309688.1188.19.camel@mailbox.tu-dresden.de/), and a patch was proposed.

Comment 2 Florian Weimer 2021-12-02 07:32:36 UTC
Under bug 1991001, we are considering backporting changes to the DTV TLS management, so that it aligns with upstream. This is a prerequisite for backporting an eventual upstream fix for this bug here, which does not exist at this time.

We backported the glibc.rtld.optional_static_tls upstream tunable as part of bug 1817513. With the tunable, it is possible to get initial-exec TLS in dlopen'ed shared objects working in more cases. Initial-exec TLS does not suffer from the performance degradation, so it might be an alternative approach. For instance, glibc malloc uses initial-exec TLS to access its thread-local data, so it is not affected by this.
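
As an illustration of the initial-exec alternative mentioned above, a library could request that TLS model for individual variables through GCC's `tls_model` attribute. This is a minimal sketch under stated assumptions: `lib_counter` and `lib_counter_next` are hypothetical names, and whether a dlopen'ed object built this way loads successfully depends on how much static TLS space is available, which is what the glibc.rtld.optional_static_tls tunable influences.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical library-side counter that asks for the initial-exec TLS
 * model instead of the default dynamic model used for -fPIC code. */
static __thread int lib_counter __attribute__((tls_model("initial-exec")));

/* Returns the next per-thread value. With initial-exec TLS, the access is
 * an offset from the thread pointer rather than a call into the loader. */
int lib_counter_next(void)
{
    assert(lib_counter < INT_MAX);  /* guard against signed overflow */
    return ++lib_counter;
}

/*
 * Assumed usage, to be checked against the glibc manual for the installed
 * version: build this file into a shared object (for example with
 * "gcc -fPIC -shared") and, if dlopen fails for lack of static TLS space,
 * start the program with the tunable set in the environment:
 *
 *   GLIBC_TUNABLES=glibc.rtld.optional_static_tls=<size> ./program
 *
 * <size> is a placeholder; no specific value is recommended here.
 */
```

The same model can be requested for a whole translation unit with GCC's `-ftls-model=initial-exec` option. The trade-off is that such a library relies on static TLS space being available at dlopen time.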