Bug 2028296 - glibc: TLS performance degradation when loading two or more threads
Summary: glibc: TLS performance degradation when loading two or more threads
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: glibc
Version: 8.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: glibc team
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-01 22:58 UTC by Andrew Mike
Modified: 2023-07-09 12:57 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
TLS test case (5.14 KB, application/x-shellscript)
2021-12-01 22:58 UTC, Andrew Mike


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-104505 | 0 | None | None | None | 2021-12-01 23:01:27 UTC
Sourceware | 19924 | 0 | P2 | NEW | TLS performance degradation after dlopen | 2021-12-02 07:32:36 UTC

Description Andrew Mike 2021-12-01 22:58:24 UTC
Created attachment 1844409
TLS test case

Description of problem: When two separate threads sequentially load library functions that use TLS, one thread becomes very slow because of a generation counter mismatch, which makes glibc think it needs to reallocate TLS memory for that thread.

Version-Release number of selected component (if applicable):
2.28-164.el8.x86_64

How reproducible: 100%

Steps to Reproduce:
1. yum install gcc gcc-c++ make glibc-devel openssl-devel
2. Unzip shell archive with test case.
3. Run "make".
4. Execute program "tls-test".

Actual results:
One thread is slower than the other to access TLS variables:

none loaded
  main normal variable          : 554.770ms
  main thread-local variable    : 578.829ms
lib1 loaded
  main normal variable          : 536.941ms
  main thread-local variable    : 504.300ms
  lib1 variable                 : 2079.362ms
lib2 loaded
  main normal variable          : 451.575ms
  main thread-local variable    : 434.603ms
  lib1 variable                 : 5567.543ms
lib2 accessed
  main normal variable          : 424.644ms
  main thread-local variable    : 429.140ms
  lib1 variable                 : 1911.933ms

Expected results: lib1 variable access time is consistent.

Additional info:

- Issue was first noted in 2016 (https://patchwork.ozlabs.org/project/glibc/patch/1465309688.1188.19.camel@mailbox.tu-dresden.de/), and a patch was proposed.
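
A toy model of the generation check may help when reading the timings above. It is only a sketch, not glibc source: every name in it (toy_dlopen, toy_tls_get_addr, the two structs) is hypothetical. It also assumes that the slow path brings a thread up to date only as far as the module being accessed, which is how I read the upstream report (Sourceware 19924).

```c
/* Toy model only: hypothetical names, not glibc source. */
#include <stddef.h>

/* Bumped on every simulated dlopen of a module that has TLS. */
static size_t global_generation = 0;

struct toy_module {
    size_t generation;   /* generation at which the module was loaded */
    int tls_value;       /* stands in for the module's TLS block */
};

struct toy_thread {
    size_t dtv_generation;         /* generation this thread has caught up to */
    unsigned long slow_path_hits;  /* number of accesses that took the slow path */
};

static void toy_dlopen(struct toy_module *m)
{
    m->generation = ++global_generation;
    m->tls_value = 0;
}

static int *toy_tls_get_addr(struct toy_thread *t, struct toy_module *m)
{
    if (t->dtv_generation != global_generation) {
        /* Generation mismatch: take the slow path. */
        t->slow_path_hits++;
        /* Assumption: the thread only catches up to the generation of the
           module it is accessing, not to the global generation. */
        if (t->dtv_generation < m->generation)
            t->dtv_generation = m->generation;
    }
    return &m->tls_value;
}
```

In this model, once a second module has been loaded, every access to the first module's variable takes the slow path until the thread touches the second module. That would be consistent with the "lib2 loaded" and "lib2 accessed" rows in the output, though the model does not attempt to reproduce the actual timings.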

Comment 2 Florian Weimer 2021-12-02 07:32:36 UTC
Under bug 1991001, we are considering backporting changes to the DTV TLS management so that it aligns with upstream. This is a prerequisite for backporting an eventual upstream fix for this bug; no such fix exists at this time.

We backported the glibc.rtld.optional_static_tls upstream tunable as part of bug 1817513. With the tunable, it is possible to get initial-exec TLS in dlopen'ed shared objects working in more cases. Initial-exec TLS does not suffer from the performance degradation, so it might be an alternative approach. For instance, glibc malloc uses initial-exec TLS to access its thread-local data, so it is not affected by this.
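
As a minimal sketch of the initial-exec approach: GCC lets a thread-local variable request the initial-exec model with the tls_model attribute. The variable and function names below are hypothetical and are not taken from the attached test case.

```c
/* Hypothetical per-thread counter, for illustration only.
   tls_model("initial-exec") asks GCC to address the variable at a fixed
   offset from the thread pointer instead of calling __tls_get_addr. */
static __thread unsigned long request_count
    __attribute__((tls_model("initial-exec")));

/* Increment the calling thread's counter and return the new value. */
unsigned long bump_request_count(void)
{
    return ++request_count;
}
```

If code like this is built into a shared object that is later loaded with dlopen, the load can only succeed while enough static TLS space is left. The tunable is set through the environment, for example GLIBC_TUNABLES=glibc.rtld.optional_static_tls=<bytes> ./tls-test, where <bytes> is a placeholder and not a recommended value.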

