Backport the following commits to improve the use of the static TLS surplus:

0c7b002fac12dcb2f53ba83ee56bb3b5d2439447 rtld: Add rtld.nns tunable for the number of supported namespaces
17796419b5fd694348cceb65c3f77601faae082c rtld: Account static TLS surplus for audit modules
ffb17e7ba3a5ba9632cee97330b325072fbe41dd rtld: Avoid using up static TLS surplus for optimizations [BZ #25051]

The important piece is the static TLS surplus tunable, so that users can tune how much surplus their workloads need.
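For reference, a hedged sketch of how a workload could raise the surplus once this lands, assuming the backport exposes the upstream glibc.rtld.optional_static_tls and glibc.rtld.nns tunables (the byte value and ./myapp are placeholders, not recommendations):

    # Reserve extra static TLS surplus and allow more dlmopen namespaces
    # before starting the affected application (values are illustrative).
    GLIBC_TUNABLES=glibc.rtld.optional_static_tls=4096:glibc.rtld.nns=8 ./myapp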
Tulio, would you be able to check a current 8.4 build (glibc-2.28-145.el8 or later) to see if it addresses your needs? I had forgotten about this bug, but the ld.so updates should bring in the relevant changes.
I've just been contacted about an issue that I suspect is related to this. I'm still collecting information in order to reproduce the issue. Assuming that I can reproduce it locally, I'll be glad to test it.
------- Comment From tulioqm.com 2021-02-02 10:17 EDT-------
Hi Florian, Red Hat,

Vijay (IBM PowerVC) managed to create an environment that reproduces the customer issue. Together we ran this test and confirmed that glibc-2.28-145.el8 and the related packages do fix the issue we were seeing. With that said, I'm marking this bug as verified. Thank you!
We see that in the same environment, this problem is sometimes reproducible and sometimes not. Can anyone list the steps to reproduce it?
(In reply to Divya from comment #9)
> We see that in the same environment, this problem is sometimes reproducible
> and sometimes not. Can anyone list the steps to reproduce it?

Is this a question for Red Hat or IBM? We have seen a complex reproducer involving the Fedora installer (anaconda). The backported patch also comes with a test case, but making that test architecture-independent is a bit tricky. You may have to fudge the TLS sizes in the test a bit in order to reproduce the issue without the patch; a rough sketch of the failure mode follows below.
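Roughly, the failure mode looks like this (a hypothetical sketch, not the backported glibc test case; the 64 KiB size and file names are arbitrary): a dlopen'ed object whose TLS uses the initial-exec model must fit into the static TLS block, and dlopen fails once that block is exhausted.

    # Hypothetical sketch: build a shared object with a large initial-exec
    # TLS block and dlopen it; the TLS cannot fit into the static TLS surplus.
    cat > bigtls.c <<'EOF'
    /* 64 KiB of initial-exec TLS, far more than the default surplus. */
    static __thread char buf[64 * 1024]
        __attribute__((tls_model("initial-exec")));
    char *get_buf(void) { return buf; }
    EOF
    gcc -shared -fPIC -o bigtls.so bigtls.c
    # Expected to fail with "cannot allocate memory in static TLS block":
    python3 -c 'import ctypes; ctypes.CDLL("./bigtls.so")'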
------- Comment From dikonoor.com 2021-02-03 06:55 EDT-------
PowerVC 2.0 GA happened last Dec, and we have many environments (internal and external) that are hitting this issue. There is a possibility that any customer trying to install PowerVC on RHEL 8.3 will hit this issue. If a fix is already available for RHEL 8.4 and it works on RHEL 8.3, we really need this fix to be made available to customers using RHEL 8.3 (as part of the next RHEL 8.3 rolling update). Otherwise, this bug will severely impact installation and adoption of PowerVC 2.0. Apart from our on-prem customers, IBM Cloud will also move to this release soon and will also be impacted. A customer who wants to install PowerVC today cannot wait for RHEL 8.4 to GA in May 2021. Please treat this as a high-priority bug and make the fix available in 8.3 at the earliest. Reopening the bug for the same reason.
(In reply to IBM Bug Proxy from comment #11)
> ------- Comment From dikonoor.com 2021-02-03 06:55 EDT-------
> PowerVC 2.0 GA happened last Dec, and we have many environments (internal
> and external) that are hitting this issue. There is a possibility that any
> customer trying to install PowerVC on RHEL 8.3 will hit this issue. If a fix
> is already available for RHEL 8.4 and it works on RHEL 8.3, we really need
> this fix to be made available to customers using RHEL 8.3 (as part of the
> next RHEL 8.3 rolling update). Otherwise, this bug will severely impact
> installation and adoption of PowerVC 2.0. Apart from our on-prem customers,
> IBM Cloud will also move to this release soon and will also be impacted. A
> customer who wants to install PowerVC today cannot wait for RHEL 8.4 to GA
> in May 2021. Please treat this as a high-priority bug and make the fix
> available in 8.3 at the earliest. Reopening the bug for the same reason.

Are the problems you see with PowerVC a regression compared to Red Hat Enterprise Linux 7 or 8.2? I do not think this bug is something that we can successfully backport into a z-stream release, sorry. It is only scheduled for inclusion in 8.4.0 because we deemed it too intertwined with other changes that we were backporting. There is also some lead time for 8.3.z updates, which effectively narrows the gap between the theoretical availability of an 8.3.z update and 8.4.0 GA to a few weeks. It is very likely that the issue can be addressed in PowerVC itself, with few (if any) code changes. We would have to look at what precisely triggers the dlopen failure and find the best way to mitigate it. We should probably move the details to an off-bug discussion.
------- Comment From tulioqm.com 2021-02-25 08:54 EDT-------
> Are the problems you see with PowerVC a regression compared to Red Hat
> Enterprise Linux 7 or 8.2?

The bug is being closed, but I think it's worth documenting the following for the future: we've just found out that libmysqlclient has a new patch [1] forcing the usage of static TLS in mysql-libs-8.0.21-1.module+el8.2.0+7855+47abd494. Package mysql-libs-8.0.17-3.module+el8.0.0+3898+e09bb8de is the last build without this patch. There is a suspicion that more libraries have started to use static TLS recently, but we haven't identified them yet.

[1] https://github.com/mysql/mysql-server/commit/735bd2a53834266c7256830c8d34672ea55fe17b
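For illustration, this is roughly what such a change amounts to (a hypothetical sketch, not the actual mysql-libs build change; example.c is a placeholder that would define and use __thread variables): compiling a shared library's TLS accesses with the initial-exec model marks the object with the STATIC_TLS dynamic flag, so every dlopen of it draws from the fixed static TLS block.

    # Hypothetical illustration of forcing static TLS at build time.
    gcc -shared -fPIC -ftls-model=initial-exec -o libexample.so example.c
    # The FLAGS entry gains STATIC_TLS when initial-exec TLS is used:
    readelf -d libexample.so | grep FLAGS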
After many experiments, we found that we can consistently reproduce this problem in RHEL 8.3 environments (we haven't tried 8.2) where the OS installation was performed (Server or Server with GUI) with additional packages, and our OpenStack-based product is installed on top of it. With all other things kept the same, when the OS installation is performed with no additional packages we never run into this. When additional packages are installed, we hit the problem when the OpenStack nova DB sync command is run:

[-] Traceback (most recent call last):
[-]   File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/migration.py", line 93, in db_version
[-]     return _db_version(repository, database, context)
[-]   File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/migration.py", line 100, in _db_version
[-]     return versioning_api.db_version(get_engine(database, context=context),
[-]   File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/migration.py", line 41, in get_engine
[-]     return db_session.get_engine(context=context)
[-]   File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/api.py", line 148, in get_engine
[-]     return ctxt_mgr.writer.get_engine()
[-]   File "/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 832, in get_engine
[-]     return self._factory.get_writer_engine()
[-]   File "/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 372, in get_writer_engine
[-]     self._start()
[-]   File "/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 510, in _start
[-]     engine_args, maker_args)
[-]   File "/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 534, in _setup_for_connection
[-]     sql_connection=sql_connection, **engine_kwargs)
[-]   File "/usr/lib/python3.6/site-packages/debtcollector/renames.py", line 43, in decorator
[-]     return wrapped(*args, **kwargs)
[-]   File "/usr/lib/python3.6/site-packages/oslo_db/sqlalchemy/engines.py", line 177, in create_engine
[-]     engine = sqlalchemy.create_engine(url, **engine_args)
[-]   File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 479, in create_engine
[-]     return strategy.create(*args, **kwargs)
[-]   File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
[-]     dbapi = dialect_cls.dbapi(**dbapi_args)
[-]   File "/usr/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 118, in dbapi
[-]     return __import__("MySQLdb")
[-]   File "/usr/lib64/python3.6/site-packages/MySQLdb/__init__.py", line 18, in <module>
[-]     from . import _mysql
[-] ImportError: /lib64/libstdc++.so.6: cannot allocate memory in static TLS block

We tried workarounds like export LD_PRELOAD=/usr/lib64/mysql/libmysqlclient.so.21, which did help with proceeding with the installation (i.e. nova DB sync works without errors), but we then start seeing TLS errors when the OpenStack nova service tries to start. So this workaround was not very helpful.
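One way to narrow this down (a hedged sketch; the paths are simply the libraries appearing in the traceback and workaround above) is to check which of the involved libraries carry the STATIC_TLS flag and how large their TLS segments are, since those are the ones competing for the static TLS surplus:

    # Report libraries that request static TLS and their TLS segment sizes.
    for lib in /lib64/libstdc++.so.6 /usr/lib64/mysql/libmysqlclient.so.21; do
        if readelf -d "$lib" 2>/dev/null | grep -q STATIC_TLS; then
            size=$(readelf -lW "$lib" | awk '/^ *TLS/ {print $6}')
            echo "$lib requests static TLS (TLS segment MemSiz $size)"
        fi
    done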
Divya, I would appreciate it if you could attach LD_DEBUG=all output from the failing process to this bug. Thanks. Is abrt installed on the system, by chance? I think it loads the Python rpm module, which brings in a lot of additional dependencies.
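For example, something along these lines should capture it (a sketch; the module name is taken from the traceback above, and the output path is arbitrary):

    # Write per-process ld.so debug logs to /tmp/ld-debug.<pid> while
    # reproducing the failing import.
    LD_DEBUG=all LD_DEBUG_OUTPUT=/tmp/ld-debug \
        python3 -c 'import MySQLdb'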
# rpm -qa | grep abrt
abrt-addon-coredump-helper-2.10.9-20.el8.ppc64le
abrt-addon-pstoreoops-2.10.9-20.el8.ppc64le
python3-abrt-2.10.9-20.el8.ppc64le
abrt-addon-ccpp-2.10.9-20.el8.ppc64le
abrt-cli-2.10.9-20.el8.ppc64le
abrt-libs-2.10.9-20.el8.ppc64le
abrt-dbus-2.10.9-20.el8.ppc64le
abrt-addon-vmcore-2.10.9-20.el8.ppc64le
abrt-addon-kerneloops-2.10.9-20.el8.ppc64le
abrt-addon-xorg-2.10.9-20.el8.ppc64le
abrt-tui-2.10.9-20.el8.ppc64le
abrt-2.10.9-20.el8.ppc64le
python3-abrt-addon-2.10.9-20.el8.ppc64le

We will try to make the LD_DEBUG output available.
Created attachment 1763336 [details] ld.so_output
Hi Florian, I have zipped the ld.so output and uploaded it as an attachment. Please check and let me know if you need any other information. Thanks.
Thanks, but the ZIP file appears to be empty:

Archive:  /tmp/bugzilla-1871396.zip
  Length  Method    Size  Cmpr       Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
       0  Stored        0   0% 03-11-2021 09:13 00000000  bugzilla-1871396/
--------          -------  ---                            -------
       0                0   0%                            1 file
Created attachment 1763343 [details] LD_DEBUG output
There was some issue with the earlier one; please refer to the new one. Thanks.
*sigh* I forgot how useless the LD_DEBUG output is for this purpose. So I don't think this sheds much light on what is going on, sorry.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: glibc security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1585