Bug 2010480 - glibc: LD_AUDIT performance issues when NOT using PLT auditing.
Summary: glibc: LD_AUDIT performance issues when NOT using PLT auditing.
Keywords:
Status: CLOSED DUPLICATE of bug 2047981
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: glibc
Version: 8.4
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: glibc team
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-04 18:32 UTC by Andrew Mike
Modified: 2023-07-18 14:30 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-07-25 09:18:58 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-98798 | 0 | None | None | None | 2021-10-04 18:33:29 UTC
Sourceware | 15533 | 0 | P2 | NEW | LD_AUDIT introduces an avoidable performance degradation | 2021-10-04 18:32:36 UTC

Description Andrew Mike 2021-10-04 18:32:36 UTC
Description of problem:

glibc imposes a large overhead whenever an auditor is loaded, because it audits every PLT call whether or not the auditor requests PLT auditing. For applications made up of small routines, this overhead is unacceptable for performance tools.

To work around the performance hit, some auditors rewrite the GOT themselves. However, the details of filling in GOT entries are platform-dependent and somewhat tricky for load modules that use secure PLT. Moreover, having a tool rewrite the GOT is strongly discouraged, and on future processors hardware mechanisms will prevent that technique.

For these reasons, glibc should take responsibility for filling in the GOT if there is no PLT auditor present. 
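
For illustration, here is a minimal sketch of an auditor that does not request PLT auditing, written against the rtld-audit interface declared in <link.h>. It is not taken from the auditor-tests repository, and the file name audit-noplt.c is hypothetical. Returning 0 from la_objopen() (neither LA_FLG_BINDTO nor LA_FLG_BINDFROM) tells the dynamic linker that no symbol bindings should be audited for that object, and the auditor defines no la_symbind*() or la_pltenter() hooks.

```c
/* audit-noplt.c: hypothetical auditor that never requests PLT auditing. */
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>

/* Handshake with the dynamic linker: report the audit interface version
   this auditor was compiled against. */
unsigned int la_version(unsigned int version)
{
    (void)version;
    return LAV_CURRENT;
}

/* Called once for each loaded object. Returning 0 (neither LA_FLG_BINDTO
   nor LA_FLG_BINDFROM) asks for no symbol-binding auditing on this object. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void)map;
    (void)lmid;
    (void)cookie;
    return 0;
}
```

Such a library would typically be built with something like "gcc -shared -fPIC -o audit-noplt.so audit-noplt.c" and loaded by setting LD_AUDIT=./audit-noplt.so when running the target program. With an auditor of this shape, PLT calls should not need to go through the audit path at all.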

How reproducible: 100% (by customer)

Steps to Reproduce:
1. git clone https://github.com/hpctoolkit/auditor-tests 
2. cd auditor-tests/tier1/slow-audit-plt
3. make

Actual results:
2^27 PLT calls to an empty routine on IBM’s POWER9 @ 2.8GHz take ~3.54 seconds.

Expected results:
2^27 PLT calls to an empty routine on IBM’s POWER9 @ 2.8GHz should take ~0.32 seconds.
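
To put the two timings in per-call terms, the following helpers are a small sketch of the arithmetic. The function names are made up for illustration, and the cycle conversion assumes the core actually runs at the quoted 2.8 GHz.

```c
#include <stdint.h>

/* Average cost of one call in nanoseconds, given the total wall-clock
   time in seconds and the number of calls made. */
double ns_per_call(double total_seconds, uint64_t calls)
{
    return total_seconds * 1e9 / (double)calls;
}

/* Convert a duration in nanoseconds to clock cycles at the given
   frequency in GHz (1 GHz = 1 cycle per nanosecond). */
double ns_to_cycles(double ns, double ghz)
{
    return ns * ghz;
}
```

With calls = 2^27 = 134217728, the reported 3.54 seconds works out to roughly 26.4 ns (about 74 cycles) per call, against roughly 2.4 ns (about 7 cycles) for the expected 0.32 seconds, a slowdown of about 11x.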

Additional info: 

Problem was first identified upstream: 
https://sourceware.org/bugzilla/show_bug.cgi?id=15533

Comment 7 Florian Weimer 2022-07-25 09:18:58 UTC
Delivered via bug 2047981.

*** This bug has been marked as a duplicate of bug 2047981 ***

