Bug 2213907
| Summary: | glibc: Memcpy throughput lower on RHEL 9.3 compared to RHEL 8.3/RHEL 7.5 - same Skylake hardware | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Carlos O'Donell <codonell> |
| Component: | glibc | Assignee: | DJ Delorie <dj> |
| Status: | MODIFIED --- | QA Contact: | Sergey Kolosov <skolosov> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 9.3 | CC: | ashankar, barend.havenga, bgray, codonell, dj, fweimer, jmario, pfrankli, qe-baseos-tools-bugs, sipoyare, skolosov |
| Target Milestone: | rc | Keywords: | Regression, Triaged |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | glibc-2.34-82.el9 | Doc Type: | Enhancement |
| Doc Text: |
Feature: Improved string and memory routine performance on Intel Skylake-based hardware.
Reason: The default amount of cache used by string and memory routines balances single-process performance against whole-system performance. On Intel Skylake-based systems, this tuning could result in lower than expected performance. The default was reviewed against industry-standard benchmarks.
Result: The default amount of cache to use has been increased to improve performance.
|
| Story Points: | --- |
| Clone Of: | 2180462 | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2180462 | ||
| Bug Blocks: | 2166710 | ||
|
Comment 1
Carlos O'Donell
2023-06-09 20:43:13 UTC
In particular, this should include a backport of this commit to benefit TDX environments as they exist today:
commit ed2f9dc9420c4c61436328778a70459d0a35556a
Author: Noah Goldstein <goldstein.w.n>
Date: Mon May 8 22:10:20 2023 -0500

    x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]

    If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
    it almost certainly means we failed to read the systems cache info.

    In this case, rather than defaulting the minimum correct value, we
    should default to a value that gets at least reasonable
    performance. 64MB is chosen conservatively to be at the very high
    end. This should never cause non-temporal stores when, if we had read
    cache info, we wouldn't have otherwise.

    Reviewed-by: Florian Weimer <fweimer>
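
For readers following the backport, the behaviour described in the commit message can be sketched roughly as below. This is an illustration only, not the glibc source: the helper name `pick_nt_threshold`, the macro `FALLBACK_NT_THRESHOLD`, and the parameter names are hypothetical, and the sketch assumes the caller's maximum is at least 64MB. Only the 64MB figure and the "below minimum means cache info was probably not read" rule come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the 64MB fallback named in the commit message. */
#define FALLBACK_NT_THRESHOLD ((size_t) 64 * 1024 * 1024)

/* Illustrative helper, not a glibc function.
   computed: threshold derived from the detected cache sizes.
   minimum, maximum: bounds the caller considers valid (assumed inputs).  */
static size_t
pick_nt_threshold (size_t computed, size_t minimum, size_t maximum)
{
  assert (minimum <= maximum);

  /* A value below the minimum most likely means cache info could not be
     read, so use the large fallback instead of clamping up to the
     minimum.  */
  if (computed < minimum)
    return FALLBACK_NT_THRESHOLD;

  /* Values above the maximum are clamped down as usual.  */
  if (computed > maximum)
    return maximum;

  return computed;
}
```

With no cache info (a computed value of 0), the helper returns 64MB rather than the small minimum, so non-temporal stores are only chosen for very large copies.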
|