.Programs using `papi` no longer stop when shutting down
Previously, `papi` initialized threads before `papi` initialized some components. Because of this, entries for certain components describing the number of elements in arrays were not set to correct values and zero-sized memory allocations were attempted. As a consequence, later accesses and frees of those zero-sized memory allocations caused the programs to stop.
The bug has been fixed and programs using `papi` no longer stop when shutting down.
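The failure mode described above can be illustrated with a small C sketch. It is not PAPI code: `component_info`, `thread_state`, and both functions are hypothetical stand-ins, and the guard shown here is only one defensive option, not the fix that was applied to `papi` (the actual fix reordered initialization). The sketch assumes per-thread storage is sized from a count that stays 0 until the component has been initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a component descriptor; not a PAPI type. */
struct component_info {
    int num_counters;   /* assumed to stay 0 until the component is initialized */
};

/* Hypothetical per-thread state whose array is sized from the descriptor. */
struct thread_state {
    long long *values;
    size_t     count;
};

/*
 * Returns 0 on success, -1 if the component has no element count yet or
 * the allocation fails. Refusing a zero count avoids creating a zero-sized
 * buffer that later code would index as if it had elements.
 */
int thread_state_init(struct thread_state *ts, const struct component_info *ci)
{
    assert(ts != NULL && ci != NULL);
    ts->values = NULL;
    ts->count = 0;
    if (ci->num_counters <= 0)
        return -1;
    ts->values = calloc((size_t)ci->num_counters, sizeof *ts->values);
    if (ts->values == NULL)
        return -1;
    ts->count = (size_t)ci->num_counters;
    return 0;
}

/* Releases the array and resets the state so a second call is harmless. */
void thread_state_free(struct thread_state *ts)
{
    assert(ts != NULL);
    free(ts->values);
    ts->values = NULL;
    ts->count = 0;
}
```

Calling `thread_state_init()` with a descriptor whose `num_counters` is still 0 returns -1 instead of handing back a buffer with no usable elements, which is the situation that arises when thread setup runs before component setup.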
Description of problem:
During initialization, some threads are run in the wrong order and use uninitialized data. This can cause illegal memory accesses, which in turn cause aborts. The aborts can be observed on an aarch64 machine. However, running the test under valgrind shows that x86_64 also has those problematic accesses.
Version-Release number of selected component (if applicable):
papi-6.0.0-14.el9.src.rpm
How reproducible:
Every time.
Steps to Reproduce:
1. dnf install papi-testsuite
2. cd /usr/share/papi
3. ./components/stealtime/tests/stealtime_basic
Actual results:
# ./components/stealtime/tests/stealtime_basic
Trying all stealtime events
Found stealtime component 13 - stealtime
stealtime:::TOTAL value: 0
stealtime:::CPU1 value: 0
stealtime:::CPU2 value: 0
stealtime:::CPU3 value: 0
stealtime:::CPU4 value: 0
Note: for this test the values are expected to all be 0
unless run inside a VM on a busy system.
PASSED
free(): invalid next size (fast)
Aborted (core dumped)
Expected results:
No "invalid next size" or "Aborted (core dumped)" after the "PASSED" line:
./components/stealtime/tests/stealtime_basic
Trying all stealtime events
Found stealtime component 13 - stealtime
stealtime:::TOTAL value: 0
stealtime:::CPU1 value: 0
stealtime:::CPU2 value: 0
stealtime:::CPU3 value: 0
stealtime:::CPU4 value: 0
Note: for this test the values are expected to all be 0
unless run inside a VM on a busy system.
PASSED
Additional info:
This can also be observed with the ./ctests/all_native_events test.
The upstream papi git commit 3625bdbad9fd57d1cdb1e5615854545167d4adcb below addresses the problem:
Author: Anthony Castaldo <TonyCastaldo.edu> 2020-08-26 17:18:29
Committer: Anthony Castaldo <TonyCastaldo.edu> 2020-08-26 17:18:29
Parent: 82fdd098d2c1c6aad20b139dcb7a3a6a508b5580 (Merged in master (pull request #126))
Child: 9266f6ebde64883f886793d7a8ce1d475d3589ea (Merged in master (pull request #131))
Branches: master, remotes/origin/master
This modifies PAPI_library_init() to initialize components in two classes,
separated by the initialization of the papi thread structure. The first class
is those that need no thread structure, currently everything but perf_event and
perf_event_uncore. Following the init of the threading structure, we init the
second class (perf_event and perf_event_uncore) that DOES need the thread
structure to successfully init_component(). This required a change to
_papi_hwi_init_global(), to add an argument to distinguish which class it
should initialize.
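The ordering described in the commit message can be sketched roughly as follows. This is a simplified model, not the upstream patch: the enum, the component table, and the function names are hypothetical, and only the rule that perf_event and perf_event_uncore are initialized after the thread structure is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selector for which class of components a pass initializes. */
enum init_class { INIT_NO_THREAD = 0, INIT_NEEDS_THREAD = 1 };

/* Hypothetical, heavily reduced component record. */
struct component {
    const char *name;
    bool        initialized;
};

static bool thread_structure_ready = false;

/* Per the commit message, only these two components need the thread structure. */
static bool needs_thread_structure(const char *name)
{
    return strcmp(name, "perf_event") == 0 ||
           strcmp(name, "perf_event_uncore") == 0;
}

/* One pass over the table; initializes only components of the requested
 * class and returns how many were initialized in this pass. */
int init_components(struct component *tab, size_t n, enum init_class cls)
{
    int done = 0;
    for (size_t i = 0; i < n; i++) {
        bool needs = needs_thread_structure(tab[i].name);
        if (needs != (cls == INIT_NEEDS_THREAD))
            continue;   /* belongs to the other class, skip in this pass */
        /* A second-class component must never run before the thread structure exists. */
        assert(!needs || thread_structure_ready);
        tab[i].initialized = true;
        done++;
    }
    return done;
}

void init_thread_structure(void)
{
    thread_structure_ready = true;
}

/* Two-phase initialization: class one, thread structure, then class two. */
int library_init(struct component *tab, size_t n)
{
    int total = init_components(tab, n, INIT_NO_THREAD);
    init_thread_structure();
    total += init_components(tab, n, INIT_NEEDS_THREAD);
    return total;
}
```

Passing the class as an argument keeps a single loop over the component table, which mirrors the commit's description of adding an argument to the global init routine rather than maintaining two separate tables.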
(In reply to William Cohen from comment #1)
> Built papi-6.0.0-15.el9 with the upstream patch to address this issue.
Hi Will,
Thank you very much.
I would like to test papi-6.0.0-15.el9 on FX700.
Could you please tell me the URL to download it?
Best,
Fuli
Hi,
I have tested papi-6.0.0-15.el9 on FX700.
Both ctests/all_native_events and components/stealtime/tests/stealtime_basic reported "PASSED" without "invalid next size" or "Aborted (core dumped)".
Best,
Fuli
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (papi bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:6459