Bug 2215582 - papi initialization threads run in the wrong order
Summary: papi initialization threads run in the wrong order
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: papi
Version: 9.3
Hardware: aarch64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: William Cohen
QA Contact: Lenka Špačková
Docs Contact: Jacob Taylor Valdez
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-16 15:33 UTC by William Cohen
Modified: 2023-12-14 09:47 UTC (History)

Fixed In Version: papi-6.0.0-15.el9
Doc Type: Bug Fix
Doc Text:
.Programs using `papi` no longer stop when shutting down
Previously, `papi` initialized threads before `papi` initialized some components. Because of this, entries for certain components describing the number of elements in arrays were not set to correct values and zero-sized memory allocations were attempted. As a consequence, later accesses and frees of those zero-sized memory allocations caused the programs to stop. The bug has been fixed and programs using `papi` no longer stop when shutting down.
Clone Of:
Environment:
Last Closed: 2023-11-07 08:33:39 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-160101 0 None None None 2023-06-16 15:34:44 UTC
Red Hat Product Errata RHBA-2023:6459 0 None None None 2023-11-07 08:33:45 UTC

Description William Cohen 2023-06-16 15:33:50 UTC
Description of problem:

During initialization, some threads are run in the wrong order and use uninitialized data.  This can cause illegal memory accesses, which in turn cause aborts.  The aborts can be observed on aarch64 machines.  However, running the test under valgrind shows that x86_64 also has those problematic accesses.
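
The Doc Text for this bug attributes the aborts to zero-sized allocations made while element counts were still unset. With glibc, writing into such an allocation can corrupt the bookkeeping of the neighbouring heap chunk, and a later free() may then report "invalid next size". The following is a minimal sketch of a defensive allocation that rejects a zero count. It is an illustration only, not the upstream fix (which reorders initialization), and `struct counters`, `counters_init`, and `counters_free` are hypothetical names that do not exist in papi.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for per-component counter values (not a PAPI type). */
struct counters {
    size_t count;       /* number of usable elements in values */
    long long *values;  /* zero-filled array, or NULL when empty */
};

/* Allocate `count` zero-filled elements.
 * A count of 0 is treated as an error instead of being passed to the
 * allocator, so an unset element count is caught at allocation time.
 * Returns 0 on success, -1 on failure. */
int counters_init(struct counters *c, size_t count)
{
    assert(c != NULL);
    c->count = 0;
    c->values = NULL;
    if (count == 0)
        return -1;
    c->values = calloc(count, sizeof *c->values);
    if (c->values == NULL)
        return -1;
    c->count = count;
    return 0;
}

/* Release the array and reset the container so it cannot be freed twice. */
void counters_free(struct counters *c)
{
    assert(c != NULL);
    free(c->values);
    c->values = NULL;
    c->count = 0;
}
```

A caller would check the return value of `counters_init` before touching `values`, which turns a silent heap corruption into an explicit initialization error.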


Version-Release number of selected component (if applicable):

papi-6.0.0-14.el9.src.rpm

How reproducible:

Every time.


Steps to Reproduce:
1. dnf install papi-testsuite
2. cd /usr/share/papi
3. ./components/stealtime/tests/stealtime_basic

Actual results:

#  ./components/stealtime/tests/stealtime_basic  
Trying all stealtime events
	Found stealtime component 13 - stealtime
  stealtime:::TOTAL  value: 0
  stealtime:::CPU1  value: 0
  stealtime:::CPU2  value: 0
  stealtime:::CPU3  value: 0
  stealtime:::CPU4  value: 0
Note: for this test the values are expected to all be 0
	 unless run inside a VM on a busy system.
PASSED
free(): invalid next size (fast)
Aborted (core dumped)


Expected results:

No "invalid next size" or "Aborted (core dumped)" after the "PASSED"


#  ./components/stealtime/tests/stealtime_basic  
Trying all stealtime events
	Found stealtime component 13 - stealtime
  stealtime:::TOTAL  value: 0
  stealtime:::CPU1  value: 0
  stealtime:::CPU2  value: 0
  stealtime:::CPU3  value: 0
  stealtime:::CPU4  value: 0
Note: for this test the values are expected to all be 0
	 unless run inside a VM on a busy system.
PASSED


Additional info:

This can also be observed with the ./ctests/all_native_events test.

The upstream papi git commit 3625bdbad9fd57d1cdb1e5615854545167d4adcb (details below) addresses the problem:


Author: Anthony Castaldo <TonyCastaldo.edu>  2020-08-26 17:18:29
Committer: Anthony Castaldo <TonyCastaldo.edu>  2020-08-26 17:18:29
Parent: 82fdd098d2c1c6aad20b139dcb7a3a6a508b5580 (Merged in master (pull request #126))
Child:  9266f6ebde64883f886793d7a8ce1d475d3589ea (Merged in master (pull request #131))
Branches: master, remotes/origin/master
Follows: 
Precedes: 

    This modifies PAPI_library_init() to initialize components in two classes,
    separated by the initialization of the papi thread structure.  The first class
    is those that need no thread structure, currently everything but perf_event and
    perf_event_uncore. Following the init of the threading structure, we init the
    second class (perf_event and perf_event_uncore) that DOES need the thread
    structure to successfully init_component().  This required a change to
    _papi_hwi_init_global(), to add an argument to distinguish which class it
    should initialize.
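
The two-class ordering described in the commit message above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual papi source: `struct component`, `struct library`, `init_component`, `init_global`, and `library_init` are hypothetical stand-ins, and the real `PAPI_library_init()` and `_papi_hwi_init_global()` have different signatures and do considerably more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not PAPI's real data structures. */
struct component {
    const char *name;
    bool needs_thread;  /* true for the perf_event-like class */
    int init_rank;      /* 0 = not initialized, else 1-based init order */
};

struct library {
    bool thread_ready;  /* set once the thread structure exists */
    int next_rank;      /* rank given to the most recently initialized component */
};

/* Initialize one component; fail if it needs the thread structure too early. */
int init_component(struct library *lib, struct component *c)
{
    assert(lib != NULL && c != NULL);
    if (c->needs_thread && !lib->thread_ready)
        return -1;
    c->init_rank = ++lib->next_rank;
    return 0;
}

/* Initialize only the components whose class matches thread_class. */
int init_global(struct library *lib, struct component *comps, size_t n,
                bool thread_class)
{
    for (size_t i = 0; i < n; i++) {
        if (comps[i].needs_thread != thread_class)
            continue;
        if (init_component(lib, &comps[i]) != 0)
            return -1;
    }
    return 0;
}

/* Two passes over the components, separated by the thread structure setup. */
int library_init(struct library *lib, struct component *comps, size_t n)
{
    if (init_global(lib, comps, n, false) != 0)
        return -1;
    lib->thread_ready = true;  /* placeholder for the real thread init */
    return init_global(lib, comps, n, true);
}
```

In this model, a component list in which perf_event comes first still ends up with every thread-independent component initialized before the thread structure, and the perf_event class initialized after it, regardless of the order of the list.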

Comment 1 William Cohen 2023-06-20 16:13:06 UTC
Built papi-6.0.0-15.el9 with the upstream patch to address this issue.

Comment 2 QI Fuli 2023-06-20 17:50:58 UTC
(In reply to William Cohen from comment #1)
> Built papi-6.0.0-15.el9 with the upstream patch to address this issue.

Hi Will,

Thank you very much.
I would like to test papi-6.0.0-15.el9 on FX700.
Could you please tell me the URL to download it?

Best,
Fuli

Comment 3 William Cohen 2023-06-20 23:53:17 UTC
Answered the question on internal chat.

Comment 5 QI Fuli 2023-06-21 18:38:53 UTC
Hi,

I have tested papi-6.0.0-15.el9 on FX700.
Both ctests/all_native_events and components/stealtime/tests/stealtime_basic were "PASSED" without "invalid next size" or "Aborted (core dumped)".

Best,
Fuli

Comment 10 William Cohen 2023-08-02 12:24:26 UTC
The doc text looks fine to me.

Comment 13 errata-xmlrpc 2023-11-07 08:33:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (papi bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6459

