Bug 2007883
| Field | Value |
|---|---|
| Summary | [abrt] papi-testsuite: _papi_hwi_cleanup_eventset(): all_native_events killed by SIGABRT |
| Product | [Fedora] Fedora |
| Reporter | Török Edwin <edwin+bugs> |
| Component | papi |
| Assignee | William Cohen <wcohen> |
| Status | CLOSED ERRATA |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| Severity | unspecified |
| Priority | unspecified |
| Version | 34 |
| CC | lberk, wcohen |
| Hardware | x86_64 |
| OS | Unspecified |
| URL | https://retrace.fedoraproject.org/faf/reports/bthash/096a835d7d0f103d40af8ab6c20f7772be8123df |
| Whiteboard | abrt_hash:ef320002a7e0c93e3a9843e3605eff05f128bcf7;VARIANT_ID=workstation; |
| Fixed In Version | papi-6.0.0-10.fc34 |
| Last Closed | 2021-11-28 01:09:32 UTC |
Description
Török Edwin
2021-09-25 22:22:24 UTC
Created attachment 1826249 [details]
File: backtrace
Created attachment 1826250 [details]
File: core_backtrace
Created attachment 1826251 [details]
File: cpuinfo
Created attachment 1826252 [details]
File: dso_list
Created attachment 1826253 [details]
File: environ
Created attachment 1826254 [details]
File: limits
Created attachment 1826255 [details]
File: maps
Created attachment 1826256 [details]
File: mountinfo
Created attachment 1826257 [details]
File: open_fds
Created attachment 1826258 [details]
File: proc_pid_status
I suspect there is something related to the specific processor implementation triggering a problem. I don't have access to an AMD Ryzen 9 3900X machine. Could you run the test under valgrind and report the results:
valgrind ./ctests/all_native_events TESTS_QUIET
Also provide the results of:
papi_avail -a

papi_avail -a output:
Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version : 6.0.0.0
Operating system : Linux 5.14.11-200.fc34.x86_64
Vendor string and code : AuthenticAMD (2, 0x2)
Model string and code : AMD Ryzen 9 3900X 12-Core Processor (113, 0x71)
CPU revision : 0.000000
CPUID : Family/Model/Stepping 23/113/0, 0x17/0x71/0x00
CPU Max MHz : 4672
CPU Min MHz : 2200
Total cores : 24
SMT threads per core : 2
Cores per socket : 12
Sockets : 1
Cores per NUMA region : 24
NUMA regions : 1
Running in a VM : no
Number Hardware Counters : 5
Max Multiplex Counters : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------
================================================================================
PAPI Preset Events
================================================================================
Name Code Deriv Description (Note)
PAPI_L1_ICM 0x80000001 No Level 1 instruction cache misses
PAPI_TLB_DM 0x80000014 No Data translation lookaside buffer misses
PAPI_TLB_IM 0x80000015 Yes Instruction translation lookaside buffer misses
PAPI_STL_ICY 0x80000025 No Cycles with no instruction issue
PAPI_BR_TKN 0x8000002c No Conditional branch instructions taken
PAPI_BR_MSP 0x8000002e No Conditional branch instructions mispredicted
PAPI_TOT_INS 0x80000032 No Instructions completed
PAPI_FP_INS 0x80000034 No Floating point instructions
PAPI_BR_INS 0x80000037 No Branch instructions
PAPI_VEC_INS 0x80000038 No Vector/SIMD instructions (could include integer)
PAPI_TOT_CYC 0x8000003b No Total cycles
PAPI_L1_DCA 0x80000040 No Level 1 data cache accesses
PAPI_L1_ICH 0x80000049 Yes Level 1 instruction cache hits
PAPI_L1_ICA 0x8000004c No Level 1 instruction cache accesses
PAPI_L2_ICA 0x8000004d No Level 2 instruction cache accesses
PAPI_L1_ICR 0x8000004f No Level 1 instruction cache reads
PAPI_L1_TCA 0x80000058 Yes Level 1 total cache accesses
PAPI_FML_INS 0x80000061 No Floating point multiply instructions
PAPI_FAD_INS 0x80000062 No Floating point add instructions
PAPI_FDV_INS 0x80000063 No Floating point divide instructions (Counts both divide and square root instructions)
PAPI_FSQ_INS 0x80000064 No Floating point square root instructions (Counts both divide and square root instructions)
PAPI_FP_OPS 0x80000066 No Floating point operations
PAPI_SP_OPS 0x80000067 No Floating point operations; optimized to count scaled single precision vector operations
PAPI_DP_OPS 0x80000068 No Floating point operations; optimized to count scaled double precision vector operations
--------------------------------------------------------------------------------
Of 24 available events, 3 are derived.
And here is the valgrind output. Unfortunately, it looks like valgrind doesn't know how to emulate rdpmc:
valgrind ./ctests/all_native_events
==64949== Memcheck, a memory error detector
==64949== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==64949== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==64949== Command: ./ctests/all_native_events
==64949==
PAPI Error: Couldn't open hw_instructions in exclude_guest=0 test
Test case ALL_NATIVE_EVENTS: Available native events and hardware information.
vex amd64->IR: unhandled instruction bytes: 0xF 0x33 0x8B 0x4E 0x8 0x44 0x39 0xD1 0xF 0x84
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==64949== valgrind: Unrecognised instruction at address 0x48993c7.
==64949== at 0x48993C7: rdpmc (perf_helpers.h:48)
==64949== by 0x48993C7: mmap_read_self (perf_helpers.h:130)
==64949== by 0x48993C7: _pe_rdpmc_read (perf_event.c:1115)
==64949== by 0x48993C7: _pe_read (perf_event.c:1281)
==64949== by 0x48862D8: _papi_hwi_read (papi_internal.c:1710)
==64949== by 0x4880BB8: PAPI_stop (papi.c:2888)
==64949== by 0x10A7DC: ??? (in /usr/share/papi/ctests/all_native_events)
==64949== by 0x10A9AF: ??? (in /usr/share/papi/ctests/all_native_events)
==64949== by 0x491CB74: (below main) (libc-start.c:332)
==64949== Your program just tried to execute an instruction that Valgrind
==64949== did not recognise. There are two possible reasons for this.
==64949== 1. Your program has a bug and erroneously jumped to a non-code
==64949== location. If you are running Memcheck and you just saw a
==64949== warning about a bad jump, it's probably your program's fault.
==64949== 2. The instruction is legitimate but Valgrind doesn't handle it,
==64949== i.e. it's Valgrind's fault. If you think this is the case or
==64949== you are not sure, please let us know and we'll try to fix it.
==64949== Either way, Valgrind will now raise a SIGILL signal which will
==64949== probably kill your program.
==64949==
==64949== Process terminating with default action of signal 4 (SIGILL): dumping core
==64949== Illegal opcode at address 0x48993C7
==64949== at 0x48993C7: rdpmc (perf_helpers.h:48)
==64949== by 0x48993C7: mmap_read_self (perf_helpers.h:130)
==64949== by 0x48993C7: _pe_rdpmc_read (perf_event.c:1115)
==64949== by 0x48993C7: _pe_read (perf_event.c:1281)
==64949== by 0x48862D8: _papi_hwi_read (papi_internal.c:1710)
==64949== by 0x4880BB8: PAPI_stop (papi.c:2888)
==64949== by 0x10A7DC: ??? (in /usr/share/papi/ctests/all_native_events)
==64949== by 0x10A9AF: ??? (in /usr/share/papi/ctests/all_native_events)
==64949== by 0x491CB74: (below main) (libc-start.c:332)
==64949==
==64949== HEAP SUMMARY:
==64949== in use at exit: 465,593 bytes in 592 blocks
==64949== total heap usage: 3,404 allocs, 2,812 frees, 3,474,443 bytes allocated
==64949==
==64949== LEAK SUMMARY:
==64949== definitely lost: 3,789 bytes in 42 blocks
==64949== indirectly lost: 0 bytes in 0 blocks
==64949== possibly lost: 0 bytes in 0 blocks
==64949== still reachable: 461,804 bytes in 550 blocks
==64949== suppressed: 0 bytes in 0 blocks
==64949== Rerun with --leak-check=full to see details of leaked memory
==64949==
==64949== For lists of detected and suppressed errors, rerun with: -s
==64949== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[1] 64949 illegal hardware instruction (core dumped) valgrind ./ctests/all_native_events
However, I suspect this is related to the stealtime bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2007882 and https://bugzilla.redhat.com/show_bug.cgi?id=2007877
which appears to be a problem with the initialization order (an attempt to access an array that was allocated with size 0).
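To illustrate the kind of failure suspected here, the following is a minimal sketch in C, not PAPI's actual code. It assumes a hypothetical per-thread table (`struct thread_table`, `table_alloc`, `table_set`, `table_free` are all invented names) sized by a component count. If the table is allocated before that count is known, it ends up with zero slots, and any unchecked write into it is out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table with one slot per component.
 * None of these names come from PAPI. */
struct thread_table {
    size_t count;  /* number of slots requested at allocation time */
    int   *slots;  /* NULL when count == 0 */
};

/* Allocate one slot per component. If this runs before the component
 * count is known, num_components is 0 and no usable slot exists. */
int table_alloc(struct thread_table *t, size_t num_components)
{
    assert(t != NULL);
    t->count = num_components;
    t->slots = NULL;
    if (num_components == 0)
        return 0;  /* nothing to allocate; the table stays empty */
    t->slots = calloc(num_components, sizeof *t->slots);
    return t->slots ? 0 : -1;
}

/* Bounds-checked store: returns -1 instead of writing past the end. */
int table_set(struct thread_table *t, size_t idx, int value)
{
    assert(t != NULL);
    if (t->slots == NULL || idx >= t->count)
        return -1;
    t->slots[idx] = value;
    return 0;
}

/* Release the slots and reset the table to the empty state. */
void table_free(struct thread_table *t)
{
    assert(t != NULL);
    free(t->slots);
    t->slots = NULL;
    t->count = 0;
}
```

With a check like the one in `table_set`, a wrong initialization order shows up as an error return rather than heap corruption that only aborts later, for example during cleanup.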
Backported the following patch, which addresses the problem:
commit 3625bdbad9fd57d1cdb1e5615854545167d4adcb
Author: Anthony Castaldo <TonyCastaldo.edu>
Date: Wed Aug 26 17:18:29 2020 -0400
This modifies PAPI_library_init() to initialize components in two classes,
separated by the initialization of the papi thread structure. The first class
is those that need no thread structure, currently everything but perf_event and
perf_event_uncore. Following the init of the threading structure, we init the
second class (perf_event and perf_event_uncore) that DOES need the thread
structure to successfully init_component(). This required a change to
_papi_hwi_init_global(), to add an argument to distinguish which class it
should initialize.
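The ordering described in the commit message can be sketched roughly as follows. This is an illustration only, not the upstream patch: `init_components`, `library_init`, `enum init_class`, the `components` table, and the `example_a` component are all hypothetical names, and the `thread_struct_ready` flag merely stands in for the real thread-structure setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not PAPI's real code. */
enum init_class {
    INIT_BEFORE_THREADS,  /* components that need no thread structure */
    INIT_AFTER_THREADS    /* components that do, e.g. perf_event */
};

struct component {
    const char *name;
    bool needs_thread;    /* true: must wait for the thread structure */
    bool initialized;
};

bool thread_struct_ready = false;

struct component components[] = {
    { "example_a",         false, false },
    { "perf_event",        true,  false },
    { "perf_event_uncore", true,  false },
};
#define NUM_COMPONENTS (sizeof components / sizeof components[0])

/* Initialize only the components in the requested class.
 * Returns the number initialized, or -1 if a component that needs
 * the thread structure is reached before that structure exists. */
int init_components(enum init_class which)
{
    bool second_class = (which == INIT_AFTER_THREADS);
    int done = 0;

    for (size_t i = 0; i < NUM_COMPONENTS; i++) {
        struct component *c = &components[i];
        if (c->needs_thread != second_class || c->initialized)
            continue;
        if (c->needs_thread && !thread_struct_ready)
            return -1;
        c->initialized = true;
        done++;
    }
    return done;
}

/* Two passes, separated by the thread-structure setup. */
int library_init(void)
{
    if (init_components(INIT_BEFORE_THREADS) < 0)
        return -1;
    thread_struct_ready = true;  /* stands in for thread setup */
    if (init_components(INIT_AFTER_THREADS) < 0)
        return -1;
    assert(thread_struct_ready);
    return 0;
}
```

Passing the class as an argument keeps a single initialization loop while letting the caller decide which side of the thread setup each pass runs on.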
FEDORA-2021-752e807fdd has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-752e807fdd`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-752e807fdd
See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2021-752e807fdd has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make note of it in this bug report.