Version-Release number of selected component:
papi-testsuite-6.0.0-7.fc34

Additional info:
reporter:        libreport-2.15.2
backtrace_rating: 4
cgroup:          0::/user.slice/user-1000.slice/user/app.slice/app-org.gnome.Terminal.slice/vte-spawn-e72706b1-43bc-49a4-94ab-465a9eb697ef.scope
cmdline:         ./ctests/all_native_events TESTS_QUIET
crash_function:  _papi_hwi_cleanup_eventset
executable:      /usr/share/papi/ctests/all_native_events
journald_cursor: s=68e38ebf976a4c7b9d23c531e3fbc082;i=11845;b=550b8228d45a43918585e9682c133605;m=ca45a962;t=5c9d9da9e2e5d;x=33c4cfa5f349032
kernel:          5.13.9-200.fc34.x86_64
rootdir:         /
runlevel:        N 5
type:            CCpp
uid:             0

Truncated backtrace:
Thread no. 1 (4 frames)
 #6 _papi_hwi_cleanup_eventset at papi_internal.c:1825
 #7 PAPI_cleanup_eventset at papi.c:3491
 #8 PAPI_shutdown at papi.c:5035
 #9 test_pass at test_utils.c:459
Created attachment 1826249 [details] File: backtrace
Created attachment 1826250 [details] File: core_backtrace
Created attachment 1826251 [details] File: cpuinfo
Created attachment 1826252 [details] File: dso_list
Created attachment 1826253 [details] File: environ
Created attachment 1826254 [details] File: limits
Created attachment 1826255 [details] File: maps
Created attachment 1826256 [details] File: mountinfo
Created attachment 1826257 [details] File: open_fds
Created attachment 1826258 [details] File: proc_pid_status
I suspect something specific to this processor implementation is triggering the problem. I don't have access to an AMD Ryzen 9 3900X machine. Could you run the test under valgrind and report the results:

valgrind ./ctests/all_native_events TESTS_QUIET

Also provide the results of:

papi_avail -a
papi_avail -a output:

Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version             : 6.0.0.0
Operating system         : Linux 5.14.11-200.fc34.x86_64
Vendor string and code   : AuthenticAMD (2, 0x2)
Model string and code    : AMD Ryzen 9 3900X 12-Core Processor (113, 0x71)
CPU revision             : 0.000000
CPUID                    : Family/Model/Stepping 23/113/0, 0x17/0x71/0x00
CPU Max MHz              : 4672
CPU Min MHz              : 2200
Total cores              : 24
SMT threads per core     : 2
Cores per socket         : 12
Sockets                  : 1
Cores per NUMA region    : 24
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 5
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

================================================================================
  PAPI Preset Events
================================================================================
    Name        Code    Deriv Description (Note)
PAPI_L1_ICM  0x80000001  No   Level 1 instruction cache misses
PAPI_TLB_DM  0x80000014  No   Data translation lookaside buffer misses
PAPI_TLB_IM  0x80000015  Yes  Instruction translation lookaside buffer misses
PAPI_STL_ICY 0x80000025  No   Cycles with no instruction issue
PAPI_BR_TKN  0x8000002c  No   Conditional branch instructions taken
PAPI_BR_MSP  0x8000002e  No   Conditional branch instructions mispredicted
PAPI_TOT_INS 0x80000032  No   Instructions completed
PAPI_FP_INS  0x80000034  No   Floating point instructions
PAPI_BR_INS  0x80000037  No   Branch instructions
PAPI_VEC_INS 0x80000038  No   Vector/SIMD instructions (could include integer)
PAPI_TOT_CYC 0x8000003b  No   Total cycles
PAPI_L1_DCA  0x80000040  No   Level 1 data cache accesses
PAPI_L1_ICH  0x80000049  Yes  Level 1 instruction cache hits
PAPI_L1_ICA  0x8000004c  No   Level 1 instruction cache accesses
PAPI_L2_ICA  0x8000004d  No   Level 2 instruction cache accesses
PAPI_L1_ICR  0x8000004f  No   Level 1 instruction cache reads
PAPI_L1_TCA  0x80000058  Yes  Level 1 total cache accesses
PAPI_FML_INS 0x80000061  No   Floating point multiply instructions
PAPI_FAD_INS 0x80000062  No   Floating point add instructions
PAPI_FDV_INS 0x80000063  No   Floating point divide instructions (Counts both divide and square root instructions)
PAPI_FSQ_INS 0x80000064  No   Floating point square root instructions (Counts both divide and square root instructions)
PAPI_FP_OPS  0x80000066  No   Floating point operations
PAPI_SP_OPS  0x80000067  No   Floating point operations; optimized to count scaled single precision vector operations
PAPI_DP_OPS  0x80000068  No   Floating point operations; optimized to count scaled double precision vector operations
--------------------------------------------------------------------------------
Of 24 available events, 3 are derived.

And here is the valgrind output; unfortunately it looks like valgrind doesn't know how to emulate rdpmc:

valgrind ./ctests/all_native_events
==64949== Memcheck, a memory error detector
==64949== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==64949== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==64949== Command: ./ctests/all_native_events
==64949==
PAPI Error: Couldn't open hw_instructions in exclude_guest=0 test
Test case ALL_NATIVE_EVENTS: Available native events and hardware information.
vex amd64->IR: unhandled instruction bytes: 0xF 0x33 0x8B 0x4E 0x8 0x44 0x39 0xD1 0xF 0x84
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==64949== valgrind: Unrecognised instruction at address 0x48993c7.
==64949==    at 0x48993C7: rdpmc (perf_helpers.h:48)
==64949==    by 0x48993C7: mmap_read_self (perf_helpers.h:130)
==64949==    by 0x48993C7: _pe_rdpmc_read (perf_event.c:1115)
==64949==    by 0x48993C7: _pe_read (perf_event.c:1281)
==64949==    by 0x48862D8: _papi_hwi_read (papi_internal.c:1710)
==64949==    by 0x4880BB8: PAPI_stop (papi.c:2888)
==64949==    by 0x10A7DC: ??? (in /usr/share/papi/ctests/all_native_events)
==64949==    by 0x10A9AF: ??? (in /usr/share/papi/ctests/all_native_events)
==64949==    by 0x491CB74: (below main) (libc-start.c:332)
==64949== Your program just tried to execute an instruction that Valgrind
==64949== did not recognise.  There are two possible reasons for this.
==64949== 1. Your program has a bug and erroneously jumped to a non-code
==64949==    location.  If you are running Memcheck and you just saw a
==64949==    warning about a bad jump, it's probably your program's fault.
==64949== 2. The instruction is legitimate but Valgrind doesn't handle it,
==64949==    i.e. it's Valgrind's fault.  If you think this is the case or
==64949==    you are not sure, please let us know and we'll try to fix it.
==64949== Either way, Valgrind will now raise a SIGILL signal which will
==64949== probably kill your program.
==64949==
==64949== Process terminating with default action of signal 4 (SIGILL): dumping core
==64949==  Illegal opcode at address 0x48993C7
==64949==    at 0x48993C7: rdpmc (perf_helpers.h:48)
==64949==    by 0x48993C7: mmap_read_self (perf_helpers.h:130)
==64949==    by 0x48993C7: _pe_rdpmc_read (perf_event.c:1115)
==64949==    by 0x48993C7: _pe_read (perf_event.c:1281)
==64949==    by 0x48862D8: _papi_hwi_read (papi_internal.c:1710)
==64949==    by 0x4880BB8: PAPI_stop (papi.c:2888)
==64949==    by 0x10A7DC: ??? (in /usr/share/papi/ctests/all_native_events)
==64949==    by 0x10A9AF: ??? (in /usr/share/papi/ctests/all_native_events)
==64949==    by 0x491CB74: (below main) (libc-start.c:332)
==64949==
==64949== HEAP SUMMARY:
==64949==     in use at exit: 465,593 bytes in 592 blocks
==64949==   total heap usage: 3,404 allocs, 2,812 frees, 3,474,443 bytes allocated
==64949==
==64949== LEAK SUMMARY:
==64949==    definitely lost: 3,789 bytes in 42 blocks
==64949==    indirectly lost: 0 bytes in 0 blocks
==64949==      possibly lost: 0 bytes in 0 blocks
==64949==    still reachable: 461,804 bytes in 550 blocks
==64949==         suppressed: 0 bytes in 0 blocks
==64949== Rerun with --leak-check=full to see details of leaked memory
==64949==
==64949== For lists of detected and suppressed errors, rerun with: -s
==64949== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[1]    64949 illegal hardware instruction (core dumped)  valgrind ./ctests/all_native_events

However, I suspect this is related to the stealtime bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2007882
and
https://bugzilla.redhat.com/show_bug.cgi?id=2007877
which appears to be a problem with the initialization order (an attempt to access an array that was allocated with size 0).
Backported the following patch, which addresses the problem:

commit 3625bdbad9fd57d1cdb1e5615854545167d4adcb
Author: Anthony Castaldo <TonyCastaldo.edu>
Date:   Wed Aug 26 17:18:29 2020 -0400

    This modifies PAPI_library_init() to initialize components in two
    classes, separated by the initialization of the papi thread structure.
    The first class is those that need no thread structure, currently
    everything but perf_event and perf_event_uncore. Following the init of
    the threading structure, we init the second class (perf_event and
    perf_event_uncore) that DOES need the thread structure to successfully
    init_component().

    This required a change to _papi_hwi_init_global(), to add an argument
    to distinguish which class it should initialize.
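
For readers who want a picture of the ordering the commit message describes, here is a rough sketch of a two-class initialization. It is not the PAPI code: every type, function, and variable below (struct component, init_components, init_thread_structure, library_init, thread_ready, example_component) is a hypothetical stand-in. Only the component names perf_event and perf_event_uncore, and the idea of an extra argument selecting the class, come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not PAPI's real types or API. */
enum init_class { CLASS_NO_THREAD = 0, CLASS_NEEDS_THREAD = 1 };

struct component {
    const char *name;
    int initialized;
};

/* Set once the (hypothetical) thread structure has been initialized. */
int thread_ready = 0;

/* Per the commit message, only these two components need the thread structure. */
int needs_thread(const struct component *c)
{
    return strcmp(c->name, "perf_event") == 0 ||
           strcmp(c->name, "perf_event_uncore") == 0;
}

/* Initialize only the components belonging to class `cls`.
 * Returns the number initialized in this pass, or -1 if a component that
 * needs the thread structure is reached before that structure exists. */
int init_components(struct component *comps, size_t n, enum init_class cls)
{
    int count = 0;
    assert(comps != NULL);
    for (size_t i = 0; i < n; i++) {
        int wants_thread = needs_thread(&comps[i]);
        if (wants_thread != (cls == CLASS_NEEDS_THREAD))
            continue;               /* belongs to the other class */
        if (wants_thread && !thread_ready)
            return -1;              /* wrong order: thread structure missing */
        comps[i].initialized = 1;
        count++;
    }
    return count;
}

void init_thread_structure(void)
{
    thread_ready = 1;
}

/* Class 1, then the thread structure, then class 2. */
int library_init(struct component *comps, size_t n)
{
    if (init_components(comps, n, CLASS_NO_THREAD) < 0)
        return -1;
    init_thread_structure();
    if (init_components(comps, n, CLASS_NEEDS_THREAD) < 0)
        return -1;
    return 0;
}
```

In this sketch, calling init_components() with CLASS_NEEDS_THREAD before init_thread_structure() fails with -1, which loosely mirrors the ordering problem suspected above, while library_init() runs the passes in the order the commit message lays out.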
FEDORA-2021-752e807fdd has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-752e807fdd` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-752e807fdd See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-752e807fdd has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.