Bug 1474999
Summary: | [LLNL 7.5 FEAT] Access to Broadwell Uncore Counters | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Trent D'Hooge <tdhooge> |
Component: | libpfm | Assignee: | William Cohen <wcohen> |
Status: | CLOSED ERRATA | QA Contact: | Michael Petlan <mpetlan> |
Severity: | high | Docs Contact: | Vladimír Slávik <vslavik> |
Priority: | high | | |
Version: | 7.3 | CC: | achen35, bgollahe, brolley, fche, lberk, mbenitez, mcermak, mgoodwin, nathans, tgummels, vslavik, wcohen, woodard |
Target Milestone: | rc | Keywords: | FutureFeature |
Target Release: | 7.5 | | |
Hardware: | x86_64 | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | libpfm-4.7.0-6.el7 | Doc Type: | Enhancement |
Doc Text: | Support for Intel Xeon v4 uncore performance events in *libpfm*, *pcp*, and *papi*. This update adds support for Intel Xeon v4 uncore performance events to the *libpfm* performance monitoring library, the *pcp* tool, and the *papi* interface. | | |
Story Points: | --- | | |
Clone Of: | | Environment: | |
Last Closed: | 2018-04-10 14:25:27 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1400611, 1461180, 1522983 | | |
Description Trent D'Hooge 2017-07-25 19:58:27 UTC
No PCP changes are required here - it'll need a patch/rebase to libpfm AIUI. Details of the perfmon2/libpfm4 commit mentioned earlier...

commit 488227bf2128e8b80f9b7573869fe33fcbd63342
Author: Stephane Eranian <eranian>
Date: Fri Jun 2 12:09:31 2017 -0700

    add Intel Broadwell server uncore PMUs support

    This patch adds Intel Broadwell Server (model 79, 86) uncore PMU support.
    It adds the following PMUs:
    - IMC
    - CBOX
    - HA
    - UBOX
    - SBOX
    - IRP
    - PCU
    - QPI
    - R2PCIE
    - R3QPI

    Based on Broadwell Server Uncore Performance Monitoring Reference Manual available here:
    http://www.intel.com/content/www/us/en/processors/xeon/xeon-e5-e7-v4-uncore-performance-monitoring.html

    Signed-off-by: Stephane Eranian <eranian>

BTW you may want to start working on Skylake EP or Purley or Intel Xeon Platinum, Gold, Silver, or whatever they are calling it these days. They are going to need this too.

The Intel Broadwell server uncore PMUs support applies cleanly. However, it needs the rhel-7.5 (and related) flag(s) set.

The Broadwell uncore patches are in libpfm-4.7.0-6.el7.

Yes, the release notes should mention the addition of the Broadwell uncore libpfm support. Developers doing performance analysis using pcp, libpfm, or papi would like to know whether they can access Broadwell uncore performance events.

This release note info looks reasonable.

Tested with libpfm-4.7.0-9.el7.x86_64. It knows the bdx_unc* PMUs, compared to libpfm-4.7.0-4.el7.x86_64, which did not know them.

# showevtinfo | grep -e bdw -e bdx
[155, bdw, "Intel Broadwell"]
[201, bdw_ep, "Intel Broadwell EP"]
[269, bdx_unc_cbo0, "Intel BroadwellX C-Box 0 uncore"]
[270, bdx_unc_cbo1, "Intel BroadwellX C-Box 1 uncore"]
[271, bdx_unc_cbo2, "Intel BroadwellX C-Box 2 uncore"]
[272, bdx_unc_cbo3, "Intel BroadwellX C-Box 3 uncore"]
[273, bdx_unc_cbo4, "Intel BroadwellX C-Box 4 uncore"]
[274, bdx_unc_cbo5, "Intel BroadwellX C-Box 5 uncore"]
[275, bdx_unc_cbo6, "Intel BroadwellX C-Box 6 uncore"]
[276, bdx_unc_cbo7, "Intel BroadwellX C-Box 7 uncore"]
[277, bdx_unc_cbo8, "Intel BroadwellX C-Box 8 uncore"]
[278, bdx_unc_cbo9, "Intel BroadwellX C-Box 9 uncore"]
[279, bdx_unc_cbo10, "Intel BroadwellX C-Box 10 uncore"]
[280, bdx_unc_cbo11, "Intel BroadwellX C-Box 11 uncore"]
[281, bdx_unc_cbo12, "Intel BroadwellX C-Box 12 uncore"]
[282, bdx_unc_cbo13, "Intel BroadwellX C-Box 13 uncore"]
[283, bdx_unc_cbo14, "Intel BroadwellX C-Box 14 uncore"]
[284, bdx_unc_cbo15, "Intel BroadwellX C-Box 15 uncore"]
[285, bdx_unc_cbo16, "Intel BroadwellX C-Box 16 uncore"]
[286, bdx_unc_cbo17, "Intel BroadwellX C-Box 17 uncore"]
[287, bdx_unc_cbo18, "Intel BroadwellX C-Box 18 uncore"]
[288, bdx_unc_cbo19, "Intel BroadwellX C-Box 19 uncore"]
[289, bdx_unc_cbo20, "Intel BroadwellX C-Box 20 uncore"]
[290, bdx_unc_cbo21, "Intel BroadwellX C-Box 21 uncore"]
[291, bdx_unc_cbo22, "Intel BroadwellX C-Box 22 uncore"]
[292, bdx_unc_cbo23, "Intel BroadwellX C-Box 23 uncore"]
[293, bdx_unc_ha0, "Intel BroadwellX HA 0 uncore"]
[294, bdx_unc_ha1, "Intel BroadwellX HA 1 uncore"]
[295, bdx_unc_imc0, "Intel BroadwellX IMC0 uncore"]
[296, bdx_unc_imc1, "Intel BroadwellX IMC1 uncore"]
[297, bdx_unc_imc2, "Intel BroadwellX IMC2 uncore"]
[298, bdx_unc_imc3, "Intel BroadwellX IMC3 uncore"]
[299, bdx_unc_imc4, "Intel BroadwellX IMC4 uncore"]
[300, bdx_unc_imc5, "Intel BroadwellX IMC5 uncore"]
[301, bdx_unc_imc6, "Intel BroadwellX IMC6 uncore"]
[302, bdx_unc_imc7, "Intel BroadwellX IMC7 uncore"]
[303, bdx_unc_pcu, "Intel BroadwellX PCU uncore"]
[304, bdx_unc_qpi0, "Intel BroadwellX QPI0 uncore"]
[305, bdx_unc_qpi1, "Intel BroadwellX QPI1 uncore"]
[306, bdx_unc_qpi2, "Intel BroadwellX QPI2 uncore"]
[307, bdx_unc_ubo, "Intel BroadwellX U-Box uncore"]
[308, bdx_unc_r2pcie, "Intel BroadwellX R2PCIe uncore"]
[309, bdx_unc_r3qpi0, "Intel BroadwellX R3QPI0 uncore"]
[310, bdx_unc_r3qpi1, "Intel BroadwellX R3QPI1 uncore"]
[311, bdx_unc_r3qpi2, "Intel BroadwellX R3QPI2 uncore"]
[312, bdx_unc_irp, "Intel BroadwellX IRP uncore"]
[313, bdx_unc_sbo0, "Intel BroadwellX S-BOX0 uncore"]
[314, bdx_unc_sbo1, "Intel BroadwellX S-BOX1 uncore"]
[315, bdx_unc_sbo2, "Intel BroadwellX S-BOX2 uncore"]
[316, bdx_unc_sbo3, "Intel BroadwellX S-BOX3 uncore"]
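For reference, here is a minimal sketch of resolving one of the new uncore events through libpfm4's C API into a perf_event encoding. The event string bdx_unc_imc0::UNC_M_CAS_COUNT:RD is an assumed example (it is not taken from this bug); any bdx_unc_* event reported by showevtinfo can be substituted.

```c
/*
 * Minimal libpfm4 sketch (illustration only): resolve a BroadwellX uncore
 * event name into a perf_event_attr encoding.  Link with -lpfm.
 */
#include <stdio.h>
#include <string.h>
#include <perfmon/pfmlib.h>
#include <perfmon/pfmlib_perf_event.h>

int main(void)
{
    struct perf_event_attr attr;
    pfm_perf_encode_arg_t arg;
    /* Assumed example event; substitute any bdx_unc_* event from showevtinfo. */
    const char *ev = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD";
    int ret;

    if (pfm_initialize() != PFM_SUCCESS) {
        fprintf(stderr, "cannot initialize libpfm\n");
        return 1;
    }

    memset(&attr, 0, sizeof(attr));
    memset(&arg, 0, sizeof(arg));
    arg.size = sizeof(arg);
    arg.attr = &attr;

    ret = pfm_get_os_event_encoding(ev, PFM_PLM0 | PFM_PLM3,
                                    PFM_OS_PERF_EVENT, &arg);
    if (ret != PFM_SUCCESS) {
        fprintf(stderr, "cannot encode %s: %s\n", ev, pfm_strerror(ret));
        pfm_terminate();
        return 1;
    }

    /* attr.type/attr.config are what perf_event_open() expects for this PMU. */
    printf("%s -> perf type=%u config=0x%llx\n",
           ev, attr.type, (unsigned long long)attr.config);

    pfm_terminate();
    return 0;
}
```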
However, I don't know how this is used/supported in pcp. When I run `pminfo | grep perf`, I don't see any perfevent.* entries.

Right now the configuration file /var/lib/pcp/pmdas/perfevent/perfevent.conf is going to need to specify the events for each particular processor. This looks to be pretty duplicative of the information available elsewhere. It would be nice to have pcp extract that information from "perf list" rather than having another encoding of it.

> It would be nice to have pcp extract that information
FWIW the pcp papi.* pmda does extract the information dynamically, except at the PAPI layer rather than perf, and I am not sure how well the uncore counters come through.
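To make the PAPI path concrete, here is a minimal sketch of requesting an uncore event by name through PAPI. The event name is an assumption carried over from the libpfm naming above; the exact spelling visible to PAPI should be confirmed with papi_native_avail on the target machine, and uncore access may require elevated privileges.

```c
/*
 * Minimal PAPI sketch (illustration only): add an assumed BroadwellX uncore
 * native event by name and read it once.  Link with -lpapi.
 */
#include <stdio.h>
#include <papi.h>

int main(void)
{
    int evset = PAPI_NULL;
    long long value = 0;
    /* Assumed event name; confirm the spelling with papi_native_avail. */
    char ev[] = "bdx_unc_imc0::UNC_M_CAS_COUNT:RD";

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        fprintf(stderr, "PAPI_library_init failed\n");
        return 1;
    }
    if (PAPI_create_eventset(&evset) != PAPI_OK) {
        fprintf(stderr, "PAPI_create_eventset failed\n");
        return 1;
    }
    if (PAPI_add_named_event(evset, ev) != PAPI_OK) {
        fprintf(stderr, "PAPI_add_named_event(%s) failed\n", ev);
        return 1;
    }

    if (PAPI_start(evset) != PAPI_OK) {
        fprintf(stderr, "PAPI_start failed\n");
        return 1;
    }
    /* ... the workload to be measured would run here ... */
    if (PAPI_stop(evset, &value) != PAPI_OK) {
        fprintf(stderr, "PAPI_stop failed\n");
        return 1;
    }

    printf("%s = %lld\n", ev, value);
    return 0;
}
```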
TESTING ON A BROADWELL-EP machine:

# uname -r ; rpmquery pcp libpfm papi
3.10.0-826.el7.x86_64
pcp-3.11.8-7.el7.x86_64
libpfm-4.7.0-4.el7.x86_64
papi-5.2.0-23.el7.x86_64

# pminfo | grep perf
perfevent.version
perfevent.active
perfevent.hwcounters.software.task_clock.dutycycle
perfevent.hwcounters.software.task_clock.value
perfevent.hwcounters.software.page_faults.dutycycle
perfevent.hwcounters.software.page_faults.value
perfevent.hwcounters.software.minor_faults.dutycycle
perfevent.hwcounters.software.minor_faults.value
perfevent.hwcounters.software.major_faults.dutycycle
perfevent.hwcounters.software.major_faults.value
perfevent.hwcounters.software.emulation_faults.dutycycle
perfevent.hwcounters.software.emulation_faults.value
perfevent.hwcounters.software.cpu_migrations.dutycycle
perfevent.hwcounters.software.cpu_migrations.value
perfevent.hwcounters.software.cpu_clock.dutycycle
perfevent.hwcounters.software.cpu_clock.value
perfevent.hwcounters.software.context_switches.dutycycle
perfevent.hwcounters.software.context_switches.value
perfevent.hwcounters.software.alignment_faults.dutycycle
perfevent.hwcounters.software.alignment_faults.value
perfevent.hwcounters.uncore_imc_4.clockticks.dutycycle
perfevent.hwcounters.uncore_imc_4.clockticks.value
perfevent.hwcounters.uncore_imc_4.cas_count_read.dutycycle
perfevent.hwcounters.uncore_imc_4.cas_count_read.value
perfevent.hwcounters.uncore_imc_4.cas_count_write.dutycycle
perfevent.hwcounters.uncore_imc_4.cas_count_write.value
perfevent.hwcounters.uncore_imc_3.clockticks.dutycycle
perfevent.hwcounters.uncore_imc_3.clockticks.value
perfevent.hwcounters.uncore_imc_3.cas_count_read.dutycycle
perfevent.hwcounters.uncore_imc_3.cas_count_read.value
perfevent.hwcounters.uncore_imc_3.cas_count_write.dutycycle
perfevent.hwcounters.uncore_imc_3.cas_count_write.value
perfevent.hwcounters.uncore_imc_2.clockticks.dutycycle
perfevent.hwcounters.uncore_imc_2.clockticks.value
perfevent.hwcounters.uncore_imc_2.cas_count_read.dutycycle
perfevent.hwcounters.uncore_imc_2.cas_count_read.value
perfevent.hwcounters.uncore_imc_2.cas_count_write.dutycycle
perfevent.hwcounters.uncore_imc_2.cas_count_write.value
perfevent.hwcounters.uncore_imc_1.clockticks.dutycycle
perfevent.hwcounters.uncore_imc_1.clockticks.value
perfevent.hwcounters.uncore_imc_1.cas_count_read.dutycycle
perfevent.hwcounters.uncore_imc_1.cas_count_read.value
perfevent.hwcounters.uncore_imc_1.cas_count_write.dutycycle
perfevent.hwcounters.uncore_imc_1.cas_count_write.value
perfevent.hwcounters.uncore_imc_0.clockticks.dutycycle
perfevent.hwcounters.uncore_imc_0.clockticks.value
perfevent.hwcounters.uncore_imc_0.cas_count_read.dutycycle
perfevent.hwcounters.uncore_imc_0.cas_count_read.value
perfevent.hwcounters.uncore_imc_0.cas_count_write.dutycycle
perfevent.hwcounters.uncore_imc_0.cas_count_write.value
perfevent.hwcounters.power.energy_cores.dutycycle
perfevent.hwcounters.power.energy_cores.value
perfevent.hwcounters.power.energy_ram.dutycycle
perfevent.hwcounters.power.energy_ram.value
perfevent.hwcounters.power.energy_pkg.dutycycle
perfevent.hwcounters.power.energy_pkg.value
perfevent.hwcounters.msr.mperf.dutycycle
perfevent.hwcounters.msr.mperf.value
perfevent.hwcounters.msr.aperf.dutycycle
perfevent.hwcounters.msr.aperf.value
perfevent.hwcounters.msr.tsc.dutycycle
perfevent.hwcounters.msr.tsc.value
perfevent.hwcounters.msr.smi.dutycycle
perfevent.hwcounters.msr.smi.value
perfevent.hwcounters.cpu.cpu_cycles.dutycycle
perfevent.hwcounters.cpu.cpu_cycles.value
perfevent.hwcounters.cpu.tx_commit.dutycycle
perfevent.hwcounters.cpu.tx_commit.value
perfevent.hwcounters.cpu.instructions.dutycycle
perfevent.hwcounters.cpu.instructions.value
perfevent.hwcounters.cpu.cycles_t.dutycycle
perfevent.hwcounters.cpu.cycles_t.value
perfevent.hwcounters.cpu.cycles_ct.dutycycle
perfevent.hwcounters.cpu.cycles_ct.value
perfevent.hwcounters.cpu.branch_instructions.dutycycle
perfevent.hwcounters.cpu.branch_instructions.value
perfevent.hwcounters.cpu.mem_stores.dutycycle
perfevent.hwcounters.cpu.mem_stores.value
perfevent.hwcounters.cpu.cache_misses.dutycycle
perfevent.hwcounters.cpu.cache_misses.value
perfevent.hwcounters.cpu.ref_cycles.dutycycle
perfevent.hwcounters.cpu.ref_cycles.value
perfevent.hwcounters.cpu.bus_cycles.dutycycle
perfevent.hwcounters.cpu.bus_cycles.value
perfevent.hwcounters.cpu.el_start.dutycycle
perfevent.hwcounters.cpu.el_start.value
perfevent.hwcounters.cpu.el_abort.dutycycle
perfevent.hwcounters.cpu.el_abort.value
perfevent.hwcounters.cpu.cache_references.dutycycle
perfevent.hwcounters.cpu.cache_references.value
perfevent.hwcounters.cpu.el_conflict.dutycycle
perfevent.hwcounters.cpu.el_conflict.value
perfevent.hwcounters.cpu.el_capacity.dutycycle
perfevent.hwcounters.cpu.el_capacity.value
perfevent.hwcounters.cpu.tx_conflict.dutycycle
perfevent.hwcounters.cpu.tx_conflict.value
perfevent.hwcounters.cpu.tx_capacity.dutycycle
perfevent.hwcounters.cpu.tx_capacity.value
perfevent.hwcounters.cpu.tx_start.dutycycle
perfevent.hwcounters.cpu.tx_start.value
perfevent.hwcounters.cpu.tx_abort.dutycycle
perfevent.hwcounters.cpu.tx_abort.value
perfevent.hwcounters.cpu.el_commit.dutycycle
perfevent.hwcounters.cpu.el_commit.value
perfevent.hwcounters.cpu.mem_loads.dutycycle
perfevent.hwcounters.cpu.mem_loads.value
perfevent.hwcounters.cpu.branch_misses.dutycycle
perfevent.hwcounters.cpu.branch_misses.value
perfevent.hwcounters.UNHALTED_CORE_CYCLES.dutycycle
perfevent.hwcounters.UNHALTED_CORE_CYCLES.value
perfevent.hwcounters.INSTRUCTION_RETIRED.dutycycle
perfevent.hwcounters.INSTRUCTION_RETIRED.value
perfevent.hwcounters.UNHALTED_REFERENCE_CYCLES.dutycycle
perfevent.hwcounters.UNHALTED_REFERENCE_CYCLES.value
perfevent.hwcounters.LLC_MISSES.dutycycle
perfevent.hwcounters.LLC_MISSES.value
perfevent.derived.active
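The perfevent metrics listed above can also be read programmatically through PCP's PMAPI. Below is a minimal sketch, assuming pmcd and the perfevent PMDA are running locally; the metric name is taken from the listing above, and error handling is kept to the essentials.

```c
/*
 * Minimal PCP PMAPI sketch (illustration only): fetch one perfevent metric
 * from the local pmcd.  Link with -lpcp.
 */
#include <stdio.h>
#include <pcp/pmapi.h>

int main(void)
{
    /* Metric name taken from the pminfo listing above. */
    const char *names[] = { "perfevent.hwcounters.uncore_imc_0.cas_count_read.value" };
    pmID pmids[1];
    pmDesc desc;
    pmResult *rp;
    int sts, i;

    sts = pmNewContext(PM_CONTEXT_HOST, "local:");
    if (sts < 0) {
        fprintf(stderr, "pmNewContext: %s\n", pmErrStr(sts));
        return 1;
    }

    sts = pmLookupName(1, (char **)names, pmids);
    if (sts < 0) {
        fprintf(stderr, "pmLookupName: %s\n", pmErrStr(sts));
        return 1;
    }

    sts = pmLookupDesc(pmids[0], &desc);
    if (sts < 0) {
        fprintf(stderr, "pmLookupDesc: %s\n", pmErrStr(sts));
        return 1;
    }

    sts = pmFetch(1, pmids, &rp);
    if (sts < 0) {
        fprintf(stderr, "pmFetch: %s\n", pmErrStr(sts));
        return 1;
    }

    /* One value per instance (e.g. per socket/IMC box for uncore counters). */
    for (i = 0; i < rp->vset[0]->numval; i++) {
        pmAtomValue av;

        sts = pmExtractValue(rp->vset[0]->valfmt, &rp->vset[0]->vlist[i],
                             desc.type, &av, PM_TYPE_64);
        if (sts >= 0)
            printf("instance %d: %lld\n",
                   rp->vset[0]->vlist[i].inst, (long long)av.ll);
    }

    pmFreeResult(rp);
    return 0;
}
```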
So from my point of view, the Broadwell Uncore counters are accessible to pcp now. VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0812