Bug 1466735
Summary: | turbostat does report only half of CPUs on AMD with Opteron Processor 6276 | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jiri Hladky <jhladky> |
Component: | cpupowerutils | Assignee: | Prarit Bhargava <prarit> |
Status: | CLOSED WONTFIX | QA Contact: | Erik Hamera <ehamera> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 6.9 | CC: | jhladky, kkolakow, mpetlan |
Target Milestone: | rc | Keywords: | Reopened |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1466743 | Environment: | |
Last Closed: | 2017-12-06 10:59:16 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1456386, 1466743 | | |
Attachments: | | | |
Description
Jiri Hladky
2017-06-30 10:59:53 UTC
I have informed the upstream maintainer of turbostat, Len Brown <lenb>, about the issue.

Unless this gets fixed upstream very soon, I expect the fix won't make it into RHEL6 at all.

Do we have an upstream commit on this yet?

No, we don't.

Looking at this, I think this is actually a strange topology reporting bug. The expectation for hyperthreaded processors is that thread siblings share the same core_id. You can see this on my desktop processor:

[nhorman@hmswarspite cpu]$ cat cpu0/topology/core_id
0
[nhorman@hmswarspite cpu]$ cat cpu0/topology/thread_siblings_list
0,4
[nhorman@hmswarspite cpu]$ cat cpu4/topology/core_id
0
[nhorman@hmswarspite cpu]$ cat cpu4/topology/thread_siblings_list
0,4
[nhorman@hmswarspite cpu]$

However, on the Opteron 6272 system I found in the lab, this is not the case:

[root@hp-bl465cgen8-01 cpu]# cat cpu0/topology/core_id
0
[root@hp-bl465cgen8-01 cpu]# cat cpu0/topology/thread_siblings_list
0-1
[root@hp-bl465cgen8-01 cpu]# cat cpu1/topology/core_id
1
[root@hp-bl465cgen8-01 cpu]# cat cpu1/topology/thread_siblings_list
0-1
[root@hp-bl465cgen8-01 cpu]#

The fact that the thread siblings are on different cores is somewhat nonsensical to the topology model in turbostat, and, really, in general. On this Opteron, cpu8 is the other cpu that exists on core 0, and so the thread_siblings_list should read 0,8, not 0-1. Given that this information is derived from the APIC ID that a given cpu is assigned (which in turn I believe is a firmware-defined setting), I don't believe there is going to be anything we can do about this (though a firmware update may correct the problem).

Hi Neil,

I think that from the customer perspective it doesn't matter whether this is a kernel bug (/sys/devices/system/cpu) or a turbostat bug - the customer just expects turbostat to report all CPUs correctly. Other tools can recognize the CPU topology correctly (for example lstopo), so IMHO this is a turbostat bug.
If you think this is a kernel bug, could you please open an appropriate kernel bug so that we get it resolved?

I have run

/usr/bin/hwloc-gather-topology /tmp/$(uname -n)

on this Opteron server

https://beaker.cluster-qe.lab.eng.brq.redhat.com/bkr/view/kiff-02.cluster-qe.lab.eng.brq.redhat.com

and I then verified that lstopo (both are part of the lstopo package) can correctly parse the topology from /sys/devices/system/cpu:

tar jxvf kiff-02.cluster-qe.lab.eng.brq.redhat.com.tar.bz2
lstopo --input kiff-02.cluster-qe.lab.eng.brq.redhat.com kiff-02.cluster-qe.lab.eng.brq.redhat.com.png

Please check the PNG output. I will attach both the kiff-02.cluster-qe.lab.eng.brq.redhat.com.tar.bz2 and kiff-02.cluster-qe.lab.eng.brq.redhat.com.png files for you to check. Based on this I believe that this is not a problem with the information in /sys/devices/system/cpu but rather a turbostat problem.

Thanks
Jirka

Created attachment 1296249
Output of /usr/bin/hwloc-gather-topology /tmp/$(uname -n)

Output of /usr/bin/hwloc-gather-topology /tmp/$(uname -n) on this AMD system:

https://beaker.cluster-qe.lab.eng.brq.redhat.com/bkr/view/kiff-02.cluster-qe.lab.eng.brq.redhat.com

Unpack it and use

lstopo --input kiff-02.cluster-qe.lab.eng.brq.redhat.com

to check the topology graphically. It shows all 32 CPUs.
turbostat shows only half of the CPUs:

turbostat ls
CPU  Avg_MHz  %Busy  Bzy_MHz  TSC_MHz
  -       29   3.92      749     1171
  8       53   3.60     1462     2387
  9       47   3.19     1463     2381
 10       55   3.78     1449     2373
 11      126   8.72     1443     2363
 12       60   4.16     1439     2358
 13       42   2.90     1442     2360
 14       52   3.62     1439     2355
 15       44   3.07     1438     2347
 24       57   3.96     1430     2341
 25       44   3.10     1423     2331
 26       62   2.66     2343     2323
 27       47   2.01     2328     2316
 28       54   3.92     1384     2316
 29       43   3.02     1418     2313
 30       55   3.88     1409     2309
 31      100   7.08     1410     2310

Created attachment 1296250
CPU topology as displayed with lstopo

Output of

lstopo --input kiff-02.cluster-qe.lab.eng.brq.redhat.com kiff-02.cluster-qe.lab.eng.brq.redhat.com.png

(Use it with kiff-02.cluster-qe.lab.eng.brq.redhat.com.tar.bz2, which I have submitted earlier.)

lstopo can correctly recognize the CPU topology based on the info in /sys/devices/system/cpu.

Your image in comment 9 illustrates the problem quite well.

1) You will note that for each set of paired thread siblings (e.g. PU 0 and PU 1), the siblings are listed on separate cores. Sibling processing units must by definition be on the same core; otherwise they aren't siblings.

2) Core ids within a single package are supposed to be unique, and clearly lstopo is showing multiple identical core ids within a single package.

So it's not working; it's just acting broken in a different way than turbostat, based on an erroneous topology map as exported by the kernel. And the kernel in turn isn't generating that map; it's just reporting it based on the ids that it reads from combinations of the cpuid instruction and APIC registers, both of which are established by the system firmware.

So there's nothing for us to do here. Either we write a quirk into the kernel to fix up maps for this system (if we can uniquely detect it), or we get the vendor to fix their firmware to configure the topology correctly. We're not going to do the former in RHEL6 at this late date, and so we're left with the latter, which is the proper fix anyway.
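The sibling/core_id mismatch discussed in the comments above can be checked mechanically. The following is a minimal Python sketch, not part of turbostat or hwloc: it reads the same `core_id` and `thread_siblings_list` files shown in the `cat` transcripts and lists thread siblings that report different core ids. The function names (`parse_cpulist`, `read_topology`, `mismatched_siblings`) are hypothetical helpers introduced only for illustration, and the sketch assumes the usual `cpuN/topology/` layout under `/sys/devices/system/cpu`.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a sysfs CPU list such as "0-1" or "0,4" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_topology(root="/sys/devices/system/cpu"):
    """Return {cpu: (core_id, siblings)} for every cpuN directory with topology files."""
    topo = {}
    for path in glob.glob(os.path.join(root, "cpu[0-9]*")):
        m = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        tdir = os.path.join(path, "topology")
        if not m or not os.path.isdir(tdir):
            continue
        with open(os.path.join(tdir, "core_id")) as f:
            core_id = int(f.read())
        with open(os.path.join(tdir, "thread_siblings_list")) as f:
            siblings = parse_cpulist(f.read())
        topo[int(m.group(1))] = (core_id, siblings)
    return topo


def mismatched_siblings(topo):
    """List (cpu, sibling) pairs that are thread siblings but report different core ids."""
    pairs = []
    for cpu, (core_id, siblings) in sorted(topo.items()):
        for sib in siblings:
            # Report each pair once (sib > cpu) and skip siblings we have no data for.
            if sib > cpu and sib in topo and topo[sib][0] != core_id:
                pairs.append((cpu, sib))
    return pairs
```

Calling `mismatched_siblings(read_topology())` on a live system returns an empty list when every sibling pair shares a core id, as in the desktop transcript, and would return `(0, 1)` among its pairs for values like those in the Opteron transcript.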
(In reply to Neil Horman from comment #10)
> Your image in comment 9 illustrates the problem quite well.
>
> 1) You will note that for each set of paired thread siblings (e.g. PU 0
> and PU 1), the siblings are listed on separate cores. Sibling processing
> units must by definition be on the same core; otherwise they aren't siblings.
>
> 2) Core ids within a single package are supposed to be unique, and clearly
> lstopo is showing multiple identical core ids within a single package.

You're right ... but I'm wondering if the real bug here is that AMD 0x16 doesn't have a "Core" and has Processing Units. Those *are* unique AFAICT, and maybe that's the real problem here.

/me thinks ... and will get back to everyone in a bit

P.

So this is the situation. The way turbostat handles the topology is not correct.

First let's get the terminology right. There are processors, cores, and threads. The processor is the thingy that is stuck to the mobo. Cores are the thingys that _can_ execute code, but more modern processors have a pair of what they call threads to execute code for every Core. There's this other thing that can be used to group Cores together for processing power that is called a Node. It doesn't really matter why it exists, but just note that it is only a way of grouping Cores together.

In Intel's world, each Processor, Node (group of Cores), Core, and Thread is uniquely identified. In AMD's world, each Processor, Node, and Thread is uniquely identified with a number. This is NOT the case with Cores. In AMD's world, each Core is uniquely identified _within a Node_.

And this results in an enumeration problem in turbostat. Turbostat enumerates assuming that each Core is a unique object -- but it isn't. This results in the overwriting of existing data in turbostat.

So, for example, suppose we have the following simple topology:

Processor 0 has Node 0, which contains Core 0 and Threads 0 and 1.
Processor 0 has Node 1, which contains Core 0 and Threads 2 and 3.
When turbostat enumerates, it considers the *Core* as the main thing to enumerate, so it sets

Core[0].thread_ids = { 0, 1 }
Core[0].node = 0
Core[0].processor = 0

and then *overwrites that data* with

Core[0].thread_ids = { 2, 3 }
Core[0].node = 1
Core[0].processor = 0

.... which results in the first set of data being overwritten so we only see 1/2 of the data.

This isn't an issue with NUMA, ACPI, etc. This is solely an issue with turbostat not handling enumeration of AMD 0x16 and 0x17 processors. Again, AFAICT what AMD is doing is valid; the threads are uniquely identifiable. The problem is how turbostat is enumerating the data.

This can be fixed IMO but I'm going to have to think about the easiest way to do it.

P.

Hi Prarit,

thanks a lot for the detailed analysis. I fully agree with it. Please note also that turbostat does not report all CPUs on a wide range of AMD systems - including the Ryzen CPU - see BZ1454489. The issue is not server vendor specific.

Jirka

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/
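The enumeration problem from the analysis above (the data for Core 0 of Node 0 being replaced by the data for Core 0 of Node 1) can be reproduced with a small model. This is a hedged Python sketch of the idea only: turbostat is written in C and its real data structures differ, and the `Thread` tuple, `THREADS` list, and `group_threads` helper are hypothetical names introduced for illustration. It uses the two-node example topology from the comment and compares keying on the core id alone against keying on (processor, node, core).

```python
from collections import namedtuple

# One record per hardware thread; field names are illustrative only.
Thread = namedtuple("Thread", "processor node core thread_id")

# Example topology from the analysis: Core 0 exists in both Node 0 and Node 1.
THREADS = [
    Thread(0, 0, 0, 0), Thread(0, 0, 0, 1),
    Thread(0, 1, 0, 2), Thread(0, 1, 0, 3),
]


def group_threads(threads, key):
    """Group thread ids under key(t).

    If the same key shows up again in a different node, the slot is
    re-initialised, which models the overwrite described in the analysis.
    """
    table = {}
    for t in threads:
        k = key(t)
        entry = table.get(k)
        if entry is None or entry["node"] != t.node:
            entry = table[k] = {
                "processor": t.processor,
                "node": t.node,
                "thread_ids": [],
            }
        entry["thread_ids"].append(t.thread_id)
    return table


# Keyed on the core id alone: only the data from Node 1 survives.
by_core = group_threads(THREADS, key=lambda t: t.core)

# Keyed on (processor, node, core): both cores and all four threads survive.
by_full_id = group_threads(THREADS, key=lambda t: (t.processor, t.node, t.core))
```

With the core id as the only key, `by_core` ends up with a single entry holding threads 2 and 3, which matches the "only half of the CPUs" symptom. Including the node in the key is one possible way to keep the entries apart; whether an actual fix in turbostat would take that form is not stated in this report.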