Bug 874050 - virsh nodeinfo can't get the right info on AMD Bulldozer cpu
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: x86_64 Linux
Priority: high, Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Peter Krempa
QA Contact: Virtualization Bugs
Keywords: ZStream
Depends On:
Blocks: 833425 877024 881827
Reported: 2012-11-07 06:27 EST by Wayne Sun
Modified: 2013-02-21 02:26 EST

See Also:
Fixed In Version: libvirt-0.10.2-9.el6
Doc Type: Bug Fix
Doc Text:
The AMD Bulldozer architecture consists of "modules" which the kernel reports as both threads and cores. Libvirt's processor topology detection code could not handle this properly, so libvirt reported twice the actual number of processors. The fix reports a topology that adds up to the total number of processors in the system; the actual topology has to be checked in the output of virConnectGetCapabilities() (virsh capabilities). The fallback output was also documented. Additionally, users should be instructed to use the capabilities output for topology detection, for performance reasons: the NUMA topology has the most important impact on performance, and the physical topology can differ from it.
Last Closed: 2013-02-21 02:26:06 EST
Type: Bug


Attachments
sysfs dump info (51.44 KB, application/x-bzip)
2012-11-07 06:27 EST, Wayne Sun
cpuinfo (65.46 KB, text/plain)
2012-11-07 21:36 EST, Wayne Sun

Description Wayne Sun 2012-11-07 06:27:37 EST
Created attachment 639976 [details]
sysfs dump info

Description of problem:
On a host with an AMD 6200 series CPU (the AMD "Interlagos" platform), which consists of two MCMs (Multi-Chip Modules) with 4 "Bulldozer" modules each, for a total of 8 "Bulldozer" modules, virsh nodeinfo collects wrong thread information, so the reported topology does not match the total number of CPUs.

Detailed Bulldozer info:
http://en.wikipedia.org/wiki/Bulldozer_(microarchitecture)

The sysfs device info is attached.

The problem is in parsing the thread count: the total number of CPUs should be 64, while the topology shown by nodeinfo multiplies out to 128 (8*2*8). Bulldozer "threads" are somewhere between a core and a thread: they have separate core IDs and separate thread IDs, and they also have the thread_siblings parameter filled in, which might be the cause.
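
The arithmetic above can be checked directly: the topology fields printed by virsh nodeinfo should multiply out to the CPU count. A minimal Python sketch using the values from step 2 of the reproducer; the function name is hypothetical and not part of libvirt:

```python
def topology_product(nodes, sockets, cores, threads):
    """Number of logical CPUs implied by a nodeinfo-style topology."""
    return nodes * sockets * cores * threads


# Values printed by `virsh nodeinfo` on the affected host.
reported_cpus = 64
implied = topology_product(nodes=8, sockets=1, cores=8, threads=2)

print(implied)                   # 128
print(implied == reported_cpus)  # False: the topology is inconsistent
```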

Version-Release number of selected component (if applicable):
libvirt-0.10.2-7.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64


How reproducible:
always

Steps to Reproduce:
1.
# cat /proc/cpuinfo |grep "model name"|tail -1
model name	: AMD Opteron(tm) Processor 6282 SE  

# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16349 MB
node 0 free: 15596 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15931 MB
node 2 cpus: 1 5 9 13 17 21 25 29
node 2 size: 16384 MB
node 2 free: 15871 MB
node 3 cpus: 33 37 41 45 49 53 57 61
node 3 size: 16384 MB
node 3 free: 15845 MB
node 4 cpus: 2 6 10 14 18 22 26 30
node 4 size: 16384 MB
node 4 free: 15811 MB
node 5 cpus: 34 38 42 46 50 54 58 62
node 5 size: 16384 MB
node 5 free: 15917 MB
node 6 cpus: 35 39 43 47 51 55 59 63
node 6 size: 16384 MB
node 6 free: 15855 MB
node 7 cpus: 3 7 11 15 19 23 27 31
node 7 size: 16367 MB
node 7 free: 15869 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  20  20  20  20  20  20  20 
  1:  20  10  20  20  20  20  20  20 
  2:  20  20  10  20  20  20  20  20 
  3:  20  20  20  10  20  20  20  20 
  4:  20  20  20  20  10  20  20  20 
  5:  20  20  20  20  20  10  20  20 
  6:  20  20  20  20  20  20  10  20 
  7:  20  20  20  20  20  20  20  10 

2.
# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2593 MHz
CPU socket(s):       1
Core(s) per socket:  8
Thread(s) per core:  2
NUMA cell(s):        8
Memory size:         132101788 KiB

3.
  
Actual results:
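The nodeinfo topology is not right: 8 NUMA cells x 1 socket x 8 cores x 2 threads gives 128 CPUs, but the host has 64.

Expected results:
The nodeinfo topology should be consistent with the 64 CPUs in the host.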

Additional info:
Comment 2 Wayne Sun 2012-11-07 21:36:07 EST
Created attachment 640518 [details]
cpuinfo

/proc/cpuinfo is attached
Comment 3 Peter Krempa 2012-11-08 18:26:38 EST
Fix/workaround proposed upstream: http://www.redhat.com/archives/libvir-list/2012-November/msg00365.html
Comment 4 Peter Krempa 2012-11-12 18:44:31 EST
Fixed upstream:

commit 7a791677b0e6cc3ae45aafdbca732f0f7ce05cbf
Author: Peter Krempa <pkrempa@redhat.com>
Date:   Wed Nov 7 15:50:56 2012 +0100

    nodeinfotest: Add test data from a AMD bulldozer machine.
    
    The AMD Bulldozer architecture uses so called "Clustered integer core
    modules" that count both as threads and cores. This patch expects the
    cpu to be detected using the new fallback condition otherwise twice the
    number of processors would be detected.

commit 86748976f18423c359e94294bd57df9fd9d98ce4
Author: Peter Krempa <pkrempa@redhat.com>
Date:   Wed Nov 7 15:19:47 2012 +0100

    nodeinfotest: Add test data for 2 processor host with broken NUMA
    
    This test data was gathered on an AMD MagnyCours machine that reports it
    has only one NUMA node although the hardware is consisting of 4. As
    duplicate core id's are ignored the reported topology was bogous. This
    should be fixed by the previous patch.
    
    Reported and data provided by George-Cristian Bîrzan.

commit 9576afd110b8c3edeb65f9b39448884763ca68bd
Author: Peter Krempa <pkrempa@redhat.com>
Date:   Wed Nov 7 14:53:36 2012 +0100

    nodeinfo: Add check and workaround to guarantee valid cpu topologies
    
    Lately there were a few reports of the output of the virsh nodeinfo
    command being inaccurate. This patch tries to avoid that by checking if
    the topology actually makes sense. If it doesn't we then report a
    synthetic topology that indicates to the user that the host capabilities
    should be checked for the actual topology.
Comment 7 Eric Blake 2012-11-15 10:52:07 EST
Should we move this back to ASSIGNED to also take in Viktor's upstream improvements?
https://www.redhat.com/archives/libvir-list/2012-November/msg00572.html
Comment 9 Wayne Sun 2012-11-20 04:38:44 EST
# rpm -q libvirt qemu-kvm kernel
libvirt-0.10.2-9.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64

On the same box as in description:

# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2593 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132101788 KiB

I have not checked all the patches in detail yet, but nodeinfo still fails to show the right info. Threads per core might be right now, but the host has 8 nodes, not 1, and I think cores per socket should be 8, not 64.

Hi Peter, what do you think?
Comment 10 Peter Krempa 2012-11-20 05:25:35 EST
In case of unusual NUMA machines where we can't accurately detect the topology of the processor the data reported in the virNodeInfo structure is modified to correctly report the maximum number of processors in the host. The modification is done according to this documentation:

nodes: the number of NUMA cell, 1 for unusual NUMA topologies or uniform memory access; check capabilities XML for the actual NUMA topology
sockets: number of CPU sockets per node if nodes > 1, 1 in case of unusual NUMA topology
cores: number of cores per socket, total number of processors in case of unusual NUMA topology
threads: number of threads per core, 1 in case of unusual numa topology
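
The documented fallback can be illustrated with a short Python sketch. This is not libvirt's C implementation: the NodeInfo type and apply_fallback function are hypothetical, and the consistency test (the product of the fields must equal the CPU count) is an assumption about what counts as an "unusual" topology.

```python
from typing import NamedTuple


class NodeInfo(NamedTuple):
    """Hypothetical stand-in for the topology fields of virNodeInfo."""
    cpus: int
    nodes: int
    sockets: int
    cores: int
    threads: int


def apply_fallback(info: NodeInfo) -> NodeInfo:
    """Keep a consistent topology; otherwise return the synthetic one."""
    # Assumed check: the fields must multiply out to the CPU count.
    if info.nodes * info.sockets * info.cores * info.threads == info.cpus:
        return info
    # Documented fallback: 1 node, 1 socket, all CPUs as cores, 1 thread.
    return NodeInfo(cpus=info.cpus, nodes=1, sockets=1,
                    cores=info.cpus, threads=1)


# Topology detected on the Bulldozer host before the fix.
print(apply_fallback(NodeInfo(cpus=64, nodes=8, sockets=1, cores=8, threads=2)))
# NodeInfo(cpus=64, nodes=1, sockets=1, cores=64, threads=1)
```

The result matches the nodeinfo output in comment 9: 1 NUMA cell, 1 socket, 64 cores per socket, 1 thread per core.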
Comment 11 Wayne Sun 2012-11-20 22:11:45 EST
Since the patches in comment #7 are not included, I did not check them. Peter clarified in comment #10 how nodeinfo behaves on unusual NUMA machines, so the result in comment #9 is now expected.
So, this is fixed. I also tested on one usual NUMA box and one non-NUMA box, and both work fine.
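
For the actual NUMA topology that comment 10 refers to, the capabilities XML can be parsed. A minimal Python sketch, assuming the host topology appears as cell elements containing cpu elements under host/topology/cells; the SAMPLE document is a hand-written, truncated illustration (two cells, two CPUs each), not output captured from the machine in this bug, and cells_to_cpus is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Hand-written, truncated sample; real output lists every cell and CPU.
SAMPLE = """<capabilities>
  <host>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <cpus num='2'><cpu id='0'/><cpu id='4'/></cpus>
        </cell>
        <cell id='1'>
          <cpus num='2'><cpu id='32'/><cpu id='36'/></cpus>
        </cell>
      </cells>
    </topology>
  </host>
</capabilities>"""


def cells_to_cpus(capabilities_xml):
    """Map each NUMA cell id to the list of CPU ids it contains."""
    root = ET.fromstring(capabilities_xml)
    mapping = {}
    for cell in root.findall("./host/topology/cells/cell"):
        cpu_ids = [int(cpu.get("id")) for cpu in cell.findall("./cpus/cpu")]
        mapping[int(cell.get("id"))] = cpu_ids
    return mapping


print(cells_to_cpus(SAMPLE))  # {0: [0, 4], 1: [32, 36]}
```

On a real host, the XML string would come from the output of virsh capabilities instead of SAMPLE.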
Comment 12 errata-xmlrpc 2013-02-21 02:26:06 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html
