Bug 874050 - virsh nodeinfo can't get the right info on AMD Bulldozer cpu
Summary: virsh nodeinfo can't get the right info on AMD Bulldozer cpu
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 833425 877024 881827
 
Reported: 2012-11-07 11:27 UTC by Wayne Sun
Modified: 2013-02-21 07:26 UTC (History)

Fixed In Version: libvirt-0.10.2-9.el6
Doc Type: Bug Fix
Doc Text:
The AMD Bulldozer architecture consists of "modules" that the kernel reports as both threads and cores. Libvirt's processor topology detection code could not handle this, so libvirt reported twice the actual number of processors. The fix reports a topology that adds up to the total number of processors in the system; the actual topology has to be checked in the output of virConnectGetCapabilities() (virsh capabilities). The fallback output is now documented. Users should also rely on the capabilities output for topology detection for performance reasons: the NUMA topology has the larger impact on performance, and the physical topology can differ from it.
Clone Of:
Environment:
Last Closed: 2013-02-21 07:26:06 UTC
Target Upstream Version:
Embargoed:


Attachments
sysfs dump info (51.44 KB, application/x-bzip)
2012-11-07 11:27 UTC, Wayne Sun
cpuinfo (65.46 KB, text/plain)
2012-11-08 02:36 UTC, Wayne Sun


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0276 0 normal SHIPPED_LIVE Moderate: libvirt security, bug fix, and enhancement update 2013-02-20 21:18:26 UTC

Description Wayne Sun 2012-11-07 11:27:37 UTC
Created attachment 639976 [details]
sysfs dump info

Description of problem:
On a host with an AMD 6200 series CPU (the AMD "Interlagos" platform), which consists of two MCMs (Multi-Chip Modules) with 4 "Bulldozer" modules each, 8 "Bulldozer" modules in total, virsh nodeinfo collects wrong thread information, so the reported topology does not match the total number of CPUs.

Detailed Bulldozer info:
http://en.wikipedia.org/wiki/Bulldozer_(microarchitecture)

The sysfs device info is attached.

The problem is in parsing the thread counts: the total CPU count should be 64, while the topology shown by nodeinfo adds up to 128 (8*2*8). Bulldozer "threads" are somewhere between a core and a thread: they have separate core IDs and separate thread IDs, and they also have the thread_siblings parameter filled in, which might be the cause.
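
The mismatch described above can be checked with a few lines of Python. This is only an illustrative sketch: the field values are copied from the `virsh nodeinfo` output in step 2 of this report, and the dictionary keys and the helper name `topology_product` are made up for the example. The product of the four fields is the same quantity that libvirt's `VIR_NODEINFO_MAXCPUS` macro computes.

```python
# Values copied from the `virsh nodeinfo` output in step 2 of this report.
# The key names are chosen for this example only.
nodeinfo = {
    "cpus": 64,     # CPU(s)
    "nodes": 8,     # NUMA cell(s)
    "sockets": 1,   # CPU socket(s)
    "cores": 8,     # Core(s) per socket
    "threads": 2,   # Thread(s) per core
}


def topology_product(info):
    """Return the CPU count implied by nodes * sockets * cores * threads."""
    return info["nodes"] * info["sockets"] * info["cores"] * info["threads"]


implied = topology_product(nodeinfo)
print(f"reported CPUs: {nodeinfo['cpus']}, implied by topology: {implied}")
if implied != nodeinfo["cpus"]:
    print("topology is inconsistent with the CPU count")
```

With the values from this host, the implied count is 128 while the reported CPU count is 64, which is the factor of two described in this report.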

Version-Release number of selected component (if applicable):
libvirt-0.10.2-7.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64


How reproducible:
always

Steps to Reproduce:
1.
# cat /proc/cpuinfo |grep "model name"|tail -1
model name	: AMD Opteron(tm) Processor 6282 SE  

# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16349 MB
node 0 free: 15596 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15931 MB
node 2 cpus: 1 5 9 13 17 21 25 29
node 2 size: 16384 MB
node 2 free: 15871 MB
node 3 cpus: 33 37 41 45 49 53 57 61
node 3 size: 16384 MB
node 3 free: 15845 MB
node 4 cpus: 2 6 10 14 18 22 26 30
node 4 size: 16384 MB
node 4 free: 15811 MB
node 5 cpus: 34 38 42 46 50 54 58 62
node 5 size: 16384 MB
node 5 free: 15917 MB
node 6 cpus: 35 39 43 47 51 55 59 63
node 6 size: 16384 MB
node 6 free: 15855 MB
node 7 cpus: 3 7 11 15 19 23 27 31
node 7 size: 16367 MB
node 7 free: 15869 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  20  20  20  20  20  20  20 
  1:  20  10  20  20  20  20  20  20 
  2:  20  20  10  20  20  20  20  20 
  3:  20  20  20  10  20  20  20  20 
  4:  20  20  20  20  10  20  20  20 
  5:  20  20  20  20  20  10  20  20 
  6:  20  20  20  20  20  20  10  20 
  7:  20  20  20  20  20  20  20  10 

2.
# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2593 MHz
CPU socket(s):       1
Core(s) per socket:  8
Thread(s) per core:  2
NUMA cell(s):        8
Memory size:         132101788 KiB

3.
  
Actual results:
nodeinfo is not right

Expected results:
nodeinfo output should be right

Additional info:

Comment 2 Wayne Sun 2012-11-08 02:36:07 UTC
Created attachment 640518 [details]
cpuinfo

/proc/cpuinfo is attached

Comment 3 Peter Krempa 2012-11-08 23:26:38 UTC
Fix/workaround proposed upstream: http://www.redhat.com/archives/libvir-list/2012-November/msg00365.html

Comment 4 Peter Krempa 2012-11-12 23:44:31 UTC
Fixed upstream:

commit 7a791677b0e6cc3ae45aafdbca732f0f7ce05cbf
Author: Peter Krempa <pkrempa>
Date:   Wed Nov 7 15:50:56 2012 +0100

    nodeinfotest: Add test data from a AMD bulldozer machine.
    
    The AMD Bulldozer architecture uses so called "Clustered integer core
    modules" that count both as threads and cores. This patch expects the
    cpu to be detected using the new fallback condition otherwise twice the
    number of processors would be detected.

commit 86748976f18423c359e94294bd57df9fd9d98ce4
Author: Peter Krempa <pkrempa>
Date:   Wed Nov 7 15:19:47 2012 +0100

    nodeinfotest: Add test data for 2 processor host with broken NUMA
    
    This test data was gathered on an AMD MagnyCours machine that reports it
    has only one NUMA node although the hardware is consisting of 4. As
    duplicate core id's are ignored the reported topology was bogous. This
    should be fixed by the previous patch.
    
    Reported and data provided by George-Cristian Bîrzan.

commit 9576afd110b8c3edeb65f9b39448884763ca68bd
Author: Peter Krempa <pkrempa>
Date:   Wed Nov 7 14:53:36 2012 +0100

    nodeinfo: Add check and workaround to guarantee valid cpu topologies
    
    Lately there were a few reports of the output of the virsh nodeinfo
    command being inaccurate. This patch tries to avoid that by checking if
    the topology actually makes sense. If it doesn't we then report a
    synthetic topology that indicates to the user that the host capabilities
    should be checked for the actual topology.
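
The commit message above points users to the host capabilities for the actual topology. As a rough sketch of what that lookup can look like, the Python below reads the NUMA cells from a capabilities document. The XML is a small hand-written sample in the shape of the `<host><topology><cells>` section of `virsh capabilities` output, not a dump from the machine in this report, and the helper name `numa_cells` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like the <topology> section of
# `virsh capabilities`; it is NOT taken from the host in this bug.
SAMPLE_CAPABILITIES = """
<capabilities>
  <host>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <cpus num='2'>
            <cpu id='0'/>
            <cpu id='4'/>
          </cpus>
        </cell>
        <cell id='1'>
          <cpus num='2'>
            <cpu id='32'/>
            <cpu id='36'/>
          </cpus>
        </cell>
      </cells>
    </topology>
  </host>
</capabilities>
"""


def numa_cells(capabilities_xml):
    """Map each NUMA cell id to the list of CPU ids it contains."""
    root = ET.fromstring(capabilities_xml)
    cells = {}
    for cell in root.findall("./host/topology/cells/cell"):
        cpu_ids = [int(cpu.get("id")) for cpu in cell.findall("./cpus/cpu")]
        cells[int(cell.get("id"))] = cpu_ids
    return cells


cells = numa_cells(SAMPLE_CAPABILITIES)
for cell_id, cpu_ids in sorted(cells.items()):
    print(f"cell {cell_id}: cpus {cpu_ids}")
```

On a real host the XML string would come from `virsh capabilities` or the capabilities API rather than from an embedded sample.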

Comment 7 Eric Blake 2012-11-15 15:52:07 UTC
Should we move this back to ASSIGNED to also take in Viktor's upstream improvements?
https://www.redhat.com/archives/libvir-list/2012-November/msg00572.html

Comment 9 Wayne Sun 2012-11-20 09:38:44 UTC
# rpm -q libvirt qemu-kvm kernel
libvirt-0.10.2-9.el6.x86_64
qemu-kvm-0.12.1.2-2.295.el6.x86_64
kernel-2.6.32-279.el6.x86_64

On the same box as in description:

# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2593 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132101788 KiB

I have not checked all the patches in detail yet, but nodeinfo still fails to show the right info. Threads per core might be right now, but the host has 8 nodes, not 1, and I think cores per socket should be 8, not 64.

Hi Peter, what do you think?

Comment 10 Peter Krempa 2012-11-20 10:25:35 UTC
On unusual NUMA machines where we can't accurately detect the processor topology, the data reported in the virNodeInfo structure is modified to correctly report the maximum number of processors in the host. The modification is done according to this documentation:

nodes: the number of NUMA cell, 1 for unusual NUMA topologies or uniform memory access; check capabilities XML for the actual NUMA topology
sockets: number of CPU sockets per node if nodes > 1, 1 in case of unusual NUMA topology
cores: number of cores per socket, total number of processors in case of unusual NUMA topology
threads: number of threads per core, 1 in case of unusual numa topology
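
The documented fallback can be sketched in Python as follows. This is not libvirt's code (that is written in C), and the exact condition libvirt uses to decide that a topology is "unusual" is an assumption here: the sketch simply falls back whenever the four fields do not multiply out to the CPU count. The `NodeInfo` type and the `apply_fallback` name are hypothetical.

```python
from typing import NamedTuple


class NodeInfo(NamedTuple):
    """Illustrative stand-in for the topology fields of virNodeInfo."""
    cpus: int
    nodes: int
    sockets: int
    cores: int
    threads: int


def apply_fallback(info: NodeInfo) -> NodeInfo:
    """Return info unchanged if it is consistent, else the synthetic topology.

    Assumption: a topology counts as unusual when
    nodes * sockets * cores * threads differs from the CPU count.
    """
    if info.nodes * info.sockets * info.cores * info.threads == info.cpus:
        return info
    # Documented fallback: 1 node, 1 socket, all processors as cores, 1 thread.
    return NodeInfo(cpus=info.cpus, nodes=1, sockets=1,
                    cores=info.cpus, threads=1)


# Topology originally detected on the Bulldozer host (see the description).
detected = NodeInfo(cpus=64, nodes=8, sockets=1, cores=8, threads=2)
print(apply_fallback(detected))
```

For the detected values from the description, the result is 1 node, 1 socket, 64 cores and 1 thread, which matches the `virsh nodeinfo` output in comment 9.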

Comment 11 Wayne Sun 2012-11-21 03:11:45 UTC
Since the patches in comment #7 are not included, they were not checked. Peter explained in comment #10 how nodeinfo behaves on unusual NUMA machines, so the result in comment #9 is now expected.
So, this is fixed. Also tested on one usual NUMA box and one non-NUMA box; both work fine.

Comment 12 errata-xmlrpc 2013-02-21 07:26:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html

