Bug 908836

Summary: libvirt: wrong cpu topology - AMD Bulldozer 62XX family
Product: Red Hat Enterprise Linux 6 Reporter: Chris Pelland <cpelland>
Component: libvirt Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.4 CC: acathrow, asegundo, bazulay, berrange, bsarathy, cpelland, dallan, danken, dougsland, dyasny, dyuan, honzhang, iheim, jwest, ltroan, mzhan, pkrempa, rwu
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.10.2-18.el6_4.1 Doc Type: Bug Fix
Doc Text:
Cause: The AMD Bulldozer CPU architecture consists of so-called "modules". These can be represented either as separate cores or as separate threads, and management applications need to choose one of the two approaches. Libvirt did not provide enough information to do this.
Consequence: Management applications were not able to represent the modules of a Bulldozer CPU according to their needs.
Fix: The capabilities XML output now contains more information about the processor topology, so that management applications can extract the information they need.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-03-21 14:05:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 888503    
Bug Blocks: 907587    

Description Chris Pelland 2013-02-07 16:18:07 UTC
This bug has been copied from bug #888503 and has been proposed
to be backported to 6.4 z-stream (EUS).

Comment 8 hongming 2013-03-18 08:42:44 UTC
Verified as follows. The result is as expected, so the status is moved to VERIFIED.

Versions

# rpm -q libvirt
libvirt-0.10.2-18.el6_4.1.x86_64

# cat /proc/cpuinfo|grep "model name"|uniq
model name	: AMD Opteron(tm) Processor 6282 SE  


# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16349 MB
node 0 free: 15389 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15976 MB
node 2 cpus: 1 5 9 13 17 21 25 29
node 2 size: 16384 MB
node 2 free: 15753 MB
node 3 cpus: 33 37 41 45 49 53 57 61
node 3 size: 16384 MB
node 3 free: 15543 MB
node 4 cpus: 2 6 10 14 18 22 26 30
node 4 size: 16384 MB
node 4 free: 15925 MB
node 5 cpus: 34 38 42 46 50 54 58 62
node 5 size: 16384 MB
node 5 free: 15919 MB
node 6 cpus: 35 39 43 47 51 55 59 63
node 6 size: 16384 MB
node 6 free: 15817 MB
node 7 cpus: 3 7 11 15 19 23 27 31
node 7 size: 16367 MB
node 7 free: 15583 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  16  16  22  16  16  22  22 
  1:  16  10  16  22  22  22  16  22 
  2:  16  16  10  16  22  22  22  16 
  3:  22  22  16  10  22  16  22  16 
  4:  16  22  22  22  10  16  16  16 
  5:  16  22  22  16  16  10  22  22 
  6:  22  16  22  22  16  22  10  16 
  7:  22  22  16  16  16  22  16  10 


# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2593 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132035680 KiB

# virsh capabilities
<capabilities>
  ......
  <host>
  ......
    <topology>
      <cells num='8'>
        <cell id='0'>
          <cpus num='8'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0,4'/>
            <cpu id='4' socket_id='0' core_id='1' siblings='0,4'/>
            <cpu id='8' socket_id='0' core_id='2' siblings='8,12'/>
            <cpu id='12' socket_id='0' core_id='3' siblings='8,12'/>
            <cpu id='16' socket_id='0' core_id='4' siblings='16,20'/>
            <cpu id='20' socket_id='0' core_id='5' siblings='16,20'/>
            <cpu id='24' socket_id='0' core_id='6' siblings='24,28'/>
            <cpu id='28' socket_id='0' core_id='7' siblings='24,28'/>
          </cpus>
        </cell>
        <cell id='1'>
          <cpus num='8'>
            <cpu id='32' socket_id='0' core_id='0' siblings='32,36'/>
            <cpu id='36' socket_id='0' core_id='1' siblings='32,36'/>
            <cpu id='40' socket_id='0' core_id='2' siblings='40,44'/>
            <cpu id='44' socket_id='0' core_id='3' siblings='40,44'/>
            <cpu id='48' socket_id='0' core_id='4' siblings='48,52'/>
            <cpu id='52' socket_id='0' core_id='5' siblings='48,52'/>
            <cpu id='56' socket_id='0' core_id='6' siblings='56,60'/>
            <cpu id='60' socket_id='0' core_id='7' siblings='56,60'/>
          </cpus>
        </cell>
        <cell id='2'>
          <cpus num='8'>
            <cpu id='1' socket_id='1' core_id='0' siblings='1,5'/>
            <cpu id='5' socket_id='1' core_id='1' siblings='1,5'/>
            <cpu id='9' socket_id='1' core_id='2' siblings='9,13'/>
            <cpu id='13' socket_id='1' core_id='3' siblings='9,13'/>
            <cpu id='17' socket_id='1' core_id='4' siblings='17,21'/>
            <cpu id='21' socket_id='1' core_id='5' siblings='17,21'/>
            <cpu id='25' socket_id='1' core_id='6' siblings='25,29'/>
            <cpu id='29' socket_id='1' core_id='7' siblings='25,29'/>
          </cpus>
        </cell>
        <cell id='3'>
          <cpus num='8'>
            <cpu id='33' socket_id='1' core_id='0' siblings='33,37'/>
            <cpu id='37' socket_id='1' core_id='1' siblings='33,37'/>
            <cpu id='41' socket_id='1' core_id='2' siblings='41,45'/>
            <cpu id='45' socket_id='1' core_id='3' siblings='41,45'/>
            <cpu id='49' socket_id='1' core_id='4' siblings='49,53'/>
            <cpu id='53' socket_id='1' core_id='5' siblings='49,53'/>
            <cpu id='57' socket_id='1' core_id='6' siblings='57,61'/>
            <cpu id='61' socket_id='1' core_id='7' siblings='57,61'/>
          </cpus>
        </cell>
        <cell id='4'>
          <cpus num='8'>
            <cpu id='2' socket_id='2' core_id='0' siblings='2,6'/>
            <cpu id='6' socket_id='2' core_id='1' siblings='2,6'/>
            <cpu id='10' socket_id='2' core_id='2' siblings='10,14'/>
            <cpu id='14' socket_id='2' core_id='3' siblings='10,14'/>
            <cpu id='18' socket_id='2' core_id='4' siblings='18,22'/>
            <cpu id='22' socket_id='2' core_id='5' siblings='18,22'/>
            <cpu id='26' socket_id='2' core_id='6' siblings='26,30'/>
            <cpu id='30' socket_id='2' core_id='7' siblings='26,30'/>
          </cpus>
        </cell>
        <cell id='5'>
          <cpus num='8'>
            <cpu id='34' socket_id='2' core_id='0' siblings='34,38'/>
            <cpu id='38' socket_id='2' core_id='1' siblings='34,38'/>
            <cpu id='42' socket_id='2' core_id='2' siblings='42,46'/>
            <cpu id='46' socket_id='2' core_id='3' siblings='42,46'/>
            <cpu id='50' socket_id='2' core_id='4' siblings='50,54'/>
            <cpu id='54' socket_id='2' core_id='5' siblings='50,54'/>
            <cpu id='58' socket_id='2' core_id='6' siblings='58,62'/>
            <cpu id='62' socket_id='2' core_id='7' siblings='58,62'/>
          </cpus>
        </cell>
        <cell id='6'>
          <cpus num='8'>
            <cpu id='35' socket_id='3' core_id='0' siblings='35,39'/>
            <cpu id='39' socket_id='3' core_id='1' siblings='35,39'/>
            <cpu id='43' socket_id='3' core_id='2' siblings='43,47'/>
            <cpu id='47' socket_id='3' core_id='3' siblings='43,47'/>
            <cpu id='51' socket_id='3' core_id='4' siblings='51,55'/>
            <cpu id='55' socket_id='3' core_id='5' siblings='51,55'/>
            <cpu id='59' socket_id='3' core_id='6' siblings='59,63'/>
            <cpu id='63' socket_id='3' core_id='7' siblings='59,63'/>
          </cpus>
        </cell>
        <cell id='7'>
          <cpus num='8'>
            <cpu id='3' socket_id='3' core_id='0' siblings='3,7'/>
            <cpu id='7' socket_id='3' core_id='1' siblings='3,7'/>
            <cpu id='11' socket_id='3' core_id='2' siblings='11,15'/>
            <cpu id='15' socket_id='3' core_id='3' siblings='11,15'/>
            <cpu id='19' socket_id='3' core_id='4' siblings='19,23'/>
            <cpu id='23' socket_id='3' core_id='5' siblings='19,23'/>
            <cpu id='27' socket_id='3' core_id='6' siblings='27,31'/>
            <cpu id='31' socket_id='3' core_id='7' siblings='27,31'/>
          </cpus>
        </cell>
      </cells>
    </topology>
   ......
  </host>
  ......
</capabilities>

# ./numatest.py
Sockets: 4
Cores: 32
Threads: 64



# cat numatest.py
#!/usr/bin/python
# Amador Pahim <apahim>
# Jan 12 2013

from __future__ import print_function

import libvirt
from xml.dom import minidom

# Fetch the host capabilities XML from the local QEMU/KVM driver.
c = libvirt.open("qemu:///system")
caps = minidom.parseString(c.getCapabilities())
#caps = minidom.parse("capabilities.xml")

host = caps.getElementsByTagName('host')[0]
cells = host.getElementsByTagName('cells')[0]
cpus = cells.getElementsByTagName('cpu')

# Distinct socket_id values give the number of sockets.
socketIds = set(proc.getAttribute('socket_id') for proc in cpus)

# CPUs sharing the same 'siblings' value are counted as one core.
siblingsIds = set(proc.getAttribute('siblings') for proc in cpus)

print("Sockets:", len(socketIds))
print("Cores:", len(siblingsIds))
print("Threads:", cpus.length)
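
The same counts can also be derived offline from a saved capabilities XML, without a libvirt connection, which is what the commented-out minidom.parse("capabilities.xml") line hints at. The following is only a minimal sketch under stated assumptions: the function name count_topology and the four-CPU SAMPLE document are hypothetical, and SAMPLE is a shortened excerpt of cell 0 from the output above rather than a full capabilities dump. It treats every distinct 'siblings' value as one core, the same choice numatest.py makes.

```python
import xml.etree.ElementTree as ET


def count_topology(caps_xml):
    """Return (sockets, cores, threads) for a capabilities XML string.

    Hypothetical helper, not part of libvirt.
    """
    root = ET.fromstring(caps_xml)
    cpus = root.findall("./host/topology/cells/cell/cpus/cpu")
    # Distinct socket_id values give the number of sockets.
    sockets = set(cpu.get("socket_id") for cpu in cpus)
    # CPUs that share a 'siblings' value are counted as one core.
    cores = set(cpu.get("siblings") for cpu in cpus)
    return len(sockets), len(cores), len(cpus)


# Shortened excerpt of cell 0 from the verified host (assumption:
# trimmed to four CPUs for illustration).
SAMPLE = """<capabilities>
  <host>
    <topology>
      <cells num='1'>
        <cell id='0'>
          <cpus num='4'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0,4'/>
            <cpu id='4' socket_id='0' core_id='1' siblings='0,4'/>
            <cpu id='8' socket_id='0' core_id='2' siblings='8,12'/>
            <cpu id='12' socket_id='0' core_id='3' siblings='8,12'/>
          </cpus>
        </cell>
      </cells>
    </topology>
  </host>
</capabilities>"""

if __name__ == "__main__":
    sockets, cores, threads = count_topology(SAMPLE)
    print("Sockets:", sockets)
    print("Cores:", cores)
    print("Threads:", threads)
```

A management application that prefers to expose each module as two separate cores could report the thread count as its core count instead; the XML carries enough information for either choice.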

Comment 10 errata-xmlrpc 2013-03-21 14:05:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0664.html

Comment 13 Dave Allan 2013-04-04 18:15:37 UTC
Peter, do we have tests for this BZ in make check?

Comment 14 Peter Krempa 2013-04-04 19:13:17 UTC
No, there are no automated tests for that code. Unfortunately, the code partly depends on data provided by libnuma, and we currently have no way to trick it into reading test datasets.