Bug 879303

Summary: hotplug 160 vcpu to win2k8r2 guest, some vcpu are offline in guest
Product: Red Hat Enterprise Linux 6
Reporter: FuXiangChun <xfu>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED CANTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Priority: low
Version: 6.4
CC: acathrow, areis, bsarathy, dyasny, dyuan, honzhang, jiahu, juzhang, mkenneth, virt-maint
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Windows
Doc Type: Bug Fix
Clones: 969352, 969354 (view as bug list)
Last Closed: 2013-05-31 09:52:37 UTC
Type: Bug
Bug Blocks: 969352, 969354    

Description FuXiangChun 2012-11-22 14:58:53 UTC
Description of problem:
Boot a win2k8r2 guest with the following scenarios and hotplug 160 vcpus:
     1.-smp 64,cores=4,thread=1,socket=40,maxcpus=160
     2.-smp 64,cores=1,thread=1,socket=1,maxcpus=160
     3.-smp 64,cores=1,thread=1,socket=64,maxcpus=160
scenario 1:
     119 cpus are online in guest
scenario 2:
     64 cpus are online in guest
scenario 3:
     119 cpus are online in guest

Version-Release number of selected component (if applicable):
# uname -r
2.6.32-342.el6.x86_64
qemu-kvm-0.12.1.2-2.334.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1./usr/libexec/qemu-kvm -enable-kvm -m 2G -smp 64,cores=4,thread=1,socket=40,maxcpus=160 -M rhel6.4.0 -name rhel6 -uuid ddcbfb49-3411-1701-3c36-6bdbc00bedb9 -rtc base=utc,clock=host,driftfix=slew -boot c -drive file=/mnt/win2k8r2-bak.raw,if=none,id=drive-virtio-0-1,format=raw,cache=none,werror=report,rerror=report -device virtio-blk-pci,drive=drive-virtio-0-1,id=virt0-0-1 -netdev tap,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:50:a4:c2:c5 -device virtio-balloon-pci,id=ballooning -monitor stdio  -qmp tcp:0:4455,server,nowait -drive file=5g.qcow2,format=qcow2,if=none,id=drive-disk,cache=none,werror=ignore,rerror=ignore -device virtio-blk-pci,scsi=off,drive=drive-disk,id=image -device sga -chardev socket,id=serial0,path=/var/test1,server,nowait -device isa-serial,chardev=serial0 -monitor unix:/tmp/monitor2,server,nowait -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -spice disable-ticketing,port=5911 -vga qxl

2. Hotplug the remaining vcpus with this small script (CPU 0 is already online, so onlining CPUs 1-159 brings the guest to 160):
# online vcpus 1..159 through the human monitor socket
i=1
while [ $i -lt 160 ]
do
    sleep 2
    echo "cpu_set $i online" | nc -U /tmp/monitor2
    i=$(($i+1))
done

3. Check the vcpu count in the guest.
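
The guest-side count can also be cross-checked from the host through the monitor; a minimal sketch, assuming the /tmp/monitor2 socket from step 1 (note this counts the vcpus qemu has created, not the ones the guest has actually onlined):

# count vcpus known to qemu; "info cpus" prints one line per vcpu
echo "info cpus" | nc -U /tmp/monitor2 | grep -c "CPU #"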
  
Actual results:
Only 64 vcpus are online according to Task Manager -> Performance in the guest.

Expected results:
160 vcpus are online.

Additional info:

Comment 2 FuXiangChun 2012-11-23 01:48:04 UTC
Booting the guest with -smp 1,cores=4,thread=1,socket=40,maxcpus=160 and then rebooting it after hotplug shows 160 vcpus in Task Manager -> Performance in the guest, so I think it isn't a bug.
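
For reference, the reboot in this workflow can also be driven from the host; a minimal sketch, assuming the /tmp/monitor2 socket from the reproduction steps (system_reset is a hard reset, so rebooting from inside the guest is the cleaner equivalent):

# hard-reset the guest after hotplug so WS picks up the remaining CPUs
echo "system_reset" | nc -U /tmp/monitor2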

Comment 3 juzhang 2012-11-23 03:34:45 UTC
Reopening this issue, since KVM QE does not know whether the upper management layer can handle it.

Issue summary
1. unexpected results   
1.1.-smp 64,cores=4,thread=1,socket=40,maxcpus=160
1.2.-smp 64,cores=1,thread=1,socket=1,maxcpus=160
1.3.-smp 64,cores=1,thread=1,socket=64,maxcpus=160
scenario 1:
     119 cpus are online in guest
scenario 2:
     64 cpus are online in guest
scenario 3:
     119 cpus are online in guest

2. expected results
2.1 -smp 1,cores=4,thread=1,socket=40,maxcpus=160

Results:
160 cpus are online in guest

Additional info:
KVM QE knows that win2k8r2 supports a maximum of 64 cpu sockets; the problem is that we do not know how the upper management layer handles -smp x,cores=x,thread=x,socket=x when booting a guest.
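
For what it's worth, the socket count each -smp line implies can be checked with simple arithmetic; a minimal sketch (the 64-socket figure is the win2k8r2 limit mentioned above):

# sockets implied by an -smp line = maxcpus / (cores * threads)
cores=4; threads=1; maxcpus=160
echo $(( maxcpus / (cores * threads) ))  # 40 -> within the 64-socket limit (scenario 1.1)
cores=1; threads=1
echo $(( maxcpus / (cores * threads) ))  # 160 -> exceeds the 64-socket limit (scenarios 1.2/1.3)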

Comment 7 Igor Mammedov 2013-05-22 15:59:48 UTC
(In reply to comment #3)
> Reopening this issue, since KVM QE does not know whether the upper management
> layer can handle it.

Defining the topology is up to the management layer, which knows what guest OS will be used.

Here is a link to the supported limits for Windows Server:
http://blogs.technet.com/b/matthts/archive/2012/10/14/windows-server-sockets-logical-processors-symmetric-multi-threading.aspx


In addition to the limits specified at the above link, WS will not online more than 8 hotplugged CPUs if it was started with fewer than 9 CPUs.
In that case it onlines up to 8 of them and only creates CPU devices for the rest (and asks for a restart to use them). If it is started with more than 8 CPUs, it onlines every hotplugged CPU up to the supported limits.
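
If that behavior is the explanation here, a guest meant to reach its full vcpu count purely via hotplug should be started with at least 9 CPUs; a minimal sketch of just the -smp part (the value 16 is an arbitrary example, all other options as in the reproduction command; note that qemu documents these keys in the plural, as sockets=/cores=/threads=):

# start above the 8-CPU threshold so WS onlines hotplugged CPUs immediately
/usr/libexec/qemu-kvm ... -smp 16,sockets=40,cores=4,threads=1,maxcpus=160 ...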

It looks like a WS limitation, so libvirt probably needs to take it into account (with instrumented ACPI in the BIOS I have traced that WS is notified about all hotplugged CPUs and gets valid status and _MAT values for every one of them).

There may be other topology limitations, but I wasn't able to find any docs about them.

> 
> Issue summary
> 1. unexpected results   
> 1.1.-smp 64,cores=4,thread=1,socket=40,maxcpus=160
> 1.2.-smp 64,cores=1,thread=1,socket=1,maxcpus=160
> 1.3.-smp 64,cores=1,thread=1,socket=64,maxcpus=160
> scenario 1:
>      119 cpus are online in guest
> scenario 2:
>      64 cpus are online in guest
> scenario 3:
>      119 cpus are online in guest
> 
> 2. expected results
> 2.1 -smp 1,cores=4,thread=1,socket=40,maxcpus=160
> 
> Results:
> 160 cpus are online in guest
> 
> Additional info:
> KVM QE knows that win2k8r2 supports a maximum of 64 cpu sockets; the problem
> is that we do not know how the upper management layer handles -smp x,
> cores=x,thread=x,socket=x when booting a guest.

Scenarios 1.2 and 1.3 are invalid because WS does not support those topologies.

Scenario 1.1 works for me with both the RHEL and upstream qemu versions.
The guest takes time to online all of the CPUs, so you have to wait until it finishes, or count the device nodes of processor type instead.

I used the following commands in the guest to get the number of online CPUs:
---
# get the number of online threads
get-wmiobject Win32_ComputerSystem -Property NumberOfLogicalProcessors

# get the number of online sockets (i.e. sockets with at least one online CPU)
get-wmiobject Win32_ComputerSystem -Property NumberOfProcessors
---

Comment 8 Igor Mammedov 2013-05-22 16:08:11 UTC
Reassigning to libvirt (as the mgmt layer) so that it would not be possible to start qemu with a wrong/unsupported topology for specific guests (i.e. WS).

Comment 9 Peter Krempa 2013-05-31 09:31:24 UTC
Libvirt isn't aware of the guest operating system being used and doesn't store OS-specific configuration details. This has to be done in an even higher management layer, such as virt-manager, RHEV, or others.

Comment 10 Peter Krempa 2013-05-31 09:52:37 UTC
I opened bz 969352 and bz 969354 to track the issue in the higher-level management layers, and I'm closing the libvirt bug as CANTFIX.