Bug 1080436

Summary: [Intel 6.5.z Bug] virsh setvcpus can not setup correct vcpu number
Product: Red Hat Enterprise Linux 6
Reporter: Jan Kurik <jkurik>
Component: qemu-kvm
Assignee: Laszlo Ersek <lersek>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 6.5
CC: acathrow, areis, armbru, bsarathy, chao.zhou, chayang, chegu_vinod, ctatman, dbayly, donald.d.dugger, dyuan, ehabkost, gsun, honzhang, imammedo, jamorgan, jane.lv, jiajun.xu, jkurik, joseph.szczypek, jshortt, jsvarova, juzhang, jvillalo, jwilleford, lersek, lisa.mitchell, lsu, michen, mkenneth, mrezanin, mtessun, nigel.croxon, pkrempa, pm-eus, qzhang, rhod, ruwang, tdosek, trinh.dao, virt-maint, will.auld, xfu, xiantao.zhang, xiaolong.wang
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.415.el6_5.7
Doc Type: Bug Fix
Doc Text:
When hot unplugging a virtual CPU (vCPU) from a guest using libvirt, the current Red Hat Enterprise Linux QEMU implementation does not remove the corresponding vCPU thread. Because of this, libvirt previously did not correctly perceive the vCPU count after a vCPU had been hot unplugged. Consequently, an error occurred in libvirt, which prevented increasing the vCPU count after the hot unplug. In this update, information from QEMU is used to filter out inactive vCPU threads of disabled vCPUs, and the internal checks now pass and allow the hot plug.
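
The filtering described in the Doc Text can be illustrated with a minimal Python sketch. It is not the libvirt implementation; it only shows the idea of keeping the thread IDs of vCPUs whose "enabled-in-acpi" field is true. The function name active_vcpu_threads is hypothetical, the sample reply is the one captured after the hot unplug in comment 10, and treating a missing field as "enabled" is an assumption made here.

```python
import json

# query-cpus reply captured after hot unplugging down to 2 vCPUs (comment 10).
REPLY = (
    '{"return":['
    '{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":22453},'
    '{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":22454},'
    '{"enabled-in-acpi":false,"current":false,"CPU":2,"pc":-2130505407,"halted":true,"thread_id":22568},'
    '{"enabled-in-acpi":false,"current":false,"CPU":3,"pc":-2130505407,"halted":true,"thread_id":22570}'
    '],"id":"libvirt-22"}'
)


def active_vcpu_threads(reply_text):
    """Return the thread IDs of vCPUs that QEMU reports as enabled in ACPI.

    Hypothetical helper for illustration only. Assumption: an entry without
    the "enabled-in-acpi" field is treated as enabled.
    """
    reply = json.loads(reply_text)
    return [
        cpu["thread_id"]
        for cpu in reply["return"]
        if cpu.get("enabled-in-acpi", True)
    ]


if __name__ == "__main__":
    # Four threads exist, but only the first two belong to enabled vCPUs.
    print(active_vcpu_threads(REPLY))
```

Counting the filtered list, rather than every entry in the reply, gives the vCPU count that matches what the guest was asked to use.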
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-04-03 14:31:30 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1017858    
Bug Blocks: 994246, 1001319, 1024339, 1066473, 1069533, 1080394, 1080439    

Description Jan Kurik 2014-03-25 12:46:33 UTC
This bug has been copied from bug #1017858 and has been proposed
to be backported to 6.5 z-stream (EUS).

Comment 3 Tomas Dosek 2014-03-25 12:53:54 UTC
*** Bug 1080393 has been marked as a duplicate of this bug. ***

Comment 9 Qunfang Zhang 2014-03-28 01:17:49 UTC
Hello, Laszlo 

Besides bug verification and the general automated acceptance test, do you think we need to run additional functional tests (e.g., CPU hotplug)? Or are bug verification and acceptance testing enough?

Thanks,
Qunfang

Comment 10 Qunfang Zhang 2014-03-28 07:23:30 UTC
Verified this bug on qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64.

Host version:
kernel-2.6.32-431.11.2.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64
seabios-0.6.1.2-28.el6.x86_64

Guest:
kernel-2.6.32-431.el6.x86_64

Steps:
1. Boot a guest with virt-manager, with the maximum CPU allocation set to 4 and the current allocation set to 2.

[root@dell-per415-03 ~]# ps ax | grep kvm
 1151 ?        S      0:00 [kvm-irqfd-clean]
22447 ?        Sl     0:31 /usr/libexec/qemu-kvm -name rhel6 -S -M rhel6.5.0 -enable-kvm -m 2048 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -uuid 8299278b-f924-8c5b-0a2b-255abbdd356b -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel6.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/RHEL-Server-6.5-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=26 -device virtio-net-pci,__com_redhat_macvtap_compat=on,netdev=hostnet0,id=net0,mac=52:54:00:0b:94:a6,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

2. virsh # list  
 Id    Name                           State
----------------------------------------------------
 4     rhel6                          running

virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":22453},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":22454}],"id":"libvirt-8"}


3. Plug another cpu and check it.

virsh # setvcpus --live rhel6 3

virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":22453},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":22454},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130449717,"halted":true,"thread_id":22568}],"id":"libvirt-12"}

Now the "enabled-in-acpi" field is "true" for all 3 vcpus.

4. Plug the 4th vcpu and check it again.

virsh # setvcpus --live rhel6 4

virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":22453},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":22454},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130449717,"halted":true,"thread_id":22568},{"enabled-in-acpi":true,"current":false,"CPU":3,"pc":-2130449717,"halted":true,"thread_id":22570}],"id":"libvirt-16"}

Now the "enabled-in-acpi" field is "true" for all 4 vcpus.

5. Hot unplug 2 vcpus and check it.

virsh # setvcpus --live rhel6 2
error: Operation not supported: qemu didn't unplug the vCPUs properly

virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":22453},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":22454},{"enabled-in-acpi":false,"current":false,"CPU":2,"pc":-2130505407,"halted":true,"thread_id":22568},{"enabled-in-acpi":false,"current":false,"CPU":3,"pc":-2130505407,"halted":true,"thread_id":22570}],"id":"libvirt-22"}

Now the "enabled-in-acpi" field is "false" for cpu 2 and cpu 3.

Based on the above, this issue should be fixed. To be safe, I'm waiting for Laszlo's confirmation in bug 1081462.
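
The comparison between steps 4 and 5 can also be done mechanically. The following Python sketch is illustrative only: the helper name unplugged_with_live_thread is hypothetical, and the two samples are the "return" arrays from steps 4 and 5 trimmed to the fields the comparison needs. It lists the vCPUs that changed from enabled to disabled while keeping the same thread_id, which is the situation the qemu-kvm fix makes visible over QMP.

```python
# "return" arrays from steps 4 and 5, trimmed to the fields compared below.
BEFORE = [
    {"CPU": 0, "enabled-in-acpi": True, "thread_id": 22453},
    {"CPU": 1, "enabled-in-acpi": True, "thread_id": 22454},
    {"CPU": 2, "enabled-in-acpi": True, "thread_id": 22568},
    {"CPU": 3, "enabled-in-acpi": True, "thread_id": 22570},
]
AFTER = [
    {"CPU": 0, "enabled-in-acpi": True, "thread_id": 22453},
    {"CPU": 1, "enabled-in-acpi": True, "thread_id": 22454},
    {"CPU": 2, "enabled-in-acpi": False, "thread_id": 22568},
    {"CPU": 3, "enabled-in-acpi": False, "thread_id": 22570},
]


def unplugged_with_live_thread(before, after):
    """Return indexes of vCPUs that became disabled but kept their thread.

    Hypothetical helper: both arguments are parsed "return" arrays from
    two query-cpus replies taken before and after a hot unplug.
    """
    previous = {cpu["CPU"]: cpu for cpu in before}
    result = []
    for cpu in sorted(after, key=lambda c: c["CPU"]):
        old = previous.get(cpu["CPU"])
        if old is None:
            continue  # vCPU was not present in the earlier reply
        was_enabled = old["enabled-in-acpi"]
        now_disabled = not cpu["enabled-in-acpi"]
        same_thread = old["thread_id"] == cpu["thread_id"]
        if was_enabled and now_disabled and same_thread:
            result.append(cpu["CPU"])
    return result


if __name__ == "__main__":
    print(unplugged_with_live_thread(BEFORE, AFTER))
```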

Comment 11 Laszlo Ersek 2014-03-28 10:46:45 UTC
(In reply to Qunfang Zhang from comment #9)

> Besides bug verification and the general automated acceptance test, do you
> think we need to run additional functional tests (e.g., CPU hotplug)? Or are
> bug verification and acceptance testing enough?

The libvirt fix and the qemu-kvm fix go together. Customers care about vcpu hot(un)plug as an "undivided" functionality. So it does make sense to upgrade both libvirt and qemu-kvm, and do end-to-end testing.

I didn't include / require success on the virsh level in the original testing steps because I wasn't sure about the scheduling of the libvirt bugfix. The qemu-kvm fix can be verified on the QMP level in isolation.

But, again, once you have both the libvirt and qemu-kvm fix in place, it's best to test end-to-end, which gives you verification for both the libvirt BZ and the qemu-kvm BZ.
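
An end-to-end check of this kind could be scripted roughly as follows. This is a sketch under stated assumptions, not an official test: it uses only the two virsh commands that appear in this report (setvcpus --live and qemu-monitor-command with query-cpus), the helper names run_virsh, enabled_count and check_sequence are hypothetical, and it assumes the enabled count can be read immediately after setvcpus returns, with no settle delay.

```python
import json
import subprocess

QUERY_CPUS = '{ "execute" : "query-cpus" }'


def run_virsh(args):
    """Run virsh with the given arguments and return its standard output.

    Needs a host with libvirt and a running domain; raises on failure.
    """
    done = subprocess.run(
        ["virsh", *args], check=True, capture_output=True, text=True
    )
    return done.stdout


def enabled_count(domain, run=run_virsh):
    """Count the vCPUs that query-cpus reports with enabled-in-acpi true."""
    reply = json.loads(run(["qemu-monitor-command", domain, QUERY_CPUS]))
    return sum(1 for cpu in reply["return"] if cpu["enabled-in-acpi"])


def check_sequence(domain, counts, run=run_virsh):
    """Set each vCPU count in turn and verify it on the QMP level.

    Returns True only if every setvcpus call is followed by a matching
    enabled count. Assumption: no wait is needed between the two calls.
    """
    for wanted in counts:
        run(["setvcpus", "--live", domain, str(wanted)])
        if enabled_count(domain, run) != wanted:
            return False
    return True


if __name__ == "__main__":
    # Same sequence as the manual steps: 2 -> 3 -> 4 -> 2.
    print(check_sequence("rhel6", [3, 4, 2]))
```

The run parameter exists so the logic can be exercised with a stand-in function on a machine without libvirt; on a real host the default runner calls virsh directly.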

Also, I answered your needinfo in bug 1081462.

Thanks
Laszlo

Comment 12 Laszlo Ersek 2014-03-28 10:52:09 UTC
            libvirt      qemu-kvm     qemu-kvm-rhev
RHEL-6.6    bug 1066473         bug 1017858
RHEL-6.5.z  bug 1080439  bug 1080436  bug 1081462

Comment 13 Qunfang Zhang 2014-03-31 06:45:49 UTC
(In reply to Laszlo Ersek from comment #11)

Thank you, Laszlo. I tested again with the latest rhel6.5-z libvirt-0.10.2-29.el6_5.7.x86_64 installed. Now virsh no longer reports an error when hot unplugging vcpus.

Test steps:

Same as comment 10.

Host version:
kernel-2.6.32-431.7.1.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64
libvirt-0.10.2-29.el6_5.7.x86_64

[root@localhost ~]# virsh list 
 Id    Name                           State
----------------------------------------------------
 1     rhel6                          running

[root@localhost ~]# 
[root@localhost ~]# virsh 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":2499},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":2500}],"id":"libvirt-8"}

virsh # 
virsh # 
virsh #  setvcpus --live rhel6 3

virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":2499},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":2500},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130449717,"halted":true,"thread_id":3205}],"id":"libvirt-12"}

virsh # 
virsh # 
virsh # setvcpus --live rhel6 4

virsh # 
virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":2499},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":2500},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130449717,"halted":true,"thread_id":3205},{"enabled-in-acpi":true,"current":false,"CPU":3,"pc":-2130449717,"halted":true,"thread_id":3348}],"id":"libvirt-16"}

virsh # 
virsh # 
virsh # setvcpus --live rhel6 2

virsh # 
virsh # 
virsh # qemu-monitor-command rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":2499},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":2500},{"enabled-in-acpi":false,"current":false,"CPU":2,"pc":-2130505407,"halted":true,"thread_id":3205},{"enabled-in-acpi":false,"current":false,"CPU":3,"pc":-2130505407,"halted":true,"thread_id":3348}],"id":"libvirt-22"}

virsh #

Comment 16 errata-xmlrpc 2014-04-03 14:31:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0360.html