Bug 1017858

Summary: [Intel 6.6 Bug] virsh setvcpus can not setup correct vcpu number
Product: Red Hat Enterprise Linux 6
Reporter: chao.zhou
Component: qemu-kvm
Assignee: Laszlo Ersek <lersek>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 6.5
CC: areis, armbru, bsarathy, chao.zhou, chayang, chegu_vinod, ctatman, dbayly, dyuan, ehabkost, gsun, honzhang, imammedo, jamorgan, jane.lv, jiajun.xu, jkurik, jmiao, joseph.szczypek, jshortt, jsvarova, juzhang, jwilleford, keve.a.gabbert, lersek, lisa.mitchell, lsu, michen, mkenneth, mtessun, nigel.croxon, peterm, pkrempa, qzhang, rbalakri, rpacheco, ruwang, s.kieske, tdosek, trinh.dao, virt-maint, will.auld, xfu, xiantao.zhang, xiaolong.wang
Target Milestone: rc
Keywords: ZStream
Target Release: 6.6
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.423.el6
Doc Type: Bug Fix
Doc Text:
When hot unplugging a virtual CPU (vCPU) from a guest using libvirt, the current Red Hat Enterprise Linux QEMU implementation does not remove the corresponding vCPU thread. Because of this, libvirt previously did not correctly perceive the vCPU count after a vCPU had been hot unplugged. Consequently, an error occurred in libvirt, which prevented increasing the vCPU count after the hot unplug. With this update, information from QEMU is used to filter out inactive vCPU threads of disabled vCPUs, and the internal checks now pass and allow the hot plug.
Story Points: ---
Clone Of:
: 1080393 (view as bug list)
Environment:
Last Closed: 2014-10-14 06:52:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 994246, 1001319, 1024339, 1066473, 1069533, 1080393, 1080394, 1080436, 1081462, 1122165, 1123495    

Description chao.zhou 2013-10-10 15:56:31 UTC
Description of problem:

Boot up a RHEL 6.5 guest with 2 vCPUs and a maximum of 10 vCPUs:

virsh # vcpucount rhel6u5
maximum      config        10
maximum      live          10
current      config         2
current      live           2

virsh # qemu-agent-command rhel6u5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1}]}

Then set the vCPU number to 6:

virsh # setvcpus rhel6u5 6

virsh # vcpucount rhel6u5
maximum      config        10
maximum      live          10
current      config         2
current      live           6

virsh # qemu-agent-command rhel6u5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

There are 6 vCPUs online in the guest.
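These agent replies can be checked programmatically. A minimal sketch in Python (the JSON is copied from the reply above; online_vcpus is a name invented here for illustration):

```python
import json

# Reply shape as returned by the guest agent's guest-get-vcpus command,
# copied from the virsh qemu-agent-command output above.
reply = json.loads(
    '{"return":[{"online":true,"can-offline":false,"logical-id":0},'
    '{"online":true,"can-offline":true,"logical-id":1},'
    '{"online":true,"can-offline":true,"logical-id":2},'
    '{"online":true,"can-offline":true,"logical-id":3},'
    '{"online":true,"can-offline":true,"logical-id":4},'
    '{"online":true,"can-offline":true,"logical-id":5}]}'
)

def online_vcpus(reply):
    """Count vCPUs the guest agent reports as online."""
    return sum(1 for vcpu in reply["return"] if vcpu["online"])

print(online_vcpus(reply))  # 6, matching vcpucount's "current live"
```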

Then set the vCPU number to 4:

virsh # setvcpus rhel6u5 4

There will be an error:

error: Operation not supported: qemu didn't unplug the vCPUs properly

vcpucount will still report 6 active vCPUs in the guest:

virsh # vcpucount rhel6u5
maximum      config        10
maximum      live          10
current      config         2
current      live           6

but qemu-guest-agent reports that the vCPU number has been set to 4:

virsh # qemu-agent-command rhel6u5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3}]}

Then set the vCPU number to 8:

virsh # setvcpus rhel6u5 8

virsh # vcpucount rhel6u5
maximum      config        10
maximum      live          10
current      config         2
current      live           8

vcpucount shows 8 active vCPUs, but qemu-guest-agent still shows 4 active vCPUs in the guest:

virsh # qemu-agent-command rhel6u5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3}]}


Version-Release number of selected component (if applicable):

libvirt-0.10.2-27.el6.x86_64
qemu-kvm-0.12.1.2-2.406.el6.x86_64
kernel-2.6.32-420.el6.x86_64

How reproducible:

Always

Steps to Reproduce:
1. Create a guest and set a maximum vCPU number.
2. Hot-add and hot-remove vCPUs through virsh setvcpus.

Actual results:


Expected results:


Additional info:

Comment 2 Wayne Sun 2013-10-11 05:47:05 UTC
pkgs:
libvirt-0.10.2-29.el6.x86_64
qemu-kvm-0.12.1.2-2.410.el6.x86_64
kernel-2.6.32-421.el6.x86_64

steps:
# virsh vcpucount kvm-rhel6.4-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           2


# virsh qemu-agent-command kvm-rhel6.4-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1}]}

# virsh setvcpus kvm-rhel6.4-x86_64-qcow2-virtio 6

# virsh vcpucount kvm-rhel6.4-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           6

# virsh qemu-agent-command kvm-rhel6.4-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

# virsh setvcpus kvm-rhel6.4-x86_64-qcow2-virtio 4
error: Operation not supported: qemu didn't unplug the vCPUs properly

# virsh vcpucount kvm-rhel6.4-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           6

# virsh qemu-agent-command kvm-rhel6.4-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

Still 6 online, so I can't reproduce this. Can you update to the latest qemu and try again?

Comment 3 Wayne Sun 2013-10-11 06:02:11 UTC
For hot-unplug, only guest-agent-based vCPU hot-unplug is supported; check out bug 924400.

Comment 4 chao.zhou 2013-10-11 08:15:35 UTC
I see you're using a RHEL 6.4 guest. I've also tried a RHEL 6.4 guest and it works fine, as it did for you, but with a RHEL 6.5 guest the bug can be reproduced.

(In reply to Wayne Sun from comment #2)

Comment 5 chao.zhou 2013-10-11 08:17:12 UTC
Thanks! But I'm not authorized to access this bug; can you grant me access?

(In reply to Wayne Sun from comment #3)
> For hot-unplug, only guest-agent based vcpu hot-unplug is supported, check
> out bug 924400.

Comment 6 Wayne Sun 2013-10-11 09:18:16 UTC
(In reply to chao.zhou from comment #5)
> Thanks! But I'm not authorized to access this bug, can you grant my access?
> 
> (In reply to Wayne Sun from comment #3)
> > For hot-unplug, only guest-agent based vcpu hot-unplug is supported, check
> > out bug 924400.

No need to check the bug description, all steps are in comment 17:

https://bugzilla.redhat.com/show_bug.cgi?id=924400#c17

changed to a rhel6.5 guest:
# uname -r
2.6.32-422.el6.x86_64

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.5 Beta (Santiago)

# rpm -q qemu-guest-agent
qemu-guest-agent-0.12.1.2-2.412.el6.x86_64

On host:
# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           2

# virsh qemu-agent-command kvm-rhel6.5-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1}]}

# virsh setvcpus kvm-rhel6.5-x86_64-qcow2-virtio 6

# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           6

# virsh qemu-agent-command kvm-rhel6.5-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

# virsh setvcpus kvm-rhel6.5-x86_64-qcow2-virtio 4
error: Operation not supported: qemu didn't unplug the vCPUs properly

# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           6

# virsh qemu-agent-command kvm-rhel6.5-x86_64-qcow2-virtio '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3}]}

Yes, it did change.

Check in guest:
# cat /sys/devices/system/cpu/cpu*/online
1
1
1

Including cpu0, 4 CPUs are online, so there is a problem.
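The undercount in the listing above comes from cpu0 having no "online" file in sysfs (the boot CPU cannot be offlined), so cat prints one line per remaining CPU only. A self-contained sketch of the counting, using a fake sysfs tree; against a real guest, root would be /sys/devices/system/cpu:

```python
import glob
import os
import tempfile

# Build a fake sysfs tree mirroring the real /sys/devices/system/cpu layout:
# cpu0 has no "online" file, cpu1..cpu3 are online.
sysfs = tempfile.mkdtemp()
os.makedirs(os.path.join(sysfs, "cpu0"))          # boot CPU: no "online" file
for cpu in ("cpu1", "cpu2", "cpu3"):
    os.makedirs(os.path.join(sysfs, cpu))
    with open(os.path.join(sysfs, cpu, "online"), "w") as f:
        f.write("1\n")

def online_count(root):
    """Count online CPUs, treating a missing "online" file as always-online."""
    count = 0
    for d in glob.glob(os.path.join(root, "cpu[0-9]*")):
        path = os.path.join(d, "online")
        if not os.path.exists(path):
            count += 1                             # cpu0 is always online
        else:
            with open(path) as f:
                count += f.read().strip() == "1"
    return count

print(online_count(sysfs))  # 4, matching the agent's view after the failed unplug
```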

Comment 7 chao.zhou 2013-10-11 09:30:34 UTC
Can't access comment 17 either.


(In reply to Wayne Sun from comment #6)

Comment 8 Wayne Sun 2013-10-11 09:44:08 UTC
(In reply to chao.zhou from comment #7)
> Can't access to comment 17 either.
> 

OK, I'm not supposed to update the authorized group for it, so I'll just add the info here.

For guest-agent-based vCPU hot-unplug, the --guest option was added.

E.g., run vcpucount and setvcpus with --guest right after completing the steps in comment #6:
# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio --guest
4

# virsh setvcpus kvm-rhel6.5-x86_64-qcow2-virtio 3 --guest

# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio --guest
3

# virsh vcpucount kvm-rhel6.5-x86_64-qcow2-virtio
maximum      config        10
maximum      live          10
current      config         2
current      live           6

vcpucount (and vcpuinfo) shows different results here with and without --guest.

Comment 9 Jason Willeford 2013-10-11 17:01:45 UTC
Since the severity is medium and the issue is neither a data corruption nor a boot issue, moving to 6.6.

Comment 10 Peter Krempa 2013-10-14 15:03:45 UTC
When decreasing the vCPU count of a guest using the virsh setvcpus command, the guest actually offlines the vCPU in question (visible via the guest agent as described, in the kernel log buffer, and in the /sys filesystem), but qemu doesn't remove the corresponding emulator thread. The libvirt call then fails because it re-detects the CPUs afterwards.

A subsequent request to increase the processor count then doesn't re-plug CPUs that were offlined this way, although qemu reports them in "info cpus".

If you start a guest with 1 vCPU, hotplug one, unplug one, and then plug 2 more, your guest will only see 2 vCPUs.

Moving to qemu-kvm for further investigation.

Comment 11 Laszlo Ersek 2013-10-18 09:28:39 UTC
"virsh setvcpus" and "virsh setvcpus --guest" manage different things.

The former tries to do real VCPU hot(un)plug. The unplug part of that
doesn't work; it's a known limitation.

The latter instructs the guest agent to online/offline some of the VCPUs
that exist as a result of any hotplugging.

That is,

    max allocation >= current allocation >= online count
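As a trivial check, this layering can be expressed as an invariant; the values below are from the reproduction in comment 6, and check_vcpu_levels is an illustrative name invented here:

```python
def check_vcpu_levels(max_alloc, current_alloc, online_count):
    """Validate the layering described above:
    max allocation >= current allocation >= online count."""
    return max_alloc >= current_alloc >= online_count

# Comment 6 after the failed unplug: max 10, current 6 (per vcpucount),
# 4 online (per the guest agent). The invariant itself still holds...
print(check_vcpu_levels(10, 6, 4))   # True
# ...so the failure is not a violation of this layering, but a disagreement
# between qemu's and the guest's view of the "current allocation" level.
```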

IIUC, comment 6 does the following:

(1) Starts a guest with current_alloc == online_count == 2

(2) Hotplugs 4 further VCPUs, current_alloc == online_count == 6. I'm
    surprised this works in RHEL-6 qemu-kvm as I've been under the
    impression that real hotplug works in upstream qemu only.

(3) Tries to hot-unplug 2 VCPUs. This operation is currently not supported
    by RHEL-6 *or* upstream qemu, as far as I can tell. Strangely, the guest
    agent does seem to notice the change.


Let's see the monitor and guest agent command traffic between libvirt and
qemu-kvm and the guest agent while reproducing this.

Versions:
- libvirt-0.10.2-29.el6.x86_64
- qemu-kvm-0.12.1.2-2.414.el6.x86_64
- seabios-0.6.1.2-28.el6.x86_64
- guest kernel: 2.6.32-424.el6.x86_64
- guest agent: qemu-guest-agent-0.12.1.2-2.414.el6.x86_64

(a) start guest with
    max_alloc = 4
    current_alloc = 2
    online_count = 2

# virsh vcpucount seabios.rhel6
  maximum      config         4
  maximum      live           4
  current      config         2
  current      live           2


(b) hotplug 2 VCPUs

# virsh setvcpus seabios.rhel6 4
[succeeds]

Command traffic:

#1
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"cpu_set",
                                   "arguments":{"cpu":2,"state":"online"},
                                   "id":"libvirt-8"}' for write with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"id": "libvirt-8", "error":
                                   {"class": "CommandNotFound", "desc": "The
                                   command cpu_set has not been found",
                                   "data": {"name": "cpu_set"}}}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"id":
                                   "libvirt-8", "error": {"class":
                                   "CommandNotFound", "desc": "The command
                                   cpu_set has not been found", "data":
                                   {"name": "cpu_set"}}}
  qemuMonitorJSONIOProcess:225     : Total used 139 bytes out of 139
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c9fb0

#2
  qemuMonitorJSONSetCPU:2159       : cpu_set command not found, trying HMP
  qemuMonitorJSONCommandWithFd:267 : Send command
                                   '{"execute":"human-monitor-command",
                                   "arguments":{"command-line":"cpu_set 2
                                   online"},"id":"libvirt-9"}' for write
                                   with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"return": {}, "id":
                                   "libvirt-9"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return": {},
                                   "id": "libvirt-9"}
  qemuMonitorJSONIOProcess:225     : Total used 35 bytes out of 35 available
                                   in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c8ec0

#3
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"cpu_set",
                                   "arguments":{"cpu":3,"state":"online"},
                                   "id":"libvirt-10"}' for write with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"id": "libvirt-10", "error":
                                   {"class": "CommandNotFound", "desc": "The
                                   command cpu_set has not been found",
                                   "data": {"name": "cpu_set"}}}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"id":
                                   "libvirt-10", "error": {"class":
                                   "CommandNotFound", "desc": "The command
                                   cpu_set has not been found", "data":
                                   {"name": "cpu_set"}}}
  qemuMonitorJSONIOProcess:225     : Total used 140 bytes out of 140
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c5670

#4
  qemuMonitorJSONSetCPU:2159       : cpu_set command not found, trying HMP
  qemuMonitorJSONCommandWithFd:267 : Send command
                                   '{"execute":"human-monitor-command",
                                   "arguments":{"command-line":"cpu_set 3
                                   online"},"id":"libvirt-11"}' for write
                                   with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"return": {}, "id":
                                   "libvirt-11"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return": {},
                                   "id": "libvirt-11"}
  qemuMonitorJSONIOProcess:225     : Total used 36 bytes out of 36 available
                                   in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9e8490

#5
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"query-cpus",
                                   "id":"libvirt-12"}' for write with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"return": [{"current": true,
                                   "CPU": 0, "pc": -2128361938, "halted":
                                   false, "thread_id": 5131}, {"current":
                                   false, "CPU": 1, "pc": -2128361938,
                                   "halted": false, "thread_id": 5132},
                                   {"current": false, "CPU": 2, "pc":
                                   4294967280, "halted": false, "thread_id":
                                   5392}, {"current": false, "CPU": 3, "pc":
                                   4294967280, "halted": false, "thread_id":
                                   5393}], "id": "libvirt-12"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return":
                                   [{"current": true, "CPU": 0, "pc":
                                   -2128361938, "halted": false,
                                   "thread_id": 5131}, {"current": false,
                                   "CPU": 1, "pc": -2128361938, "halted":
                                   false, "thread_id": 5132}, {"current":
                                   false, "CPU": 2, "pc": 4294967280,
                                   "halted": false, "thread_id": 5392},
                                   {"current": false, "CPU": 3, "pc":
                                   4294967280, "halted": false, "thread_id":
                                   5393}], "id": "libvirt-12"}
  qemuMonitorJSONIOProcess:225     : Total used 371 bytes out of 371
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c2370

So, separately for each of the two new VCPUs, libvirt first tries to call
the "cpu_set" QMP command, and when that fails, it calls the "cpu_set" HMP
command. In the end it verifies current_alloc with the "query-cpus" command.
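The QMP-then-HMP fallback visible in the traffic can be sketched as follows. This is an illustration only: send_qmp and fake_monitor are stand-ins invented here, not libvirt's actual functions, and the fake monitor mimics qemu-kvm-0.12's behavior of lacking the QMP command:

```python
def set_cpu_state(send_qmp, cpu, state):
    # Try the QMP "cpu_set" command first, as libvirt does above.
    reply = send_qmp({"execute": "cpu_set",
                      "arguments": {"cpu": cpu, "state": state}})
    if "error" in reply and reply["error"]["class"] == "CommandNotFound":
        # RHEL-6 qemu-kvm lacks the QMP command; fall back to running the
        # HMP "cpu_set" via the "human-monitor-command" wrapper.
        reply = send_qmp({"execute": "human-monitor-command",
                          "arguments": {"command-line":
                                        "cpu_set %d %s" % (cpu, state)}})
    return reply

# Fake monitor that only implements the HMP wrapper, like qemu-kvm-0.12.
def fake_monitor(cmd):
    if cmd["execute"] == "cpu_set":
        return {"error": {"class": "CommandNotFound",
                          "desc": "The command cpu_set has not been found"}}
    return {"return": {}}

print(set_cpu_state(fake_monitor, 2, "online"))  # {'return': {}}
```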

The guest dmesg says

  CPU 2 got hotplugged
  CPU 3 got hotplugged
  Booting Node 0 Processor 2 APIC 0x2
  kvm-clock: cpu 2, msr 0:23167c1, secondary cpu clock
  Disabled fast string operations
  TSC synchronization [CPU#0 -> CPU#2]:
  Measured 368227350316 cycles TSC warp between CPUs, turning off TSC clock.
  Marking TSC unstable due to check_tsc_sync_source failed
  kvm-stealtime: cpu 2, msr 230e880
  Will online and init hotplugged CPU: 2
  Booting Node 0 Processor 3 APIC 0x3
  kvm-clock: cpu 3, msr 0:23967c1, secondary cpu clock
  Disabled fast string operations
  kvm-stealtime: cpu 3, msr 238e880
  Will online and init hotplugged CPU: 3

And /proc/cpuinfo in the guest reports processors 0 to 3 inclusive (ie.
online_count==4).

On the host,

# virsh vcpucount seabios.rhel6
maximum      config         4
maximum      live           4
current      config         2
current      live           4

Therefore max_alloc == current_alloc == online_count == 4.


(c) try to hot-unplug 2 VCPUs:

# virsh setvcpus seabios.rhel6 2
error: Operation not supported: qemu didn't unplug the vCPUs properly

Command traffic:

#1
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"cpu_set",
                                   "arguments":{"cpu":3,"state":"offline"},
                                   "id":"libvirt-13"}' for write with FD -1

  qemuMonitorJSONIOProcessLine:154 : Line [{"id": "libvirt-13", "error":
                                   {"class": "CommandNotFound", "desc": "The
                                   command cpu_set has not been found",
                                   "data": {"name": "cpu_set"}}}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"id":
                                   "libvirt-13", "error": {"class":
                                   "CommandNotFound", "desc": "The command
                                   cpu_set has not been found", "data":
                                   {"name": "cpu_set"}}}
  qemuMonitorJSONIOProcess:225     : Total used 140 bytes out of 140
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c2200
  qemuMonitorJSONSetCPU:2159       : cpu_set command not found, trying HMP

#2
  qemuMonitorJSONCommandWithFd:267 : Send command
                                   '{"execute":"human-monitor-command",
                                   "arguments":{"command-line":"cpu_set 3
                                   offline"},"id":"libvirt-14"}' for write
                                   with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"return": {}, "id":
                                   "libvirt-14"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return": {},
                                   "id": "libvirt-14"}
  qemuMonitorJSONIOProcess:225     : Total used 36 bytes out of 36 available
                                   in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c3ac0

#3
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"cpu_set",
                                   "arguments":{"cpu":2,"state":"offline"},
                                   "id":"libvirt-15"}' for write with FD -1

  qemuMonitorJSONIOProcessLine:154 : Line [{"id": "libvirt-15", "error":
                                   {"class": "CommandNotFound", "desc": "The
                                   command cpu_set has not been found",
                                   "data": {"name": "cpu_set"}}}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"id":
                                   "libvirt-15", "error": {"class":
                                   "CommandNotFound", "desc": "The command
                                   cpu_set has not been found", "data":
                                   {"name": "cpu_set"}}}
  qemuMonitorJSONIOProcess:225     : Total used 140 bytes out of 140
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c3eb0
  qemuMonitorJSONSetCPU:2159       : cpu_set command not found, trying HMP

#4
  qemuMonitorJSONCommandWithFd:267 : Send command
                                   '{"execute":"human-monitor-command",
                                   "arguments":{"command-line":"cpu_set 2
                                   offline"},"id":"libvirt-16"}' for write
                                   with FD -1
  qemuMonitorJSONIOProcessLine:154 : Line [{"return": {}, "id":
                                   "libvirt-16"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return": {},
                                   "id": "libvirt-16"}
  qemuMonitorJSONIOProcess:225     : Total used 36 bytes out of 36 available
                                   in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c2f80

#5
  qemuMonitorJSONCommandWithFd:267 : Send command '{"execute":"query-cpus",
                                   "id":"libvirt-17"}' for write with FD -1

  qemuMonitorJSONIOProcessLine:154 : Line [{"return": [{"current": true,
                                   "CPU": 0, "pc": -2130445398, "halted":
                                   false, "thread_id": 5131}, {"current":
                                   false, "CPU": 1, "pc": -2130449205,
                                   "halted": true, "thread_id": 5132},
                                   {"current": false, "CPU": 2, "pc":
                                   -2130449205, "halted": true, "thread_id":
                                   5392}, {"current": false, "CPU": 3, "pc":
                                   -2130449205, "halted": true, "thread_id":
                                   5393}], "id": "libvirt-17"}]
  qemuMonitorJSONIOProcessLine:174 : QEMU_MONITOR_RECV_REPLY:
                                   mon=0x7fca0c015660 reply={"return":
                                   [{"current": true, "CPU": 0, "pc":
                                   -2130445398, "halted": false,
                                   "thread_id": 5131}, {"current": false,
                                   "CPU": 1, "pc": -2130449205, "halted":
                                   true, "thread_id": 5132}, {"current":
                                   false, "CPU": 2, "pc": -2130449205,
                                   "halted": true, "thread_id": 5392},
                                   {"current": false, "CPU": 3, "pc":
                                   -2130449205, "halted": true, "thread_id":
                                   5393}], "id": "libvirt-17"}
  qemuMonitorJSONIOProcess:225     : Total used 370 bytes out of 370
                                   available in buffer
  qemuMonitorJSONCommandWithFd:272 : Receive command reply ret=0
                                   rxObject=0x9c2c20
  qemudDomainHotplugVcpus:3960     : Operation not supported: qemu didn't
                                   unplug the vCPUs properly


Libvirt sends two "cpu_set" commands similarly, in reverse order, always
trying QMP first, then HMP. The final "query-cpus" command reports 4 VCPUs,
3 of them halted. (After they had been hotplugged in step (b), none were
reported halted.)
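Note that the "halted" flag alone cannot distinguish an unplugged vCPU from a merely idle one: VCPU 1 is reported halted simply because it is idle in the guest. Reduced to the relevant fields of the query-cpus reply above:

```python
# query-cpus reply from step (c), keeping only the fields that matter here.
cpus = [
    {"CPU": 0, "halted": False},
    {"CPU": 1, "halted": True},   # still plugged and online, just idle (HLT)
    {"CPU": 2, "halted": True},   # actually hot-unplugged
    {"CPU": 3, "halted": True},   # actually hot-unplugged
]

print(len(cpus))                                # 4: what libvirt counts
print(sum(1 for c in cpus if not c["halted"]))  # 1: not the guest's 2 either
```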

guest dmesg:

  CPU 3 is now offline
  CPU 2 is now offline

guest /proc/cpuinfo: reports processors 0 to 1 inclusive (ie. online_count
== 2)

Still in the guest:

  # ls -d -1 /sys/devices/system/cpu/cpu?
  /sys/devices/system/cpu/cpu0
  /sys/devices/system/cpu/cpu1

This means that VCPUs 2 and 3 have *really* been hot-unplugged, not just
offlined.

This can be confirmed with the following command on the host:

# virsh qemu-agent-command seabios.rhel6 '{"execute":"guest-get-vcpus"}'
  {"return":[{"online":true,"can-offline":false,"logical-id":0},
             {"online":true,"can-offline":true,"logical-id":1}]}

So, there are 2 VCPUs in total now; *not* 4 VCPUs with 2 of them offline.
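This count can be derived mechanically: the total number of entries in the "guest-get-vcpus" reply is the number of VCPUs the guest kernel knows about, and the "online" flags mark the subset it is actually using. A minimal sketch (the JSON literal mirrors the agent reply shown above):

```python
import json

# Reply from "guest-get-vcpus" as shown above: only 2 entries in total,
# i.e. the guest kernel knows about just 2 VCPUs after the unplug.
reply = json.loads(
    '{"return":[{"online":true,"can-offline":false,"logical-id":0},'
    '{"online":true,"can-offline":true,"logical-id":1}]}'
)

vcpus = reply["return"]
total = len(vcpus)                             # VCPUs known to the guest kernel
online = sum(1 for v in vcpus if v["online"])  # VCPUs the kernel has enabled

print(total, online)  # → 2 2
```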

Still on the host:

  # virsh vcpucount seabios.rhel6
  maximum      config         4
  maximum      live           4
  current      config         2
  current      live           4

Libvirt has no knowledge of the change.


In summary:
- We've used the "cpu_set" HMP command for both hotplug and hot-unplug.
  (Implemented by qemu_system_cpu_hot_add() in "hw/acpi.c".)

- We have not used the guest agent for VCPU onlining/offlining.

- Looking at the guest kernel, both hotplug and hot-unplug seem to work
  correctly.

- The "query-cpus" QMP command does not reflect the result of hot-unplug.

- This misleads libvirt.

- After this point it makes no sense to expect further VCPU operations to
  work, since libvirt (via "query-cpus") and the guest kernel have a
  de-synchronized view of the number of VCPUs.

I think that "query-cpus" may not be an appropriate operation to verify the
success of "cpu_set". This command is implemented by do_info_cpus()
[monitor.c], and the result depends on the number of CPUState objects (VCPU
threads?) that qemu maintains. This number does not decrease at hot-unplug.

So,
- either the number of CPUState objects should decrease at hot-unplug (so
  that "query-cpus" would report the correct number when it reports the
  number of CPUState objects),

- or "query-cpus" should key its retval off some other quantity, so that it
  would match the guest kernel's worldview,

- or libvirt should call something else than "query-cpus" to verify the
  result of hotplug / hot-unplug.


... After some more thinking, I think we have the following four layers:

(i)   max allocation
(ii)  current allocation
(iii) ACPI online / offline
(iv)  guest kernel online / offline

The "query-cpus" command works on (ii).

The "cpu_set" command works on (ii) *and* (iii) when adding a completely new
VCPU:

  qemu_system_cpu_hot_add()
    pc_new_cpu()              --> (ii)
    enable_processor()        --> (iii)

However "cpu_set" only affects (iii) when removing a VCPU:

  qemu_system_cpu_hot_add()
    disable_processor()       --> (iii)

This kills off the VCPU in the guest kernel for good (see step (c) above),
but qemu will retain all related resources, hence the discrepancy in the
output of "query-cpus".
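The asymmetry can be modeled in a few lines (a toy sketch, not actual qemu code): hot-add grows both the thread list (ii) and the ACPI-enabled set (iii), while hot-remove only shrinks (iii), so a count based on the thread list — which is effectively what "query-cpus" reports — never goes back down.

```python
class ToyVM:
    """Toy model of layers (ii) and (iii) above; not real qemu code."""
    def __init__(self, initial):
        self.threads = set(range(initial))       # (ii) VCPU threads / CPUState objects
        self.acpi_enabled = set(range(initial))  # (iii) ACPI-online VCPUs

    def cpu_set(self, index, online):
        if online:
            self.threads.add(index)       # pc_new_cpu()       --> (ii)
            self.acpi_enabled.add(index)  # enable_processor() --> (iii)
        else:
            self.acpi_enabled.discard(index)  # disable_processor() --> (iii) only

    def query_cpus(self):
        return len(self.threads)  # what "query-cpus" effectively counts

vm = ToyVM(2)
vm.cpu_set(2, True)   # hotplug VCPU 2
vm.cpu_set(3, True)   # hotplug VCPU 3
vm.cpu_set(3, False)  # hot-unplug VCPU 3
vm.cpu_set(2, False)  # hot-unplug VCPU 2

# The discrepancy: "query-cpus" still sees 4 VCPUs, ACPI only 2.
print(vm.query_cpus(), len(vm.acpi_enabled))  # → 4 2
```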

The guest agent works on (iv).


For now I would recommend to use

  virsh setvcpus --guest

only, which changes the online count inside the guest (iv):

  max_alloc == current_alloc == ACPI_online_count >= guest_online_count

It does not involve "query-cpus", "cpu_set", or ACPI; only the guest agent
("guest-set-vcpus" & "guest-get-vcpus") and the guest kernel's sysfs.


Igor, any thoughts? Thanks.

Comment 12 Qunfang Zhang 2013-11-07 10:28:38 UTC
Reproduced this bug on qemu-kvm-0.12.1.2-2.415.el6.x86_64.

(1) Start a rhel6.5 guest with 2 vCPUs at the beginning.

virsh # qemu-agent-command rhel6.5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1}]}

virsh # 
virsh # vcpucount rhel6.5
maximum      config         8
maximum      live           8
current      config         2
current      live           2

virsh # 

virsh # qemu-agent-command rhel6.5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1}]}

(2) Set the vCPU count to 6.

virsh # setvcpus rhel6.5 6

virsh # 
virsh # qemu-agent-command rhel6.5 '{"execute":"guest-get-vcpus"}
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

virsh # vcpucount rhel6.5
maximum      config         8
maximum      live           8
current      config         2
current      live           6


(3) Set the vCPU count to 4.
 
virsh # setvcpus rhel6.5 4
error: Operation not supported: qemu didn't unplug the vCPUs properly

virsh # 
virsh # vcpucount rhel6.5
maximum      config         8
maximum      live           8
current      config         2
current      live           6

virsh # qemu-agent-command rhel6.5 '{"execute":"guest-get-vcpus"}'
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3}]}

virsh # 
virsh # 

(4) Set the vCPU count to 8.

virsh # setvcpus rhel6.5 8

virsh # 
virsh # vcpucount rhel6.5
maximum      config         8
maximum      live           8
current      config         2
current      live           8

virsh # 
virsh # 
virsh # qemu-agent-command rhel6.5 '{"execute":"guest-get-vcpus"}
{"return":[{"online":true,"can-offline":false,"logical-id":0},{"online":true,"can-offline":true,"logical-id":1},{"online":true,"can-offline":true,"logical-id":2},{"online":true,"can-offline":true,"logical-id":3},{"online":true,"can-offline":true,"logical-id":4},{"online":true,"can-offline":true,"logical-id":5}]}

As a result, "guest-get-vcpus" still shows 6 vCPUs. See Laszlo's detailed analysis in comment 11. I'll ack it first and wait for more input from the developer about the proper solution or workaround.

Comment 14 Igor Mammedov 2014-01-15 11:05:43 UTC
Laszlo,
'setvcpus --guest' could be improved a little if it internally supported vcpu
hot-add whenever "current live" is less than the requested amount, to bump up the underlying VCPUs.

As for vcpucount, perhaps there should be a corresponding "vcpucount --guest" that would report onlined/offlined CPUs as "current live".

That would produce consistent results and wouldn't be mixed up with the actual VCPU hotplug commands.

Comment 15 Laszlo Ersek 2014-01-15 12:37:48 UTC
The root of the problem is that libvirt uses the "query-cpus" monitor command, which simply does not return what we care about.

Let's remember that we have the following four counts:

max_alloc >= nr of VCPU threads >= ACPI_online_count >= guest_online_count

- max_alloc is the static limit
- nr of VCPU threads is the number of VCPU threads in qemu
- ACPI_online_count is the number of VCPUs the guest kernel is *aware* of
- guest_online_count is the number of VCPUs the guest kernel chooses to enable

We need to provide the user with the following abstraction:

- max_alloc is exposed directly,

- we don't care about the number of VCPU threads in qemu at all,

- the other two are compressed down to a single value (current_alloc), which,

  - when the user queries it, returns guest_online_count (because
    simply that is what guest processes can utilize)

  - when the user sets it, sets both ACPI_online_count and
    guest_online_count

Correspondingly, what about the following algorithms for libvirt (both suggested only for the case when the guest is running):


1. "virsh vcpucount": invoke the "guest-get-vcpus" guest agent command, and return the number of the online:true entries. Done.

This is currently available via the "vcpucount --guest" command, it just needs to be documented as *THE* command to use.


2. "virsh setvcpus num_requested":

2.0. Do not use "query-cpus" at all. *Ever*.

2.1. Invoke the "guest-get-vcpus" qemu-agent command. By counting *all* returned entries, we can derive ACPI_online_count.

2.2. Set ACPI_online_count to num_requested by calling "cpu_set".
- This takes care of any new VCPU threads in qemu,
- handles ACPI enablement / disablement,
- it may leave "guest_online_count" below ACPI_online_count, which is fine at this point.

2.3. Call the "guest-get-vcpus" command again. By counting *all* returned entries, we can verify if setting ACPI_online_count has succeeded.

2.4. Flip all "online" fields in the retval of step 2.3 to "true", and call "guest-set-vcpus". This syncs guest_online_count to ACPI_online_count (ie. the guest kernel will use exactly those VCPUs that it knows about via ACPI).

This algorithm is also idempotent, you can run it as many times in succession as you want, and it works in the face of any current
(VCPU thread count, ACPI_online_count, guest_online_count) triplet.
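A sketch of steps 2.1–2.4, written against hypothetical monitor/agent stand-ins (the real interfaces live inside libvirt; this only illustrates the control flow and its idempotency):

```python
class ToyAgent:
    """Stand-in for the guest agent; 'acpi' is the set of ACPI-online VCPUs."""
    def __init__(self, acpi_count):
        self.acpi = set(range(acpi_count))
        self.online = set(self.acpi)

    def guest_get_vcpus(self):
        # One entry per VCPU the guest kernel knows about (online or not).
        return [{"logical-id": i, "online": i in self.online}
                for i in sorted(self.acpi)]

    def guest_set_vcpus(self, vcpus):
        self.online = {v["logical-id"] for v in vcpus if v["online"]}

class ToyMonitor:
    """Stand-in for the monitor's "cpu_set" command (layer (iii))."""
    def __init__(self, agent):
        self.agent = agent

    def cpu_set(self, idx, state):
        if state == "online":
            self.agent.acpi.add(idx)      # enable_processor()
        else:
            self.agent.acpi.discard(idx)  # disable_processor()

def set_vcpus(monitor, agent, num_requested):
    # 2.1. ACPI_online_count = number of *all* returned entries.
    acpi_count = len(agent.guest_get_vcpus())
    # 2.2. Move ACPI_online_count to num_requested via "cpu_set".
    for idx in range(acpi_count, num_requested):
        monitor.cpu_set(idx, "online")
    for idx in range(num_requested, acpi_count):
        monitor.cpu_set(idx, "offline")
    # 2.3. Re-query and verify the new ACPI_online_count.
    vcpus = agent.guest_get_vcpus()
    if len(vcpus) != num_requested:
        raise RuntimeError("setting ACPI_online_count failed")
    # 2.4. Flip every "online" field to true and push it back, syncing
    # guest_online_count up to ACPI_online_count.
    for v in vcpus:
        v["online"] = True
    agent.guest_set_vcpus(vcpus)

agent = ToyAgent(6)
set_vcpus(ToyMonitor(agent), agent, 4)
set_vcpus(ToyMonitor(agent), agent, 4)  # idempotent: a second run is a no-op
print(len(agent.acpi), len(agent.online))  # → 4 4
```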

What do you think, Peter? Thanks!

Comment 16 Peter Krempa 2014-01-15 13:36:29 UTC
(In reply to Laszlo Ersek from comment #15)
> The root of the problem is that libvirt uses the "query-cpus" monitor
> command, which simply does not return what we care about.
> 
> Let's remember that we have the following four counts:
> 
> max_alloc >= nr of VCPU threads >= ACPI_online_count >= guest_online_count
> 
> - max_alloc is the static limit
> - nr of VCPU threads is the number of VCPU threads in qemu
> - ACPI_online_count is the number of VCPUs the guest kernel is *aware* of
> - guest_online_count is the number of VCPUs the guest kernel chooses to
> enable
> 
> We need to provide the user with the following abstraction:
> 
> - max_alloc is exposed directly,
> 
> - we don't care about the number of VCPU threads in qemu at all,

Well, libvirt partially cares about the actual VCPU threads, as we pin the vCPU threads to physical CPU threads if the user wishes to do so. Additionally, the count of vCPU threads is used to derive ACPI_online_count, as there is currently no other means (not requiring the guest agent) to do so.

> - the other two are compressed down to a single value (current_alloc), which,
> 
>   - when the user queries it, returns guest_online_count (because
>     simply that is what guest processes can utilize)
> 
>   - when the user sets it, sets both ACPI_online_count and
>     guest_online_count
> 
> Correspondingly, what about the following algorithms for libvirt (both
> suggested only for the case when the guest is running):
> 
> 
> 1. "virsh vcpucount": invoke the "guest-get-vcpus" guest agent command, and
> return the number of the online:true entries. Done.
> 
> This is currently available via the "vcpucount --guest" command, it just
> needs to be documented as *THE* command to use.

I'll have to check the docs to see whether the difference between those two values is clear enough.

> 
> 2. "virsh setvcpus num_requested":
> 
> 2.0. Do not use "query-cpus" at all. *Ever*.

Is there a different - better - command to query the ACPI cpu count available to the guest without interaction of the guest agent?

> 
> 2.1. Invoke the "guest-get-vcpus" qemu-agent command. By counting *all*
> returned entries, we can derive ACPI_online_count.
> 
> 2.2. Set ACPI_online_count to num_requested by calling "cpu_set".
> - This takes care of any new VCPU threads in qemu,
> - handles ACPI enablement / disablement,
> - it may leave "guest_online_count" below ACPI_online_count, which is fine
> at this point.
> 
> 2.3. Call the "guest-get-vcpus" command again. By counting *all* returned
> entries, we can verify if setting ACPI_online_count has succeeded.
> 
> 2.4. Flip all "online" fields in the retval of step 2.3 to "true", and call
> "guest-set-vcpus". This syncs guest_online_count to ACPI_online_count (ie.
> the guest kernel will use exactly those VCPUs that it knows about via ACPI).

Well, this algorithm would be great if it didn't require the guest agent. Since libvirt can't be sure that the guest runs a functioning guest agent that provides accurate values, and that the agent doesn't get stuck or respond with delays, we have to differentiate between the two approaches to counting the available CPUs. If a user wishes to utilize the agent, the command is available; but as with other guest agent commands, we can't ensure that the returned data is accurate or that the guest actually fulfilled the requested command.
> 
> This algorithm is also idempotent, you can run it as many times in
> succession as you want, and it works in the face of any current
> (VCPU thread count, ACPI_online_count, guest_online_count) triplet.
> 
> What do you think, Peter? Thanks!

Comment 17 Laszlo Ersek 2014-01-15 15:38:24 UTC
(In reply to Peter Krempa from comment #16)

> Well, libvirt partially cares about the actual VCPU threads as we are
> pinning the vCPU threads to physical cpu threads if the user wishes to do
> so. Additionally the count of vCPU threads is used to count the
> ACPI_online_count as there is currently no other mean (not requiring the
> guest agent) to do so.

Good point. Unfortunately, the number of VCPU threads and ACPI_online_count diverge as soon as at least one VCPU is removed at the ACPI level (because the number of VCPU threads can never decrease).

> > 2.0. Do not use "query-cpus" at all. *Ever*.
> 
> Is there a different - better - command to query the ACPI cpu count
> available to the guest without interaction of the guest agent?

Not to my knowledge.

If that were preferable, I could implement a RHEL-6-only QMP command that returns the ACPI bitmask. The output schema could be similar to that of "guest-get-vcpus". It doesn't seem too hard to implement in qemu; the things I'll need help with are:
- designing the QMP interface so that it's acceptable for libvirt,
- agreement with the rest of the team that we won't upstream this feature
  (Igor, Markus, ...); or, ideas how to upstream it,
- if some kind of introspection would be needed before accessing the new query.

If we implement this new QMP query, then perhaps the current setvcpus virsh command could be made to work with minimal updates: just rebase it from "query-cpus" to the new QMP command. Then the --guest variant could remain separate and optional.

... Actually, the *number* of ACPI online VCPUs is already available in RHEL-6 qemu, see acpi_online_cpu_count(). I could expose just the count, or the bitmap:
- VCPU nr (which is the input of cpu_set)
- VCPU apic_id
- online status

Thanks.

Comment 18 Laszlo Ersek 2014-01-17 19:19:54 UTC
I'll have to read this thread:
http://news.gmane.org/find-root.php?message_id=20140117191355.GB2221@otherpad.lan.raisama.net

Comment 19 Eduardo Habkost 2014-01-17 19:33:28 UTC
(In reply to Laszlo Ersek from comment #18)
> I'll have to read this thread:
> http://news.gmane.org/find-root.php?message_id=20140117191355.GB2221@otherpad.lan.raisama.net

The cpu_set code I see on RHEL-6 converts the "index" argument to the correct APIC ID. But now I need to read the discussion in this bug to understand if there are other libvirt-visible interfaces that are affected when cpu_index != apic_id.


(Note: cpu_index != apic_id if and only if the number of threads-per-core or cores-per-socket is not a power of 2)
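The relationship can be illustrated with the x86 topology rule QEMU uses: each level of the APIC ID is rounded up to a whole number of bits, so whenever threads-per-core or cores-per-socket is not a power of 2, padding bits appear and the APIC ID runs ahead of the contiguous cpu_index. A sketch mirroring the usual apicid_from_topo_ids() logic (the field widths are ceil(log2(n)) of each level; this is an illustration, not the qemu source):

```python
def bits_for(n):
    """Number of bits needed to encode values 0..n-1 (0 for n == 1)."""
    return max(0, (n - 1).bit_length())

def apic_id(cpu_index, cores, threads):
    """APIC ID for a flat socket/core/thread topology, QEMU-style."""
    thread = cpu_index % threads
    core = (cpu_index // threads) % cores
    socket = cpu_index // (cores * threads)
    tb = bits_for(threads)  # bit width of the thread field
    cb = bits_for(cores)    # bit width of the core field
    return (socket << (cb + tb)) | (core << tb) | thread

# Powers of 2: cpu_index == apic_id for every CPU.
assert all(apic_id(i, cores=4, threads=2) == i for i in range(16))

# cores=3 is not a power of 2: the core field still occupies 2 bits, so the
# fourth CPU (index 3, first core of socket 1) jumps to APIC ID 4.
print([apic_id(i, cores=3, threads=1) for i in range(6)])  # → [0, 1, 2, 4, 5, 6]
```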

Comment 20 Laszlo Ersek 2014-01-17 19:47:22 UTC
The APIC IDs of VCPUs have not come up until now, because they were irrelevant in this BZ. However that changed with comment 17 -- we'll presumably introduce a (possibly RHEL-6-only) QMP command that exposes the ACPI online/offline state of VCPUs.

Internally to QEMU this accounting has ties to APIC IDs (precisely the way you stated). It might (or might not) make sense to expose APIC IDs to libvirt as well, in the retval of the QMP query, *if* there are other (unrelated) reasons why libvirt would care about APIC IDs. It's something to evaluate. It wouldn't be good to create the new query command and then modify it soon after. Thanks!

Comment 21 Eduardo Habkost 2014-01-20 13:34:28 UTC
Thanks for the explanation. Simply exposing the APIC ID shouldn't be a problem. It is probably already possible to read the APIC ID of each CPU using QOM properties. I don't know if there's already a reliable way to map CPU indexes (the argument to cpu_set) to QOM CPU objects.

The question mentioned on qemu-devel thread at [1] is about requiring libvirt to calculate APIC IDs itself in order to add/remove VCPUs at the right socket/core/thread, instead of having a less error-prone interface.

[1] http://news.gmane.org/find-root.php?message_id=20140117191355.GB2221@otherpad.lan.raisama.net

Comment 22 Cui Lei 2014-02-26 08:09:21 UTC
*** Bug 1066481 has been marked as a duplicate of this bug. ***

Comment 23 Laszlo Ersek 2014-02-26 20:58:04 UTC
I'm about to post a RHEL-6 only patch for review. Testing:

(1) start with max_alloc=4, cur_alloc=2

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "__com.redhat_query-cpu-acpi-states"}'

{"return":[{"enabled_in_acpi":false,"cpu_index":3},
           {"enabled_in_acpi":false,"cpu_index":2},
           {"enabled_in_acpi":true,"cpu_index":1},
           {"enabled_in_acpi":true,"cpu_index":0}
          ],"id":"libvirt-8"}

(2) plug in another VCPU:

virsh setvcpus --live seabios.rhel6 3
<succeeds>

(3) re-query:

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "__com.redhat_query-cpu-acpi-states"}'

{"return":[{"enabled_in_acpi":false,"cpu_index":3},
           {"enabled_in_acpi":true,"cpu_index":2},
           {"enabled_in_acpi":true,"cpu_index":1},
           {"enabled_in_acpi":true,"cpu_index":0}
          ],"id":"libvirt-12"}

(4) unplug two VCPUs:

virsh setvcpus --live seabios.rhel6 1
error: Operation not supported: qemu didn't unplug the vCPUs properly

However, in the guest, the VCPU number has been correctly set. As we've seen the bug is that libvirt gets thrown off-track by the output of the "query-cpus" command.

(5) Verify that the new QMP command will be useful for libvirt:

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "__com.redhat_query-cpu-acpi-states"}'

{"return":[{"enabled_in_acpi":false,"cpu_index":3},
           {"enabled_in_acpi":false,"cpu_index":2},
           {"enabled_in_acpi":false,"cpu_index":1},
           {"enabled_in_acpi":true,"cpu_index":0}
          ],"id":"libvirt-18"}

Comment 27 Laszlo Ersek 2014-02-28 17:49:39 UTC
(In reply to Laszlo Ersek from comment #23)
> I'm about to post a RHEL-6 only patch for review. Testing:
> 
> [...]

Note to QE: after Eric Blake's review of v2, I changed the fields to "enabled-in-acpi" and "cpu-index" in the QMP wire format (ie. replaced underscores with hyphens in the response struct's fields).

Comment 29 Laszlo Ersek 2014-03-04 11:40:00 UTC
v4 of the patch drops the new QMP command, and adds a new field to the output of "query-cpus" instead. Testing steps:

(1) start with max_alloc=4, cur_alloc=2

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "query-cpus" }'

{
  "return":[
    {
      "enabled-in-acpi":true,
      "current":true,
      "CPU":0,
      "pc":-2130449717,
      "halted":true,
      "thread_id":16806
    },
    {
      "enabled-in-acpi":true,
      "current":false,
      "CPU":1,
      "pc":-2130449717,
      "halted":true,
      "thread_id":16809
    }
  ],
  "id":"libvirt-8"
}

(2) plug in another VCPU:

virsh setvcpus --live seabios.rhel6 3
<succeeds>

(3) re-query:

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "query-cpus" }'

{
  "return":[
    {
      "enabled-in-acpi":true,
      "current":true,
      "CPU":0,
      "pc":-2130449717,
      "halted":true,
      "thread_id":16806
    },
    {
      "enabled-in-acpi":true,
      "current":false,
      "CPU":1,
      "pc":-2130449717,
      "halted":true,
      "thread_id":16809
    },
    {
      "enabled-in-acpi":true,
      "current":false,
      "CPU":2,
      "pc":-2130449717,
      "halted":true,
      "thread_id":612
    }
  ],
  "id":"libvirt-12"
}

(4) unplug two VCPUs:

virsh setvcpus --live seabios.rhel6 1
error: Operation not supported: qemu didn't unplug the vCPUs properly

However, in the guest, the VCPU number has been correctly set. As we've seen the bug is that libvirt gets thrown off-track by the output of the "query-cpus" command.

(5) Verify that the new output field will be useful for libvirt:

virsh qemu-monitor-command seabios.rhel6 \
    '{ "execute" : "query-cpus" }'

{
  "return":[
    {
      "enabled-in-acpi":true,
      "current":true,
      "CPU":0,
      "pc":-2130449717,
      "halted":true,
      "thread_id":16806
    },
    {
      "enabled-in-acpi":false,
      "current":false,
      "CPU":1,
      "pc":-2130505407,
      "halted":true,
      "thread_id":16809
    },
    {
      "enabled-in-acpi":false,
      "current":false,
      "CPU":2,
      "pc":-2130505407,
      "halted":true,
      "thread_id":612
    }
  ],
  "id":"libvirt-19"
}
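With the new field, a consumer can derive the ACPI online count directly from "query-cpus" instead of from the thread count. A sketch using the step (5) reply above (abridged to the relevant fields):

```python
import json

# Abridged form of the step (5) "query-cpus" reply: 3 VCPU threads remain,
# but only VCPU 0 is still enabled in ACPI.
reply = json.loads("""
{"return":[
  {"enabled-in-acpi":true,  "current":true,  "CPU":0, "halted":true},
  {"enabled-in-acpi":false, "current":false, "CPU":1, "halted":true},
  {"enabled-in-acpi":false, "current":false, "CPU":2, "halted":true}
]}
""")

threads = len(reply["return"])  # what the old libvirt check effectively counted
acpi_online = sum(1 for c in reply["return"] if c["enabled-in-acpi"])  # the fix
print(threads, acpi_online)  # → 3 1
```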

Comment 32 Miroslav Rezanina 2014-03-27 09:49:50 UTC
Fix included in qemu-kvm-0.12.1.2-2.423.el6

Comment 35 Jincheng Miao 2014-04-15 09:10:30 UTC
Hi Laszlo,

There is no 'enabled-in-acpi' field in qemu-kvm-0.12.1.2-2.423.el6.x86_64:

# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.423.el6.x86_64

# virsh qemu-monitor-command domain '{ "execute" : "query-cpus" }'
{"return":[{"current":true,"CPU":0,"pc":1609,"halted":false,"thread_id":12276},{"current":false,"CPU":1,"pc":1005735,"halted":true,"thread_id":12276}],"id":"libvirt-12"}

Neither in qemu-kvm-0.12.1.2-2.424.el6.x86_64

Comment 36 Laszlo Ersek 2014-04-15 10:21:58 UTC
I can't reproduce -- for me the -424 build seems to work fine:

# rpm -q qemu-kvm libvirt
qemu-kvm-0.12.1.2-2.424.el6.x86_64
libvirt-0.10.2-29.el6_5.7.x86_64

# virsh qemu-monitor-command seabios.rhel6 '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130449717,"halted":true,"thread_id":4013},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130449717,"halted":true,"thread_id":4014}],"id":"libvirt-10"}

What's your qemu command line? For example, if you disable acpi for the guest (-no-acpi cmdline option) then the functionality is unavailable. Did you restart the domain after updating qemu-kvm? Thanks.

Comment 37 Jincheng Miao 2014-04-16 03:12:32 UTC
(In reply to Laszlo Ersek from comment #36)
> What's your qemu command line? For example, if you disable acpi for the
> guest (-no-acpi cmdline option) then the functionality is unavailable. Did
> you restart the domain after updating qemu-kvm? Thanks.

Thanks Laszlo, the reason is as you said: I didn't enable ACPI. After enabling ACPI, I can get 'enabled-in-acpi'.

Comment 39 Qunfang Zhang 2014-06-25 11:52:18 UTC
Verified the bug on the following version and the result is pass.

kernel-2.6.32-487.el6.x86_64
qemu-kvm-0.12.1.2-2.428.el6.x86_64
libvirt-0.10.2-38.el6.x86_64

Steps:

1. Boot up a guest with virt-manager, with CPU maximum allocation 4 and current allocation 2.

 3237 ?        Sl     0:41 /usr/libexec/qemu-kvm -name vm -S -M rhel6.5.0 -enable-kvm -m 2048 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -uuid d1e21255-abe1-bf17-9646-23b7aaa436cd -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -drive file=/home/rhel-6.6.raw,if=none,id=drive-ide0-0-0,format=raw,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:8c:3a:95,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on

2. virsh # list 
 Id    Name                           State
----------------------------------------------------
 1     vm                             running

virsh # qemu-monitor-command vm '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130440325,"halted":true,"thread_id":3263},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130440325,"halted":true,"thread_id":3264}],"id":"libvirt-9"}


3. Plug another cpu and check it.

virsh #  setvcpus --live vm 3
virsh # 
virsh # qemu-monitor-command vm '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130440325,"halted":true,"thread_id":3263},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130440325,"halted":true,"thread_id":3264},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130440325,"halted":true,"thread_id":3407}],"id":"libvirt-13"}

The "enabled-in-acpi" field is "true" for all 3 vCPUs.

4. Plug the 4th cpu again and check it.
virsh #  setvcpus --live vm 4
virsh #
virsh # qemu-monitor-command vm '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130440325,"halted":true,"thread_id":3263},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130440325,"halted":true,"thread_id":3264},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130440325,"halted":true,"thread_id":3407},{"enabled-in-acpi":true,"current":false,"CPU":3,"pc":-2130440325,"halted":true,"thread_id":3517}],"id":"libvirt-17"}

The "enabled-in-acpi" field is "true" for all 4 vCPUs.

5. Hot unplug 2 vcpu and check it.

virsh #  setvcpus --live vm 2
virsh # 
virsh # qemu-monitor-command vm '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130440325,"halted":true,"thread_id":3263},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130440325,"halted":true,"thread_id":3264},{"enabled-in-acpi":false,"current":false,"CPU":2,"pc":-2130498735,"halted":true,"thread_id":3407},{"enabled-in-acpi":false,"current":false,"CPU":3,"pc":-2130498735,"halted":true,"thread_id":3517}],"id":"libvirt-23"}

Now, the "enabled-in-acpi" field for CPU 2 and CPU 3 is "false".

6. Repeat step 4. 

virsh #  setvcpus --live vm 4

virsh # 
virsh # qemu-monitor-command vm '{ "execute" : "query-cpus" }'
{"return":[{"enabled-in-acpi":true,"current":true,"CPU":0,"pc":-2130440325,"halted":true,"thread_id":3263},{"enabled-in-acpi":true,"current":false,"CPU":1,"pc":-2130440325,"halted":true,"thread_id":3264},{"enabled-in-acpi":true,"current":false,"CPU":2,"pc":-2130440325,"halted":true,"thread_id":3407},{"enabled-in-acpi":true,"current":false,"CPU":3,"pc":-2130440325,"halted":true,"thread_id":3517}],"id":"libvirt-29"}

The "enabled-in-acpi" field is "true" for all 4 vCPUs again.

So, based on the above, the issue is fixed.

Comment 40 errata-xmlrpc 2014-10-14 06:52:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html