Bug 1741451

Summary: Failed to hot-plug vcpus
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: jiyan <jiyan>
Component: qemu-kvm
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED ERRATA
QA Contact: Yumei Huang <yuhuang>
Severity: unspecified
Docs Contact:
Priority: high
Version: 8.1
CC: chayang, ddepaula, dyuan, jinzhao, jiyan, juzhang, lcheng, lhuang, virt-maint, weizhan, xuzhang, yalzhang, yfu
Target Milestone: rc
Keywords: Automation
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-4.1.0-2.module+el8.1.0+4012+8109dd4a Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1741807 (view as bug list) Environment:
Last Closed: 2019-11-06 07:18:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1741807    

Description jiyan 2019-08-15 08:04:19 UTC
Description of problem:
Failed to hot-plug vcpus

Version-Release number of selected component (if applicable):
qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64
libvirt-5.6.0-1.virtcov.el8.x86_64
kernel-4.18.0-129.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a running VM with the following vCPU configuration
# virsh domstate test 
running

# virsh vcpucount test 
maximum      config         4
maximum      live           4
current      config         2
current      live           2

2. Hot-plug vcpu and check stap info
# virsh setvcpus test 3 --live
error: internal error: unable to execute QEMU command 'device_add': Invalid CPU die-id: 4294967295 must be in range 0:3

# stap /usr/share/doc/libvirt-docs/examples/systemtap/qemu-monitor.stp 
  0.000 begin
  2.943 > 0x7f62ec025900 {"execute":"device_add","arguments":{"driver":"Skylake-Server-IBRS-x86_64-cpu","id":"vcpu2","socket-id":2,"core-id":0,"thread-id":0},"id":"libvirt-280"}
  2.948 < 0x7f62ec025900 {"id": "libvirt-280", "error": {"class": "GenericError", "desc": "Invalid CPU die-id: 4294967295 must be in range 0:3"}}

Actual results:
As step-2 shows

Expected results:
Hot-plugging should succeed

Additional info:
Cannot reproduce this issue on qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.x86_64

Comment 1 Yumei Huang 2019-08-15 08:16:50 UTC
I guess it's the same issue as bug 1741151.

Comment 3 Eduardo Habkost 2019-08-15 17:40:37 UTC
It looks like libvirt isn't following the expected interface for CPU hotplug.  Understandable, as the documentation is sparse, and the paragraph below is easy to miss.

Documentation for CpuInstanceProperties says:

# Note: currently there are 5 properties that could be present
# but management should be prepared to pass through other
# properties with device_add command to allow for future
# interface extension. This also requires the filed names to be kept in
# sync with the properties passed to -device/device_add.

If query-hotpluggable-cpus returns die-id=0 (which is the case in QEMU 4.1), libvirt should include die-id=0 in -device and device_add.

We could make the interface a bit more flexible, though, and make die-id optional if there's only one possible value for it.
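
A rough sketch of the relaxed check proposed here, written in Python for illustration only. It is not QEMU's code: the function name `resolve_die_id`, its parameters, and the wording of the "required" error are hypothetical, and `None` stands in for an omitted property.

```python
def resolve_die_id(die_id, dies_per_socket):
    """Return the die-id to use, or raise ValueError if it is invalid.

    die_id is None when the caller omitted the property (an assumption
    made for this sketch).
    """
    if die_id is None:
        if dies_per_socket == 1:
            # Only one possible value, so the omission is unambiguous.
            return 0
        raise ValueError("die-id is required when there are multiple dies")
    if not 0 <= die_id < dies_per_socket:
        raise ValueError(
            f"Invalid CPU die-id: {die_id} must be in range "
            f"0:{dies_per_socket - 1}"
        )
    return die_id
```

With a single die per socket, `resolve_die_id(None, 1)` falls back to 0, which is the case that failed in this report.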

I will submit a patch to QEMU upstream to make it more flexible, but I suggest we also change libvirt to follow the quoted paragraph above, and copy every single property from query-hotpluggable-cpus[].props.
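
The suggestion above can be sketched as follows. This is only an illustration in Python, not libvirt's implementation: the helper name `build_device_add` and the sample entry are hypothetical, and the entry is assumed to follow the general shape of a query-hotpluggable-cpus result (a `type` plus a `props` dictionary).

```python
import json


def build_device_add(slot, device_id):
    """Build a device_add command from one query-hotpluggable-cpus entry.

    Every key in slot["props"] is copied as-is, so properties introduced
    by newer QEMU versions (such as die-id) are passed through without
    the caller having to know about them.
    """
    arguments = {"driver": slot["type"], "id": device_id}
    arguments.update(slot["props"])
    return {"execute": "device_add", "arguments": arguments}


# Hypothetical entry, modelled on the failing request in this report
# plus the die-id=0 that QEMU 4.1 reports.
slot = {
    "type": "Skylake-Server-IBRS-x86_64-cpu",
    "vcpus-count": 1,
    "props": {"socket-id": 2, "die-id": 0, "core-id": 0, "thread-id": 0},
}

cmd = build_device_add(slot, "vcpu2")
print(json.dumps(cmd))
```

Because the properties are copied rather than listed one by one, the resulting command carries die-id without any die-specific code on the management side.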

Comment 4 Eduardo Habkost 2019-08-15 17:42:57 UTC
(In reply to jiyan from comment #0)
> # virsh setvcpus test 3 --live
> error: internal error: unable to execute QEMU command 'device_add': Invalid
> CPU die-id: 4294967295 must be in range 0:3

This error message is incorrect.  In this case, die-id must be in range 0:0.  I'm tracking this issue at bz#1741151.
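
A possible reading of the odd value in the message, offered as an assumption rather than something confirmed in this report: 4294967295 is 2^32 - 1, which is how -1 appears when interpreted as an unsigned 32-bit integer. That would be consistent with the omitted die-id still holding an "unset" value of -1 when the message was formatted.

```python
# Assumption for illustration: the omitted die-id is stored as -1.
unset = -1

# Reinterpret the value as an unsigned 32-bit integer.
as_unsigned = unset & 0xFFFFFFFF

print(as_unsigned)  # 4294967295
```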

Comment 5 Eduardo Habkost 2019-08-15 19:23:38 UTC
Fix submitted upstream:
https://lore.kernel.org/qemu-devel/20190815183803.13346-4-ehabkost@redhat.com/

Comment 6 juzhang 2019-08-16 02:30:18 UTC
(In reply to Eduardo Habkost from comment #4)
> (In reply to jiyan from comment #0)
> > # virsh setvcpus test 3 --live
> > error: internal error: unable to execute QEMU command 'device_add': Invalid
> > CPU die-id: 4294967295 must be in range 0:3
> 
> This error message is incorrect.  In this case, die-id must be in range 0:0.
> I'm tracking this issue at bz#1741151.

If so, should we close this bz and track bz1741151?

Best regards,

Junyi

Comment 7 Eduardo Habkost 2019-08-16 14:01:13 UTC
(In reply to juzhang from comment #6)
> > This error message is incorrect.  In this case, die-id must be in range 0:0.
> > I'm tracking this issue at bz#1741151.
> 
> If so, should we close this bz and track bz1741151?

I'd prefer to keep this one open because it covers the most serious issue (libvirt integration failing).

bz 1741151 can be used to track the less serious issues with the error messages.

Comment 13 Yumei Huang 2019-08-28 08:43:30 UTC
Verify:
qemu-kvm-4.1.0-5.module+el8.1.0+4076+b5e41ebc
libvirt-client-5.6.0-2.module+el8.1.0+4015+63576633.x86_64
kernel-4.18.0-131.el8.x86_64

Tested with virsh: vCPU hot-plug succeeds and the guest works well.

# virsh start rhel8
Domain rhel8 started

# virsh domstate  rhel8
running

# virsh vcpucount  rhel8
maximum      config         8
maximum      live           8
current      config         4
current      live           4

# virsh setvcpus rhel8 6 --live

#  virsh vcpucount  rhel8
maximum      config         8
maximum      live           8
current      config         4
current      live           6

# virsh domstate  rhel8
running

Comment 15 errata-xmlrpc 2019-11-06 07:18:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723