Bug 1834250

Summary: CPU hotplug on UEFI VM causes VM reboot
Product: [oVirt] ovirt-engine
Reporter: Beni Pelled <bpelled>
Component: BLL.Virt
Assignee: Arik <ahadas>
Status: CLOSED CURRENTRELEASE
QA Contact: Nisim Simsolo <nsimsolo>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.4.0
CC: ahadas, baptiste.agasse, bugs, dfodor, lersek, lrotenbe, mavital, michal.skrivanek, mprivozn, nsimsolo, ymankad
Target Milestone: ovirt-4.4.6
Flags: pm-rhel: ovirt-4.4+
Target Release: 4.4.6.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-4.4.6.2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-05 05:36:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1454803, 1846886, 1849170, 1933974
Bug Blocks:

Description Beni Pelled 2020-05-11 11:53:27 UTC
Description of problem:
Hotplug-CPU to a UEFI VM causes the VM to restart.

Version-Release number of selected component (if applicable):
- ovirt-engine-4.4.0-0.33.master.el8ev.noarch
- vdsm-4.40.13-1.el8ev.x86_64
- libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
- qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a UEFI VM (RHEL 8.2 in my case) with one CPU and start the VM
2. Add another CPU (hotplug); see the sketch below
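
For reference, one way to trigger the hotplug through the API instead of the Admin Portal is the oVirt Python SDK. This is only a minimal sketch; the engine URL, credentials and VM name ('uefi-vm') are placeholders, not values taken from this report:

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholder connection details -- adjust to the actual engine.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='ca.pem',
)

# Look up the running UEFI VM and raise its vCPU count from 1 to 2.
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=uefi-vm')[0]
vms_service.vm_service(vm.id).update(
    types.Vm(
        cpu=types.Cpu(
            topology=types.CpuTopology(sockets=2, cores=1, threads=1),
        ),
    ),
)

connection.close()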

Actual results:
The VM is restarted (and the CPU is successfully added)

Expected results:
The CPU is added without a restart

Additional info:
The same occurs with the 'SecureBoot' BIOS type.

Comment 6 Michal Privoznik 2020-05-20 18:40:52 UTC
Citing from libvirtd.log. First, we can see libvirt hotplugging the vCPU:

2020-05-20 12:11:06.179+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"device_add","arguments":{"driver":"IvyBridge-x86_64-cpu","id":"vcpu1","core-id":0,"thread-id":0,"node-id":0,"die-id":0,"socket-id":1},"id":"libvirt-335"}
2020-05-20 12:11:06.182+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": {}, "id": "libvirt-335"}

2020-05-20 12:11:06.183+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"query-hotpluggable-cpus","id":"libvirt-336"}
 fd=-1
2020-05-20 12:11:06.186+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": [{"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 15}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 14}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 13}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 12}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 11}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 10}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 9}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 8}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 7}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 6}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 5}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 4}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 3}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 2}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 1}, "vcpus-count": 1, "qom-path": "/machine/peripheral/vcpu1", "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 0}, "vcpus-count": 1, "qom-path": "/machine/unattached/device[0]", "type": "IvyBridge-x86_64-cpu"}], "id": "libvirt-336"}
2020-05-20 12:11:06.186+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"query-cpus-fast","id":"libvirt-337"}
 fd=-1

And it even receives an event that the vCPU was added:

2020-05-20 12:11:06.189+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976666, "microseconds": 189510}, "event": "ACPI_DEVICE_OST", "data": {"info": {"device": "vcpu1", "source": 1, "status": 0, "slot": "1", "slot-type": "CPU"}}}

2020-05-20 12:11:06.194+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": [{"arch": "x86", "thread-id": 5767, "props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 0}, "qom-path": "/machine/unattached/device[0]", "cpu-index": 0, "target": "x86_64"}, {"arch": "x86", "thread-id": 6009, "props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 1}, "qom-path": "/machine/peripheral/vcpu1", "cpu-index": 1, "target": "x86_64"}], "id": "libvirt-337"}


Therefore, it finishes the API call successfully:

2020-05-20 12:11:06.197+0000: 5180: debug : virThreadJobClear:119 : Thread 5180 (virNetServerHandleJob) finished job remoteDispatchDomainSetVcpusFlags with ret=0

Then it even processes a getXML() API call:

2020-05-20 12:11:06.198+0000: 5179: debug : virThreadJobSet:94 : Thread 5179 (virNetServerHandleJob) is now running job remoteDispatchDomainGetXMLDesc
2020-05-20 12:11:06.198+0000: 5179: debug : virThreadJobClear:119 : Thread 5179 (virNetServerHandleJob) finished job remoteDispatchDomainGetXMLDesc with ret=0


But roughly a second and a half later it receives an event that the virtio serial ports were closed in the guest:


2020-05-20 12:11:07.740+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 739882}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel1"}}
2020-05-20 12:11:07.740+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 740049}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel2"}}

And that the guest reset itself:

2020-05-20 12:11:07.770+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 770542}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}


There is no monitor command issued by libvirt that would instruct QEMU to reset; therefore, it's either a QEMU bug, or the guest OS/kernel can't handle the hotplug and resets itself. Is it possible to get logs from the guest? Those might help to debug this further.
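
For anyone trying to reproduce this outside of oVirt, a minimal libvirt-python sketch along the lines of the sequence above could look like the following. The domain name 'uefi-vm' is an assumption, not the actual domain from these logs; the reboot callback corresponds to the RESET/guest-reset event shown above:

import threading
import time

import libvirt

def reboot_cb(conn, dom, opaque):
    # Fires for guest-initiated resets such as the RESET/guest-reset event above.
    print('guest %s rebooted itself after the hotplug' % dom.name())

def event_loop():
    while True:
        libvirt.virEventRunDefaultImpl()

# The default event implementation must be registered before opening the connection.
libvirt.virEventRegisterDefaultImpl()
threading.Thread(target=event_loop, daemon=True).start()

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('uefi-vm')  # assumed domain name
conn.domainEventRegisterAny(dom, libvirt.VIR_DOMAIN_EVENT_ID_REBOOT, reboot_cb, None)

# Roughly the equivalent of the virDomainSetVcpusFlags call behind the device_add above.
dom.setVcpusFlags(2, libvirt.VIR_DOMAIN_AFFECT_LIVE)

# Give the guest a few seconds; on an affected setup the reboot callback fires here.
time.sleep(10)
conn.close()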

Comment 9 Michal Skrivanek 2020-06-04 13:16:26 UTC
UEFI is tech preview only, decreasing Severity.

Also, please do not open RHV bugs for non-RHV-specific issues.

Comment 16 Arik 2020-09-02 10:44:02 UTC
We'll block it on the oVirt side until it's supported by the platform (bz 1874519)

Comment 20 Nisim Simsolo 2021-04-19 10:33:02 UTC
Verification version:
ovirt-engine-4.4.6.3-0.8.el8ev
vdsm-4.40.60.3-1.el8ev.x86_64
qemu-kvm-5.2.0-15.module+el8.4.0+10650+50781ca0.x86_64
libvirt-daemon-7.0.0-13.module+el8.4.0+10604+5608c2b4.x86_64

Verification scenario:
1. Run a RHEL 8 VM with Q35-UEFI and 1 vCPU.
2. Hotplug the VM's vCPUs to 2 (see the sketch after this list).
Verify the change takes effect.
3. Hotplug the VM's vCPUs to 16.
Verify the change takes effect.
4. Reboot the VM.
After the reboot completes, verify the VM still has 16 vCPUs.
5. Repeat steps 3-4 for power off and shutdown-then-run of the VM.
6. Repeat steps 1-5 for a RHEL 8 VM with Q35-SecureBoot.
7. Repeat steps 1-5 for a Windows 2012R2 VM with Q35-SecureBoot.
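
As a complement to steps 2-4, a minimal SDK check of the resulting vCPU count could look like this (same placeholder connection details and VM name as in the sketch under "Steps to Reproduce"):

import ovirtsdk4 as sdk

# Placeholder connection details -- adjust to the actual engine.
connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    ca_file='ca.pem',
)

# Fetch the VM and compute its configured vCPU count from the CPU topology.
vm = connection.system_service().vms_service().list(search='name=uefi-vm')[0]
topology = vm.cpu.topology
vcpus = topology.sockets * topology.cores * topology.threads
assert vcpus == 16, 'expected 16 vCPUs after hotplug/reboot, got %d' % vcpus

connection.close()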

Comment 21 Sandro Bonazzola 2021-05-05 05:36:20 UTC
This bug is included in the oVirt 4.4.6 release, published on May 4th 2021.

Since the problem described in this bug report should be resolved in the oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.