Bug 1834250 - CPU hotplug on UEFI VM causes VM reboot
Summary: CPU hotplug on UEFI VM causes VM reboot
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.6
Target Release: 4.4.6.2
Assignee: Arik
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On: 1454803 1846886 1849170 1933974
Blocks:
 
Reported: 2020-05-11 11:53 UTC by Beni Pelled
Modified: 2021-05-05 05:36 UTC
CC List: 11 users

Fixed In Version: ovirt-engine-4.4.6.2
Clone Of:
Environment:
Last Closed: 2021-05-05 05:36:20 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 111079 0 master MERGED core: block cpu hotplug on uefi 2021-02-07 21:02:51 UTC
oVirt gerrit 114010 0 master MERGED Revert "core: block cpu hotplug on uefi" 2021-03-25 09:40:15 UTC

Description Beni Pelled 2020-05-11 11:53:27 UTC
Description of problem:
Hotplugging a CPU on a UEFI VM causes the VM to restart.

Version-Release number of selected component (if applicable):
- ovirt-engine-4.4.0-0.33.master.el8ev.noarch
- vdsm-4.40.13-1.el8ev.x86_64
- libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
- qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a UEFI VM (RHEL 8.2 in my case) with one CPU and start the VM
2. Add another CPU (hotplug); a sketch of this call via the Python SDK is included under 'Additional info' below

Actual results:
The VM is restarted (and the CPU is successfully added)

Expected results:
The CPU is added without a restart

Additional info:
The same occurs with the 'SecureBoot' BIOS type.
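
For reference, the hotplug in step 2 can be triggered through the oVirt REST API or the Python SDK by updating the running VM's CPU topology. The following is a minimal sketch assuming the ovirt-engine-sdk-python (ovirtsdk4) package; the engine URL, credentials, and VM name are placeholders, not values from this report:

# Minimal sketch: hotplug a vCPU by raising the socket count on a running VM.
# Assumes ovirtsdk4; engine URL, credentials, and VM name are placeholders.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,  # use ca_file=... instead in a real setup
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=uefi-vm')[0]
vm_service = vms_service.vm_service(vm.id)

# Bump sockets from 1 to 2 while the VM is up; with this bug present, the
# guest reset itself instead of bringing the new vCPU online.
vm_service.update(
    types.Vm(
        cpu=types.Cpu(
            topology=types.CpuTopology(sockets=2, cores=1, threads=1),
        ),
    ),
)

connection.close()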

Comment 6 Michal Privoznik 2020-05-20 18:40:52 UTC
Citing from libvirtd.log. Firstly, we can see libvirt hotplugging vCPU:

2020-05-20 12:11:06.179+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"device_add","arguments":{"driver":"IvyBridge-x86_64-cpu","id":"vcpu1","core-id":0,"thread-id":0,"node-id":0,"die-id":0,"socket-id":1},"id":"libvirt-335"}
2020-05-20 12:11:06.182+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": {}, "id": "libvirt-335"}

2020-05-20 12:11:06.183+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"query-hotpluggable-cpus","id":"libvirt-336"}
 fd=-1
2020-05-20 12:11:06.186+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": [{"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 15}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 14}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 13}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 12}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 11}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 10}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 9}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 8}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 7}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 6}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 5}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 4}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 3}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 2}, "vcpus-count": 1, "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 1}, "vcpus-count": 1, "qom-path": "/machine/peripheral/vcpu1", "type": "IvyBridge-x86_64-cpu"}, {"props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 0}, "vcpus-count": 1, "qom-path": "/machine/unattached/device[0]", "type": "IvyBridge-x86_64-cpu"}], "id": "libvirt-336"}
2020-05-20 12:11:06.186+0000: 5180: info : qemuMonitorSend:1072 : QEMU_MONITOR_SEND_MSG: mon=0x7f8e1c02aa80 msg={"execute":"query-cpus-fast","id":"libvirt-337"}
 fd=-1

And it even receives an event that the vCPU was added:

2020-05-20 12:11:06.189+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976666, "microseconds": 189510}, "event": "ACPI_DEVICE_OST", "data": {"info": {"device": "vcpu1", "source": 1, "status": 0, "slot": "1", "slot-type": "CPU"}}}

2020-05-20 12:11:06.194+0000: 5177: info : qemuMonitorJSONIOProcessLine:242 : QEMU_MONITOR_RECV_REPLY: mon=0x7f8e1c02aa80 reply={"return": [{"arch": "x86", "thread-id": 5767, "props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 0}, "qom-path": "/machine/unattached/device[0]", "cpu-index": 0, "target": "x86_64"}, {"arch": "x86", "thread-id": 6009, "props": {"core-id": 0, "thread-id": 0, "node-id": 0, "die-id": 0, "socket-id": 1}, "qom-path": "/machine/peripheral/vcpu1", "cpu-index": 1, "target": "x86_64"}], "id": "libvirt-337"}


Therefore, it finishes the API call successfully:

2020-05-20 12:11:06.197+0000: 5180: debug : virThreadJobClear:119 : Thread 5180 (virNetServerHandleJob) finished job remoteDispatchDomainSetVcpusFlags with ret=0

Then it even processes the getXML() API call:

2020-05-20 12:11:06.198+0000: 5179: debug : virThreadJobSet:94 : Thread 5179 (virNetServerHandleJob) is now running job remoteDispatchDomainGetXMLDesc
2020-05-20 12:11:06.198+0000: 5179: debug : virThreadJobClear:119 : Thread 5179 (virNetServerHandleJob) finished job remoteDispatchDomainGetXMLDesc with ret=0


But roughly a second and a half later, it receives events that the virtio serial ports were closed in the guest:


2020-05-20 12:11:07.740+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 739882}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel1"}}
2020-05-20 12:11:07.740+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 740049}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel2"}}

And that the guest reset itself:

2020-05-20 12:11:07.770+0000: 5177: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f8e1c02aa80 event={"timestamp": {"seconds": 1589976667, "microseconds": 770542}, "event": "RESET", "data": {"guest": true, "reason": "guest-reset"}}


There is no monitor command issued by libvirt that would instruct QEMU to reset; therefore, it's either a QEMU bug, or the guest OS/kernel can't handle the hotplug and resets itself. Is it possible to get logs from the guest? Those might help debug this further.
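
For completeness, the monitor exchange quoted above can be replayed by hand against the QMP socket of a test guest. The snippet below is a minimal sketch using only the Python standard library; the socket path is a placeholder, and on a libvirt-managed guest the equivalent query can be issued with 'virsh qemu-monitor-command'. It only reruns the query-hotpluggable-cpus call seen as libvirt-336 in the log:

# Minimal sketch: replay the query-hotpluggable-cpus call from the log over a
# raw QMP socket. The socket path is a placeholder for a test guest started
# with something like -qmp unix:/tmp/qmp.sock,server,nowait.
import json
import socket

SOCK_PATH = '/tmp/qmp.sock'  # placeholder

def qmp_cmd(f, command, arguments=None):
    """Send one QMP command and return the parsed reply, skipping events."""
    msg = {'execute': command}
    if arguments:
        msg['arguments'] = arguments
    f.write(json.dumps(msg) + '\n')
    f.flush()
    while True:
        reply = json.loads(f.readline())
        if 'event' not in reply:  # asynchronous events are interleaved
            return reply

with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
    s.connect(SOCK_PATH)
    f = s.makefile('rw')
    json.loads(f.readline())        # consume the QMP greeting
    qmp_cmd(f, 'qmp_capabilities')  # enter command mode
    cpus = qmp_cmd(f, 'query-hotpluggable-cpus')
    print(json.dumps(cpus, indent=2))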

Comment 9 Michal Skrivanek 2020-06-04 13:16:26 UTC
UEFI is tech preview only, decreasing Severity.

Also, please do not open RHV bugs for non-RHV-specific issues.

Comment 16 Arik 2020-09-02 10:44:02 UTC
We'll block it on the oVirt side until it's supported by the platform (bz 1874519)

Comment 20 Nisim Simsolo 2021-04-19 10:33:02 UTC
Verification version:
ovirt-engine-4.4.6.3-0.8.el8ev
vdsm-4.40.60.3-1.el8ev.x86_64
qemu-kvm-5.2.0-15.module+el8.4.0+10650+50781ca0.x86_64
libvirt-daemon-7.0.0-13.module+el8.4.0+10604+5608c2b4.x86_64

Verification scenario:
1. Run a RHEL 8 VM with Q35-UEFI and 1 vCPU.
2. Hotplug the VM's vCPUs to 2.
Verify the change takes effect.
3. Hotplug the VM's vCPUs to 16.
Verify the change takes effect.
4. Reboot the VM.
After the reboot completes, verify the VM still has 16 vCPUs (a read-back sketch via the SDK follows this list).
5. Repeat steps 3-4, replacing the reboot with a power off and with a shutdown followed by starting the VM.
6. Repeat steps 1-5 for a RHEL 8 VM with Q35-SecureBoot.
7. Repeat steps 1-5 for a Windows 2012R2 VM with Q35-SecureBoot.
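
The engine-side part of the vCPU check in steps 2-4 can be scripted as well. This is a minimal sketch assuming the ovirtsdk4 package, with placeholder connection details and VM name; inside the guest the count can be cross-checked with lscpu or nproc:

# Minimal sketch: read back the VM's CPU topology through the oVirt SDK and
# print the resulting vCPU count. Connection details and VM name are placeholders.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=uefi-vm')[0]
topo = vm.cpu.topology
print('vCPUs: %d (sockets=%d, cores=%d, threads=%d)' % (
    topo.sockets * topo.cores * topo.threads,
    topo.sockets, topo.cores, topo.threads,
))

connection.close()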

Comment 21 Sandro Bonazzola 2021-05-05 05:36:20 UTC
This bug is included in the oVirt 4.4.6 release, published on May 4th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

