Bug 1481007
| Summary: | vGPU: VMs with mdev_type hook failed to run after RHV upgrade, even if the hook removed. | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Nisim Simsolo <nsimsolo> |
| Component: | BLL.Virt | Assignee: | Arik <ahadas> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Nisim Simsolo <nsimsolo> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.2.0 | CC: | ahadas, bugs, nsimsolo, tjelinek |
| Target Milestone: | ovirt-4.2.0 | Flags: | rule-engine: ovirt-4.2+ |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-20 11:37:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1486524 | | |
| Attachments: | | | |

Description
Nisim Simsolo
2017-08-13 12:40:18 UTC
Created attachment 1312676
VM_devices screenshot
Created attachment 1312677
vdsm.log
Created attachment 1312678
engine.log

Something is wrong with the vdsm log attached. I can't find messages related to the VM '7feae268-6669-4ac4-920f-7177a43d7acd'. Anyway, in this case, it would be best to look at the system while it happens. Can you please try to reproduce it and call me to have a look?

I have an "old" VM with more than 1 mdev type device that failed to run. Please contact me when you can.

Nisim, so it doesn't happen anymore on the master branch. Could you verify that it happens in 4.1? Thanks.

It happens (mdev type is added to vm_devices with the same uuid) in 2 cases:
1. When removing the mdev_type hook and adding another one.
2. After upgrading the setup from 4.1.5-2 to 4.1.6-4.
In both cases the VM runs properly with an Nvidia instance attached to it.

Targeting this to 4.2, since in 4.1 the issue is only that the VM devices subtab can show more devices than the VM actually has.

Verification builds:
ovirt-engine-4.2.0-0.0.master.20171002190603.git3015ada.el7.centos
libvirt-client-3.2.0-14.el7_4.3.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.8.x86_64
vdsm-4.20.3-128.git52f2c60.el7.centos.x86_64
vdsm-hook-vfio-mdev-4.20.3-128.git52f2c60.el7.centos
NVIDIA-Linux-x86_64-384.37-vgpu-kvm

Verification scenario:
1. Run a VM with the mdev_type hook.
2. Upgrade the setup.
3. Verify the VM is still running. Verify only 1 mdev device is listed under VM -> VM devices.
4. Power off and run the VM again.
5. Verify the VM is running properly with an Nvidia instance.
6. Import a VM with multiple mdev VM devices from the export domain (I exported such a problematic VM to the export domain when this bug was created).
7. Run the VM and verify it is running properly with an Nvidia instance. Browse webadmin -> Virtual machines -> select the imported VM -> VM devices, and verify only 1 mdev device is now listed (the multiple old ones are actually removed).

This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.
Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
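
Steps 3 and 7 of the verification scenario check that each VM lists only one mdev device. A minimal sketch of that check is shown here, assuming the device list has already been exported as plain records. The record shape (`vm_id` and `type` keys), the `"mdev"` type value, and the function names are assumptions made for illustration; they are not taken from the engine schema or any oVirt API.

```python
from collections import Counter


def count_mdev_devices(devices):
    """Count records whose type is 'mdev', grouped by VM id.

    `devices` is a list of dicts with hypothetical keys "vm_id" and "type".
    """
    return Counter(d["vm_id"] for d in devices if d.get("type") == "mdev")


def vms_with_extra_mdev(devices, expected=1):
    """Return {vm_id: count} for VMs that list more than `expected` mdev devices."""
    counts = count_mdev_devices(devices)
    return {vm_id: n for vm_id, n in counts.items() if n > expected}


if __name__ == "__main__":
    # Made-up sample data: "vm-a" shows the duplicate-device symptom.
    sample = [
        {"vm_id": "vm-a", "type": "mdev"},
        {"vm_id": "vm-a", "type": "mdev"},
        {"vm_id": "vm-a", "type": "disk"},
        {"vm_id": "vm-b", "type": "mdev"},
    ]
    print(vms_with_extra_mdev(sample))
```

An empty result would correspond to the state the scenario expects after the fix: no VM reports more than one mdev device.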