Bug 1247578

Summary: [Docs] VFIO/hostdev_passthrough: Host reboots when powering off a VM with a GPU attached.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Nisim Simsolo <nsimsolo>
Component: Documentation
Assignee: rhev-docs <rhev-docs>
Status: CLOSED DUPLICATE
QA Contact: rhev-docs <rhev-docs>
Severity: high
Docs Contact:
Priority: high
Version: 3.6.0
CC: bugs, chayang, gklein, hhuang, huding, istein, juzhang, knoel, lbopf, lsurette, mavital, mgoldboi, michal.skrivanek, mpoledni, nsimsolo, rbalakri, rhev-docs, virt-maint, xfu, yeylon, ykaul, ylavi
Target Milestone: ovirt-3.6.3
Target Release: 3.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-16 03:56:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Docs
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 825045, 1154205, 1172230
Attachments (Description / Flags):
log collector / none
sosreport 7.2 / none
engine.log 02/09/2015 / none
vdsm.log 02/09/2015 / none

Description Nisim Simsolo 2015-07-28 11:27:25 UTC
Description of problem:
When powering off a VM with a GPU attached, the host reboots.

engine.log: 
2015-07-28 13:48:30,037 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-16) [6939aa9e] Correlation ID: 6939aa9e, Job ID: ed9fff2e-de13-44a5-88c9-041502a78a8f, Call Stack: null, Custom Event ID: -1, Message: Failed to power off VM AMD_win7 (Host: amd-vfio, User: admin@internal).
2015-07-28 13:48:32,029 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-67) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM 'amd-vfio' command failed: Heartbeat exeeded
2015-07-28 13:48:32,030 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-8-thread-20) [] Host 'amd-vfio' is not responding.
2015-07-28 13:48:32,030 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-67) [] Command 'SpmStatusVDSCommand(HostName = amd-vfio, SpmStatusVDSCommandParameters:{runAsync='true', hostId='50f9f868-7449-4d7f-a451-91dd2b46a463', storagePoolId='00000001-0001-0001-0001-0000000001cd'})' execution failed: VDSGenericException: VDSNetworkException: Heartbeat exeeded
2015-07-28 13:48:32,051 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-20) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host amd-vfio is not responding. Host cannot be fenced automatically because power management for the host is disabled.


Version-Release number of selected component (if applicable):
engine: ovirt-engine-3.6.0-0.0.master.20150627185750.git6f063c1.el6.noarch

Host (kernel 3.10.0-229.el7.x86_64):
vdsm-4.17.0-1054.git562e711.el7.noarch
sanlock-3.2.2-2.el7.x86_64
libvirt-client-1.2.8-16.el7_1.3.x86_64
qemu-kvm-ev-2.1.2-23.el7_1.3.1.x86_64

Hardware:
AMD based desktop
GPU - NVIDIA Corporation GM107GL [Quadro K2200] (rev a2)
CPU - AMD FX(tm)-8350 Eight-Core Processor
Motherboard: Asus SABERTOOTH 990FX R2.0

How reproducible:
Consistently

Steps to Reproduce:
1. Create a Windows 7 VM.
2. Attach the GPU to the VM (it does not matter whether the GPU audio device is also attached).
3. Run the VM.
4. Wait for the VM to come up, then power it off from the webadmin UI.

Actual results:
The host reboots.

Expected results:
VM should be powered off without affecting the host.

Additional info:
log collector file attached.

Comment 1 Nisim Simsolo 2015-07-28 11:42:21 UTC
Created attachment 1056999 [details]
log collector

Comment 2 Martin Polednik 2015-07-29 13:26:58 UTC
There is not much information about the reboot in the logs. One relevant detail is that the guest is RHEL 6.7. Can you also add results with another guest OS?

There is no explicit reboot from VDSM's side; the failure seems to be lower in the stack.

Comment 3 Michal Skrivanek 2015-07-29 13:57:37 UTC
Moving to the qemu-kvm team for investigation of the host crash.

Comment 5 Alex Williamson 2015-08-03 19:38:19 UTC
(In reply to Nisim Simsolo from comment #0)
> Description of problem:
> When powering off VM with GPU attached the host reboot occur.
... 
> Version-Release number of selected component (if applicable):
> engine: ovirt-engine-3.6.0-0.0.master.20150627185750.git6f063c1.el6.noarch
> 
> Host (kernel 3.10.0-229.el7.x86_64):
> vdsm-4.17.0-1054.git562e711.el7.noarch
> sanlock-3.2.2-2.el7.x86_64
> libvirt-client-1.2.8-16.el7_1.3.x86_64
> qemu-kvm-ev-2.1.2-23.el7_1.3.1.x86_64
> 
> Hardware:
> AMD based desktop
> GPU - NVIDIA Corporation GM107GL [Quadro K2200] (rev a2)
> CPU - AMD FX(tm)-8350 Eight-Core Processor
> Motherboard: Asus SABERTOOTH 990FX R2.0
> 
> How reproducible:
> Consistently
> 
> Steps to Reproduce:
> 1. Create windows 7 VM.
> 2. Attach GPU to VM (doesn't matter if GPU audio device is attached also or
> not).
> 3. Run VM.
> 4. Wait for VM to run and then power off VM from webadmin UI.
> 
> Actual results:
> Host reboot occur.
> 
> Expected results:
> VM should be powered off without affecting the host.

Ok, so we have a RHEL7.1 host and a Windows 7 VM...

(In reply to Nisim Simsolo from comment #1)
> Created attachment 1056999 [details]
> log collector

But these are logs from a RHEL6.7 guest.  How are they related?

I'm unable to reproduce with a Quadro K4000 in a Gigabyte 990FX/Phenom system.  The GPU in the guest works as expected and abruptly powering off the VM does not cause a host crash or reboot.

Please provide guest XML and libvirt log for the domain.  Log collection on the host would be useful as well.

Comment 6 Nisim Simsolo 2015-08-20 11:43:07 UTC
Created attachment 1065215 [details]
sosreport 7.2

Comment 7 Nisim Simsolo 2015-08-20 11:48:07 UTC
Occurred again. This time I can confirm that this bug is relevant only for Linux VMs (so far I have tested it only on RHEL 7).
The same issue does not occur with a Windows VM.
Setup versions:
engine: 
3.6.0-0.11.master.el6
host: 
vdsm-4.17.2-1.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64
qemu-kvm-rhev-2.3.0-18.el7.x86_64
libvirt-client-1.2.17-4.el7.x86_64

Issue occurred at: 2015-Aug-20, 14:06
VM name: rhel7_amd 
Host: amd-vfio

Comment 8 Alex Williamson 2015-08-26 15:30:08 UTC
There are at least two problems evident here. First, we only support assignment of secondary graphics devices to a guest, and in the provided sosreport the K2200 is the only graphics device in the host system. Second, we recommend that customers use pci-stub.ids= to prevent host drivers from binding to GPUs intended for assignment, yet the dmesg clearly shows nouveau in use by the host. In addition to adding another graphics card for host primary graphics, the option pci-stub.ids=10de:13ba,10de:0fbc should be added to the host kernel command line to prevent host drivers from using the assigned GPU.

Also, only the Nvidia proprietary drivers are supported in the guest.  The nouveau driver is not supported and should be blacklisted in the guest.  There's no indication here in the bz what driver is being used in the guest (which appears to be rhel6, not rhel7).
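
The pci-stub.ids= value recommended above is a comma-separated list of vendor:device IDs. As a minimal sketch (not part of any oVirt or RHEL tooling), the Python helper below derives that parameter from `lspci -n` style output. The function name `pci_stub_ids` and the regular expression are hypothetical, and the sample lines are the `lspci -n -s 01:00` output reported by the submitter later in this bug.

```python
import re

# Matches one line of `lspci -n` output, e.g. "01:00.0 0300: 10de:13ba (rev a2)".
# Hypothetical helper pattern; it assumes the default (non-domain) slot format.
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>[0-9a-f]{4}):\s+(?P<ids>[0-9a-f]{4}:[0-9a-f]{4})\b"
)


def pci_stub_ids(lspci_n_output):
    """Build a pci-stub.ids=... kernel parameter from `lspci -n` output."""
    ids = []
    for line in lspci_n_output.splitlines():
        match = LSPCI_N_LINE.match(line.strip())
        # Keep each vendor:device pair once, in the order it was listed.
        if match and match.group("ids") not in ids:
            ids.append(match.group("ids"))
    if not ids:
        raise ValueError("no vendor:device IDs found in lspci output")
    return "pci-stub.ids=" + ",".join(ids)


# Sample input copied from the submitter's `lspci -n -s 01:00` output.
sample = """\
01:00.0 0300: 10de:13ba (rev a2)
01:00.1 0403: 10de:0fbc (rev a1)
"""
print(pci_stub_ids(sample))
```

The resulting string is what would be appended to the host kernel command line; passing only the lines for the GPU and its audio function keeps unrelated devices out of the list.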

Comment 9 Alex Williamson 2015-08-26 21:52:29 UTC
Tested RHEL6.7 and RHEL7.1 guests on a RHEL7.2 AMD 990FX based host with Quadro K4000 assignment, binding to pci-stub in host, as recommended to customers, and blacklisting nouveau in guest, using only the latest driver directly from NVIDIA in the guest.  I cannot reproduce a host reboot.  When powering off the VM, nothing out of the ordinary happens.  If I destroy the VM, I get spurious interrupts on the host from the legacy interrupt, nothing more.

Comment 10 Nisim Simsolo 2015-09-02 12:51:13 UTC
Occurred again, this time using a host with RHEL 7.2 Beta (Maipo) and kernel 3.10.0-306.0.1.el7.x86_64.
host builds:
qemu-kvm-rhev-2.3.0-21.el7.x86_64
vdsm-4.17.3-1.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64
engine:
rhevm-3.6.0-0.12.master.el6

The host is an AMD host with an Nvidia Quadro K2200 attached to a Windows 7 VM.
Trying to power off the VM ended with a host reboot (this does not always happen).
VM name is: win7_amd
issue started at: 2015-09-02 14:47:04,434 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-47) [430aa112] Correlation ID: 430aa112, Job ID: 5cfa9708-9601-467c-86fe-289869023034, Call Stack: null, Custom Event ID: -1, Message: Failed to power off VM win7_amd (Host: amd-vfio.tlv.redhat.com, User: admin@internal).
engine.log and VDSM log attached.
SOS report uploaded to my google drive: https://drive.google.com/a/redhat.com/folderview?id=0B0zN-i4uOuoBfkVQQTctcEJ2QjVzejlqa0VTbWJXa2pSendleUJ0UWNJOElDR18tWUlwcGs&usp=sharing

As for pci-stub, the GPU is not bound to pci-stub because the feature page does not require it. However, the GPU is not bound to nouveau either.

Comment 11 Nisim Simsolo 2015-09-02 12:53:56 UTC
Created attachment 1069416 [details]
engine.log 02/09/2015

Comment 12 Nisim Simsolo 2015-09-02 12:55:20 UTC
Created attachment 1069417 [details]
vdsm.log 02/09/2015

Comment 13 Alex Williamson 2015-09-02 14:16:48 UTC
(In reply to Nisim Simsolo from comment #10)
> As for PCI stub, GPU is not binded to PCI stub because there is no
> requirement for doing it in feature page. but GPU is not binded to nouveau
> as well.

Then the feature page is wrong. Nouveau must be avoided in the host and is unsupported in the guest. Only assignment of secondary devices in the host is supported. Until these configuration issues are resolved, this bug is not worth investigating.

Comment 14 Nisim Simsolo 2015-09-02 14:56:54 UTC
Martin, according to the feature page (http://www.ovirt.org/Features/hostdev_passthrough):
"The detach_detachable() call takes care of detaching the device from host (unbinding it from current drivers and binding to vfio, or pci-stub if old KVM is used - this behaviour is handled by libvirt's detachFlags call) and correctly setting permissions for /dev/vfio iommu group endpoint."

If the GPU needs to be unbound from the host kernel driver and bound to pci-stub, and nouveau also needs to be blacklisted, the feature page should be updated accordingly.

Comment 15 Alex Williamson 2015-09-02 15:07:54 UTC
GPUs are unique: host drivers are not keen to release the device, and acting as the primary graphics device on the host complicates things further. The nouveau driver in the guest is not supported and really has no valid use case for a customer. Additionally, nouveau has been known to trigger issues resulting in host crashes, especially on newer cards that are not well supported by nouveau.

Finally, if we want to have any hope of diagnosing a host reboot, please provide a serial console log from the host system or at least a crash dump.  AFAICT, there is no information relevant to the host kernel reboot in any of the provided logs.

Comment 16 Martin Polednik 2015-09-02 15:24:34 UTC
Alex, thanks for the explanation. As far as I understand, everything should be fine except for GPUs, where we will need additional instructions to manually append pci-stub.ids to the kernel cmdline and possibly blacklist the nouveau driver.

Nisim, I'll update the wiki as soon as I'm sure how to formulate it.

Comment 17 Martin Polednik 2015-09-16 12:25:51 UTC
Added to wiki with reference to vfio-pci blog:

http://www.ovirt.org/Features/hostdev_passthrough#GPU_passthrough

Comment 18 Omer Frenkel 2015-09-16 14:20:56 UTC
No code change is needed, but we need to update the docs with this info.

Comment 19 Yaniv Kaul 2015-09-17 14:30:29 UTC
Omer - how do we expect users to perform any of the above in RHEVH?

Comment 20 Omer Frenkel 2015-09-17 14:52:15 UTC
The host type (RHEL/RHEV-H) doesn't matter. All the configuration is done through RHEV-M (UI/API): once the hardware supports IOMMU, the user can see which devices are available on the host and attach them to VMs.

Comment 21 Yaniv Kaul 2015-09-17 15:14:48 UTC
Omer - http://www.ovirt.org/Features/hostdev_passthrough#GPU_passthrough seems to suggest that for GPU a bit of extra effort is required.

Comment 22 Michal Skrivanek 2015-09-18 12:48:23 UTC
(In reply to Yaniv Kaul from comment #21)
> Omer - http://www.ovirt.org/Features/hostdev_passthrough#GPU_passthrough
> seems to suggest that for GPU a bit of extra effort is required.

Yes, there is additional manual configuration, which can be especially tricky on RHEV-H:
- The iommu kernel parameter needs to be added for any vfio support (we have https://gerrit.ovirt.org/#/c/41507 but there was not much enthusiasm about it, so for now this is manual).
- Specific Nvidia options, such as blacklisting, need to be added to both host and guest for Nvidia GPU passthrough.
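
Since these kernel parameters are added by hand, it can help to confirm that the running kernel actually received them. The following is a minimal sketch under stated assumptions: the helper names `kernel_params` and `missing_params` are hypothetical, the sample command line is illustrative rather than taken from a host in this bug, and `amd_iommu` and `pci-stub.ids` are used only because they are the parameters discussed in this thread.

```python
def kernel_params(cmdline):
    """Split a kernel command line into a {name: value} mapping.

    Flags without "=" (for example "quiet") map to True. If a name is
    repeated (for example "console"), the last occurrence wins.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else True
    return params


def missing_params(cmdline, required_names):
    """Return the required parameter names that are absent from cmdline."""
    present = kernel_params(cmdline)
    return [name for name in required_names if name not in present]


# Illustrative command line (an assumption, not copied from a real host).
sample_cmdline = "console=tty0 crashkernel=auto rhgb quiet amd_iommu=on"
print(missing_params(sample_cmdline, ["amd_iommu", "pci-stub.ids"]))
```

On a live host the input would come from reading `/proc/cmdline`, which reflects the parameters the kernel was booted with rather than what is written in the boot loader configuration.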

Comment 23 Michal Skrivanek 2015-09-18 12:58:56 UTC
Nisim, can you confirm it works for you following the additional configuration?

Comment 24 Nisim Simsolo 2015-10-07 08:53:13 UTC
The procedure in the feature page does not work and is unclear (I would expect a more detailed procedure, for example where the kernel cmdline is located).
Following the steps below, according to the feature page, shows that the GPU is still attached to nouveau:
#lspci -n
-Find GPU controller and audio device:
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GL [Quadro K2200] (rev a2)
01:00.1 Audio device: NVIDIA Corporation Device 0fbc (rev a1)
# lspci -n -s 01:00
01:00.0 0300: 10de:13ba (rev a2)
01:00.1 0403: 10de:0fbc (rev a1)
- Add the vendor:device IDs to the kernel cmdline:
# vi /etc/default/grub
- Append the following to GRUB_CMDLINE_LINUX: pci-stub.ids=10de:13ba,10de:0fbc
- In my case, the whole line looks like this: GRUB_CMDLINE_LINUX="nofb splash=quiet console=tty0 console=ttyS0,115200 crashkernel=auto biosdevname=0 rhgb quiet amd_iommu=on pci-stub.ids=10de:13ba,10de:0fbc"

- Refresh grub config:
# grub2-mkconfig
- Reboot host
# lspci -nnk
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
	Subsystem: NVIDIA Corporation Device [10de:1097]
	Kernel driver in use: nouveau

As you can see, the GPU is still bound to nouveau instead of being detached as expected.
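
The check above reads the "Kernel driver in use" line of `lspci -nnk` by eye. As a rough sketch, the same check can be scripted; the names `drivers_in_use` and `assignment_status` are hypothetical, the parser assumes the default slot format without a PCI domain prefix, and treating `pci-stub` and `vfio-pci` as the acceptable drivers is an assumption based on the discussion in this bug. Separately, note that `grub2-mkconfig` run without `-o <file>` only prints the generated configuration to standard output, so it may be worth confirming that the parameter appears in `/proc/cmdline` after the reboot; whether that played a role here is not stated in the bug.

```python
import re

# A device header line starts with a slot such as "01:00.0".
SLOT_LINE = re.compile(r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
# Indented detail line emitted by `lspci -k`.
DRIVER_LINE = re.compile(r"^\s+Kernel driver in use:\s*(?P<driver>\S+)")


def drivers_in_use(lspci_nnk_output):
    """Map each PCI slot to its bound kernel driver (None if unbound)."""
    drivers = {}
    slot = None
    for line in lspci_nnk_output.splitlines():
        slot_match = SLOT_LINE.match(line)
        if slot_match:
            slot = slot_match.group("slot")
            drivers[slot] = None
            continue
        driver_match = DRIVER_LINE.match(line)
        if driver_match and slot is not None:
            drivers[slot] = driver_match.group("driver")
    return drivers


def assignment_status(driver):
    """Classify a driver name for passthrough purposes (assumed policy)."""
    if driver is None:
        return "no driver bound"
    if driver in ("pci-stub", "vfio-pci"):
        return "held for assignment"
    return "still bound to a host driver"


# Sample copied from the `lspci -nnk` output in this comment.
sample = (
    "01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL "
    "[Quadro K2200] [10de:13ba] (rev a2)\n"
    "\tSubsystem: NVIDIA Corporation Device [10de:1097]\n"
    "\tKernel driver in use: nouveau\n"
)
for slot, driver in drivers_in_use(sample).items():
    print(slot, driver, assignment_status(driver))
```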

Comment 25 Michal Skrivanek 2015-10-07 10:31:56 UTC
(In reply to Nisim Simsolo from comment #24)

> As you can see, GPU is not detached from nouveau as expected.

Hm. I'm afraid there is indeed some issue with either your guest or the procedure. Still, this doesn't really help with passthrough testing, as detaching the NVIDIA GPU from the nouveau driver is a prerequisite.

Comment 26 Martin Polednik 2015-10-08 10:28:35 UTC
Updated the wiki with a better explanation of driver blacklisting.

Comment 27 Nisim Simsolo 2015-10-08 10:36:30 UTC
Bug verified, this time using the correct pci-stub procedure and nouveau blacklisting (as explained in the wiki), and on different VM operating systems (Windows and Linux).
The exact pci-stub and blacklisting scenario has also been added to the test plan setup preparation.

Verification version: 
rhevm-3.6.0-0.18.el6
sanlock-3.2.4-1.el7.x86_64
vdsm-4.17.8-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-24.el7.x86_64
libvirt-client-1.2.17-5.el7.x86_64

Comment 28 Ilanit Stein 2015-10-08 12:16:56 UTC
Martin,

Would you please add here a link to the relevant documentation, or the documentation content itself, that you would have liked to be added to the admin guide?

Thanks,
Ilanit.

Comment 29 Michal Skrivanek 2015-10-23 14:02:00 UTC
http://www.ovirt.org/Features/hostdev_passthrough#GPU_passthrough now has more detailed information that is useful to the doc team.

Comment 30 Sandro Bonazzola 2015-10-26 12:47:37 UTC
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted to next week, Nov 4th 2015.
Please review this bug and if not a blocker, please postpone to a later release.
All bugs not postponed on GA release will be automatically re-targeted to

- 3.6.1 if severity >= high
- 4.0 if severity < high

Comment 31 Lucy Bopf 2016-02-16 03:56:45 UTC
Documentation around GPU passthrough and related limitations is being tracked in bug 1285799. I am closing this bug as a duplicate.

*** This bug has been marked as a duplicate of bug 1285799 ***