Bug 1036577

Summary: host kernel panic when assigning Nvidia GPU device to linux guest
Product: Red Hat Enterprise Linux 7 Reporter: FuXiangChun <xfu>
Component: qemu-kvm Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.0 CC: acathrow, alex.williamson, juzhang, michen, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-12-02 14:18:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description: host console log; Flags: none
Description: win2008r2 guest is using GPU device; Flags: none

Description FuXiangChun 2013-12-02 10:11:19 UTC
Description of problem:
Booting a RHEL 7.0 or RHEL 6.5 guest with an assigned Nvidia GPU device (K1) causes a host kernel panic. I attached the host's console log.

Booting a win2008r2 guest with the same Nvidia GPU device (K1) and disabling the emulated VGA inside the guest works: the guest runs well with the Nvidia GPU device. I attached a screenshot.

Version-Release number of selected component (if applicable):
host and guest kernel version:
# uname -r
3.10.0-57.el7.x86_64

qemu-kvm version:
qemu-kvm-1.5.3-20.el7.x86_64

How reproducible:
100%


Steps to Reproduce:
1. Add the nouveau.modeset=0 option to the host's kernel command line, and make sure the host supports the Nvidia GPU card

2. Create an xorg.conf file inside the guest
#Xorg -configure :1
Then add BusID "PCI:6:0:0" to the generated xorg.conf

3. Unbind the Nvidia device from its current driver on the host and bind it to vfio-pci
#modprobe vfio-pci
#modprobe vfio
#modprobe vfio_iommu_type1
#echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
#echo "10de 0ff2" > /sys/bus/pci/drivers/vfio-pci/new_id
#echo 0000:06:00.0 > /sys/bus/pci/devices/0000\:06\:00.0/driver/unbind
#echo 0000:06:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

4. Check that the device is now bound to vfio-pci
#ls /sys/bus/pci/drivers/vfio-pci/
0000:06:00.0  0000:07:00.0  0000:08:00.0  0000:09:00.0	bind  module  new_id  remove_id  uevent  unbind

5. Boot the Linux guest from the command line
#/usr/libexec/qemu-kvm -M pc -enable-kvm -m 2048 \
    -smp 2,sockets=2,cores=1,threads=1 \
    -usb -device usb-tablet,id=input0 \
    -name gpu -uuid 990ea161-6b67-47b2-b803-19fb01d30d30 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -drive file=guest/rhel7.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
    -device virtio-blk-pci,vectors=0,bus=pci.0,addr=0x4,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bootindex=1 \
    -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:21,bus=pci.0,addr=0x5 \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -k en-us -boot menu=on \
    -qmp tcp:0:4444,server,nowait \
    -serial unix:/tmp/ttyS0,server,nowait \
    -vnc :2 -monitor stdio \
    -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
    -device virtio-serial \
    -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0,id=sr0 \
    -device vfio-pci,host=06:00.0,id=GPU-k1,addr=06.0
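
Editor's note: the sysfs writes in steps 3 and 4 could be wrapped in two small shell functions, roughly as sketched below. This is only an illustration, not something taken from the report: the function names `rebind_to_vfio` and `is_bound_to_vfio` and the `SYSFS_ROOT` override are hypothetical, and the override exists only so the functions can be dry-run against a scratch directory instead of the real /sys. The sketch assumes the vfio-pci module is already loaded and keeps the same order of writes as step 3 (new_id, unbind, bind).

```shell
#!/bin/sh
# Sketch only: rebind one PCI device to vfio-pci through sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

rebind_to_vfio() {
    vf_bdf=$1     # full PCI address, e.g. 0000:06:00.0
    vf_ids=$2     # "vendor device" ID pair, e.g. "10de 0ff2"
    vf_root=${SYSFS_ROOT:-/sys}
    vf_dev="$vf_root/bus/pci/devices/$vf_bdf"
    vf_drv="$vf_root/bus/pci/drivers/vfio-pci"

    [ -d "$vf_dev" ] || { echo "no such PCI device: $vf_bdf" >&2; return 1; }
    [ -d "$vf_drv" ] || { echo "vfio-pci driver is not loaded" >&2; return 1; }

    # Tell vfio-pci to accept this vendor/device ID.
    echo "$vf_ids" > "$vf_drv/new_id" || return 1

    # Detach the device from whatever driver currently owns it, if any.
    if [ -d "$vf_dev/driver" ]; then
        echo "$vf_bdf" > "$vf_dev/driver/unbind" || return 1
    fi

    # Attach the device to vfio-pci.
    echo "$vf_bdf" > "$vf_drv/bind"
}

# Succeeds if the device shows up under the vfio-pci driver directory,
# which is what the ls in step 4 checks by eye.
is_bound_to_vfio() {
    vf_bdf=$1
    vf_root=${SYSFS_ROOT:-/sys}
    [ -e "$vf_root/bus/pci/drivers/vfio-pci/$vf_bdf" ]
}
```

On a real host this would be run as root, for example `rebind_to_vfio 0000:06:00.0 "10de 0ff2" && is_bound_to_vfio 0000:06:00.0`, after the modprobe commands from step 3.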


Actual results:
The host kernel panics. I attached the host console log.

Expected results:
Both host and guest work correctly.

Additional info:

Comment 1 FuXiangChun 2013-12-02 10:12:51 UTC
Created attachment 831506 [details]
host console log

Comment 2 FuXiangChun 2013-12-02 10:17:41 UTC
Created attachment 831507 [details]
win2008r2 guest is using GPU device

Comment 3 FuXiangChun 2013-12-02 10:20:07 UTC
Tested runlevels 3 and 5 for the RHEL guest. Both hit the same issue.

Comment 4 juzhang 2013-12-02 10:26:57 UTC
Hi Alex,

QE is testing the GPU device assignment feature and hit this bug. Could you take a look and confirm whether this is a real bug? If not, could you please point out the steps we missed?

Best Regards,
Junyi

Comment 6 Alex Williamson 2013-12-02 14:18:01 UTC
You need to add nouveau.modeset=0 in the guest as well as the host.  Running nouveau on this hardware in the guest hits the same problem as running nouveau on the host.

*** This bug has been marked as a duplicate of bug 1033345 ***

Comment 7 juzhang 2013-12-03 02:27:24 UTC
(In reply to Alex Williamson from comment #6)
> You need to add nouveau.modeset=0 in the guest as well as the host.  Running
> nouveau on this hardware in the guest hits the same problem as running
> nouveau on the host.
> 
> *** This bug has been marked as a duplicate of bug 1033345 ***

The problem is fixed after following your suggestion. QE will continue to test the GPU device assignment feature.
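
Editor's note: since the fix in comment 6 depends on nouveau.modeset=0 being present in the guest as well as the host, it may help to confirm the option on the running kernel's command line before booting with the GPU assigned. The sketch below is an illustration only; `has_kernel_arg` is a hypothetical helper name, and the optional second argument exists just so the check can be pointed at a file other than /proc/cmdline.

```shell
#!/bin/sh
# Sketch only: succeed if the exact argument appears on the kernel command line.
# Usage: has_kernel_arg ARG [CMDLINE_FILE]   (CMDLINE_FILE defaults to /proc/cmdline)

has_kernel_arg() {
    ka_want=$1
    ka_file=${2:-/proc/cmdline}
    # Split the command line into one word per line, then look for an exact,
    # whole-word, fixed-string match so "nouveau.modeset=01" does not count.
    tr ' ' '\n' < "$ka_file" | grep -qxF -- "$ka_want"
}
```

Run inside the guest and on the host, for example `has_kernel_arg nouveau.modeset=0 || echo "nouveau.modeset=0 is missing"`. On RHEL, one common way to add the option persistently is `grubby --update-kernel=ALL --args="nouveau.modeset=0"` followed by a reboot, although this report does not say how the option was set.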