Bug 1036577 - host kernel panic when assigning Nvidia GPU device to linux guest
Summary: host kernel panic when assigning Nvidia GPU device to linux guest
Keywords:
Status: CLOSED DUPLICATE of bug 1033345
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-02 10:11 UTC by FuXiangChun
Modified: 2013-12-03 02:27 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-02 14:18:01 UTC
Target Upstream Version:


Attachments
host console log (41.29 KB, text/plain)
2013-12-02 10:12 UTC, FuXiangChun
win2008r2 guest is using GPU device (22.20 KB, image/png)
2013-12-02 10:17 UTC, FuXiangChun

Description FuXiangChun 2013-12-02 10:11:19 UTC
Description of problem:
Booting a RHEL7.0 or RHEL6.5 guest with an assigned Nvidia GPU device (K1) causes a host kernel panic. I attached the host's console log.

Booting a win2008r2 guest with the same Nvidia GPU device (K1) and disabling the emulated VGA inside the guest does not trigger the panic. The guest works well with the Nvidia GPU device. I attached a screenshot.

Version-Release number of selected component (if applicable):
host and guest kernel version:
# uname -r
3.10.0-57.el7.x86_64

qemu-kvm version:
qemu-kvm-1.5.3-20.el7.x86_64

How reproducible:
100%


Steps to Reproduce:
1. Add the nouveau.modeset=0 option to the kernel line of the host, and ensure the host supports the Nvidia GPU card

2. Create an xorg.conf file inside the guest
#Xorg -configure :1
add BusID "PCI:6:0:0" to xorg.conf

3. Unbind the Nvidia device from its current host driver and bind it to vfio-pci
#modprobe vfio-pci
#modprobe vfio
#modprobe vfio_iommu_type1
#echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
#echo "10de 0ff2" > /sys/bus/pci/drivers/vfio-pci/new_id
#echo 0000:06:00.0 > /sys/bus/pci/devices/0000\:06\:00.0/driver/unbind
#echo 0000:06:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

4. Check that the device is now bound to vfio-pci
#ls /sys/bus/pci/drivers/vfio-pci/
0000:06:00.0  0000:07:00.0  0000:08:00.0  0000:09:00.0	bind  module  new_id  remove_id  uevent  unbind

5. Boot the Linux guest from the command line
#/usr/libexec/qemu-kvm -M pc -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -usb -device usb-tablet,id=input0 -name gpu -uuid 990ea161-6b67-47b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -drive file=guest/rhel7.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,vectors=0,bus=pci.0,addr=0x4,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:21,bus=pci.0,addr=0x5 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :2 -monitor stdio -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0,id=sr0 -device vfio-pci,host=06:00.0,id=GPU-k1,addr=06.0
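
Steps 3 and 4 above can be sketched as a small shell helper. This is only an illustration of the same sysfs writes, not a tested replacement for them: the function names rebind_to_vfio and current_driver and the SYSFS_ROOT variable are hypothetical and were added here so the logic can be dry-run against a fake directory tree. The vfio modules are assumed to be loaded already, as in step 3.

```shell
#!/bin/sh
# Sketch of steps 3 and 4. Assumes vfio-pci is already loaded (modprobe vfio-pci).
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

# Print the name of the driver a PCI device is currently bound to.
# Returns non-zero if the device has no driver.
current_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# Rebind one PCI device to vfio-pci.
#   $1: full PCI address, e.g. 0000:06:00.0
#   $2: "vendor device" ID pair, e.g. "10de 0ff2"
rebind_to_vfio() {
    dev="$1"
    ids="$2"
    root="${SYSFS_ROOT:-/sys}"
    drv="$root/bus/pci/drivers/vfio-pci"

    [ -d "$drv" ] || { echo "vfio-pci driver not loaded" >&2; return 1; }

    # Teach vfio-pci about this vendor/device ID.
    echo "$ids" > "$drv/new_id" || return 1

    # Detach the device from whatever driver currently owns it, if any.
    if [ -e "$root/bus/pci/devices/$dev/driver/unbind" ]; then
        echo "$dev" > "$root/bus/pci/devices/$dev/driver/unbind" || return 1
    fi

    # Attach the device to vfio-pci.
    echo "$dev" > "$drv/bind"
}
```

Run as root, a call such as rebind_to_vfio 0000:06:00.0 "10de 0ff2" followed by current_driver 0000:06:00.0 would cover the same ground as the manual commands and the ls check in step 4.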


Actual results:
The host kernel panics. I attached the host console log.

Expected results:
Host and guest both work well.

Additional info:

Comment 1 FuXiangChun 2013-12-02 10:12:51 UTC
Created attachment 831506 [details]
host console log

Comment 2 FuXiangChun 2013-12-02 10:17:41 UTC
Created attachment 831507 [details]
win2008r2 guest is using GPU device

Comment 3 FuXiangChun 2013-12-02 10:20:07 UTC
Tested runlevels 3 and 5 for the RHEL guest. Both hit the same issue.

Comment 4 juzhang 2013-12-02 10:26:57 UTC
Hi Alex,

QE is testing the GPU device assignment feature and hit this bug. Could you have a look and confirm whether this is a real bug? If not, could you please point out the steps we missed?

Best Regards,
Junyi

Comment 6 Alex Williamson 2013-12-02 14:18:01 UTC
You need to add nouveau.modeset=0 in the guest as well as the host.  Running nouveau on this hardware in the guest hits the same problem as running nouveau on the host.

*** This bug has been marked as a duplicate of bug 1033345 ***

Comment 7 juzhang 2013-12-03 02:27:24 UTC
(In reply to Alex Williamson from comment #6)
> You need to add nouveau.modeset=0 in the guest as well as the host.  Running
> nouveau on this hardware in the guest hits the same problem as running
> nouveau on the host.
> 
> *** This bug has been marked as a duplicate of bug 1033345 ***

The problem is fixed after following your suggestion. QE will continue to test the GPU device assignment feature.
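
For anyone repeating the test, the fix from comment #6 can be checked inside the guest (and on the host) before starting X. A minimal sketch, assuming the option is passed on the kernel command line; the function name nouveau_modeset_disabled and its optional file argument are hypothetical and exist only to make the check testable:

```shell
#!/bin/sh
# Return 0 if the kernel command line contains nouveau.modeset=0.
#   $1 (optional): file to read instead of /proc/cmdline (hypothetical test hook)
nouveau_modeset_disabled() {
    cmdline_file="${1:-/proc/cmdline}"
    # Split the command line into one argument per line and look for
    # an exact, literal match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF 'nouveau.modeset=0'
}

# Example use:
#   nouveau_modeset_disabled || echo "nouveau.modeset=0 is missing" >&2
```

One way to make the option persistent on a RHEL guest is grubby --update-kernel=ALL --args="nouveau.modeset=0" followed by a reboot; this is a suggestion and was not part of the original reproduction steps.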

