Bug 1038417

Summary: QEMU core dumped when do vfio-pci PF assign with '-no-kvm' mode specified (Broadcom BCM57810 and 82576 card)
Product: Red Hat Enterprise Linux 7
Reporter: Sibiao Luo <sluo>
Component: qemu-kvm
Assignee: Bandan Das <bdas>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: acathrow, alex.williamson, chayang, hhuang, juzhang, mazhang, michen, qzhang, virt-maint, xfu
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-02-24 17:21:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1050219
Attachments:
Description Flags
qemu-kvm crash log(82576) none

Description Sibiao Luo 2013-12-05 05:29:34 UTC
Description of problem:
When qemu-kvm is booted with '-no-kvm' (that is, without '-enable-kvm') and a PF is assigned to the guest through vfio-pci, qemu core dumps.

Version-Release number of selected component (if applicable):
host info:
3.10.0-57.el7.x86_64
qemu-kvm-1.5.3-20.el7.x86_64
seabios-1.7.2.2-4.el7.x86_64
guest info:
3.10.0-57.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Load the vfio-pci, vfio, and vfio_iommu_type1 modules.
# lsmod | grep vfio
vfio_pci               36474  0 
vfio_iommu_type1       17636  0 
vfio                   20777  2 vfio_iommu_type1,vfio_pci

2. Check which other devices are in the same IOMMU group as the PF, unbind all of them from their current drivers, and bind them to vfio-pci.
# lspci | grep -i BCM57810
08:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
08:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
# readlink /sys/bus/pci/devices/0000:08:00.0/iommu_group
../../../../kernel/iommu_groups/14
# readlink /sys/bus/pci/devices/0000:08:00.1/iommu_group
../../../../kernel/iommu_groups/14
# lspci -n -s 0000:08:00.0 | awk '{ print $3 }'
14e4:168e
# echo "14e4 168e" > /sys/bus/pci/drivers/vfio-pci/new_id
# echo 0000:08:00.0 > /sys/bus/pci/devices/0000\:08\:00.0/driver/unbind 
# echo 0000:08:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
# lspci -n -s 0000:08:00.1 | awk '{ print $3 }'
14e4:168e
# echo "14e4 168e" >> /sys/bus/pci/drivers/vfio-pci/new_id
# echo 0000:08:00.1 >> /sys/bus/pci/devices/0000\:08\:00.1/driver/unbind 
# echo 0000:08:00.1 >> /sys/bus/pci/drivers/vfio-pci/bind

3. Assign the PF to the guest with vfio-pci and specify *-no-kvm*.
e.g.:
# /usr/libexec/qemu-kvm -M pc -cpu Opteron_G5 *-no-kvm* -m 2048 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection...-device vfio-pci,host=08:00.0,id=sluo_guest_nic_pf0
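
Step 2 above finds the group members by running readlink on each function by hand. As a rough sketch, the same information can be read from the group's devices directory in sysfs. The function name iommu_group_devices and the SYSFS_ROOT override are hypothetical, introduced here only so the helper can be tried against a fake directory tree; they are not part of any tool used in this report.

```shell
# Sketch only: list every PCI device that shares an IOMMU group with $1.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
# Plain POSIX sh, so the variables below are not function-local.
iommu_group_devices() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    # Each entry under iommu_group/devices is named after a PCI address.
    for path in "$root/bus/pci/devices/$dev/iommu_group/devices"/*; do
        [ -e "$path" ] || continue   # glob did not match: no such device or group
        basename "$path"
    done
}

# Example call (prints nothing if the device does not exist on this host):
# iommu_group_devices 0000:08:00.0
```

On the reporter's host this would be expected to list both 0000:08:00.0 and 0000:08:00.1, since both report iommu_groups/14. Each listed address then needs the unbind and bind sequence from step 2.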

Actual results:
After step 3, qemu core dumps. I will paste the backtrace from the core dump later.
QEMU 1.5.3 monitor - type 'help' for more information
(qemu)
Segmentation fault (core dumped)

Expected results:
There should be no core dump; the guest should instead fail to boot with the error 'vfio-pci: error: requires KVM support'.
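
Until qemu rejects this combination itself, a launcher script could refuse it before qemu starts. The following is a minimal sketch of such a pre-flight check. The function name vfio_without_kvm is hypothetical and the check is not part of QEMU; it only inspects the argument list and assumes, as this report does, that vfio-pci assignment is not expected to work with '-no-kvm'.

```shell
# Sketch only: succeed (exit status 0) when the qemu argument list contains
# both -no-kvm and a vfio-pci device, i.e. the combination from this report.
# Plain POSIX sh, so the variables below are not function-local.
vfio_without_kvm() {
    no_kvm=0
    vfio=0
    for arg in "$@"; do
        case $arg in
            -no-kvm)              no_kvm=1 ;;  # exact match; -no-kvm-pit-reinjection is ignored
            vfio-pci|vfio-pci,*)  vfio=1 ;;    # value that follows -device
        esac
    done
    [ "$no_kvm" -eq 1 ] && [ "$vfio" -eq 1 ]
}

# Example use in a wrapper script:
# if vfio_without_kvm "$@"; then
#     echo "refusing to start: vfio-pci requested together with -no-kvm" >&2
#     exit 1
# fi
```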

Additional info:
# /usr/libexec/qemu-kvm -M pc -cpu Opteron_G5 -no-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -no-kvm-pit-reinjection -usb -device usb-tablet,id=input0 -name sluo -uuid 990ea161-6b67-47b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port2 -drive file=/home/RHEL-7.0-20131127.1_Server_x86_64.qcow2,if=none,id=drive-disk,cache=none,format=qcow2,aio=native,werror=stop,rerror=stop -device virtio-blk-pci,vectors=0,bus=pci.0,addr=0x4,scsi=off,drive=drive-disk,id=system-disk,bootindex=1 -net none -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :1 -spice disable-ticketing,port=5931 -monitor stdio -device vfio-pci,host=08:00.0,id=sluo_guest_nic_pf0

Comment 1 Sibiao Luo 2013-12-05 05:32:39 UTC
Core was generated by `/usr/libexec/qemu-kvm -M pc -cpu Opteron_G5 -no-kvm -m 2048 -smp 2,sockets=2,co'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fe522ebbb7e in qemu_set_irq (irq=0x702d6f6974726976, level=0) at hw/core/irq.c:38
38	    irq->handler(irq->opaque, irq->n, level);

(gdb) bt
#0  0x00007fe522ebbb7e in qemu_set_irq (irq=0x702d6f6974726976, level=0) at hw/core/irq.c:38
#1  0x00007fe522ff6580 in vfio_disable_intx (vdev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:564
#2  vfio_disable_interrupts (vdev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:2190
#3  0x00007fe522ff8bfd in vfio_pci_pre_reset (vdev=vdev@entry=0x7fe524633010)
    at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:2777
#4  0x00007fe522ff92dd in vfio_pci_reset (dev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:3721
#5  0x00007fe522ec3a19 in qdev_reset_one (dev=dev@entry=0x7fe524633010, opaque=opaque@entry=0x0) at hw/core/qdev.c:227
#6  0x00007fe522ec3110 in qdev_walk_children (dev=dev@entry=0x7fe524633010, 
    devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, 
    opaque=opaque@entry=0x0) at hw/core/qdev.c:376
#7  0x00007fe522ec31ad in qdev_reset_all (dev=dev@entry=0x7fe524633010) at hw/core/qdev.c:243
#8  0x00007fe522f047ed in pci_device_reset (dev=0x7fe524633010) at hw/pci/pci.c:180
#9  0x00007fe522f049a2 in pci_bus_reset (bus=0x7fe52459b710) at hw/pci/pci.c:226
#10 0x00007fe522f049e9 in pcibus_reset (qbus=<optimized out>) at hw/pci/pci.c:233
#11 0x00007fe522ec31f0 in qbus_walk_children (bus=bus@entry=0x7fe52459b710, 
    devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, 
    opaque=opaque@entry=0x0) at hw/core/qdev.c:353
#12 0x00007fe522ec313a in qdev_walk_children (dev=<optimized out>, devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, 
    busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:383
#13 0x00007fe522ec321a in qbus_walk_children (bus=<optimized out>, devfn=0x7fe522ec3a00 <qdev_reset_one>, 
    busfn=0x7fe522ec1a00 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:360
#14 0x00007fe522fb28fd in qemu_devices_reset () at vl.c:1809
#15 qemu_system_reset (report=report@entry=false) at vl.c:1818
#16 0x00007fe522e4b37b in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4321
(gdb) bt full
#0  0x00007fe522ebbb7e in qemu_set_irq (irq=0x702d6f6974726976, level=0) at hw/core/irq.c:38
No locals.
#1  0x00007fe522ff6580 in vfio_disable_intx (vdev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:564
        fd = <optimized out>
#2  vfio_disable_interrupts (vdev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:2190
No locals.
#3  0x00007fe522ff8bfd in vfio_pci_pre_reset (vdev=vdev@entry=0x7fe524633010)
    at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:2777
        pdev = 0x7fe524633010
        cmd = <optimized out>
#4  0x00007fe522ff92dd in vfio_pci_reset (dev=0x7fe524633010) at /usr/src/debug/qemu-1.5.3/hw/misc/vfio.c:3721
        pdev = 0x7fe524633010
        vdev = 0x7fe524633010
#5  0x00007fe522ec3a19 in qdev_reset_one (dev=dev@entry=0x7fe524633010, opaque=opaque@entry=0x0) at hw/core/qdev.c:227
No locals.
#6  0x00007fe522ec3110 in qdev_walk_children (dev=dev@entry=0x7fe524633010, 
    devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, 
    opaque=opaque@entry=0x0) at hw/core/qdev.c:376
        bus = <optimized out>
        err = <optimized out>
#7  0x00007fe522ec31ad in qdev_reset_all (dev=dev@entry=0x7fe524633010) at hw/core/qdev.c:243
No locals.
#8  0x00007fe522f047ed in pci_device_reset (dev=0x7fe524633010) at hw/pci/pci.c:180
        r = <optimized out>
#9  0x00007fe522f049a2 in pci_bus_reset (bus=0x7fe52459b710) at hw/pci/pci.c:226
        i = <optimized out>
#10 0x00007fe522f049e9 in pcibus_reset (qbus=<optimized out>) at hw/pci/pci.c:233
No locals.
#11 0x00007fe522ec31f0 in qbus_walk_children (bus=bus@entry=0x7fe52459b710, 
    devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, 
    opaque=opaque@entry=0x0) at hw/core/qdev.c:353
        kid = <optimized out>
        err = <optimized out>
#12 0x00007fe522ec313a in qdev_walk_children (dev=<optimized out>, devfn=devfn@entry=0x7fe522ec3a00 <qdev_reset_one>, 
    busfn=busfn@entry=0x7fe522ec1a00 <qbus_reset_one>, opaque=opaque@entry=0x0) at hw/core/qdev.c:383
        bus = 0x7fe52459b710
        err = <optimized out>
#13 0x00007fe522ec321a in qbus_walk_children (bus=<optimized out>, devfn=0x7fe522ec3a00 <qdev_reset_one>, 
    busfn=0x7fe522ec1a00 <qbus_reset_one>, opaque=0x0) at hw/core/qdev.c:360
        kid = 0x7fe52459b650
        err = <optimized out>
#14 0x00007fe522fb28fd in qemu_devices_reset () at vl.c:1809
        re = <optimized out>
        nre = 0x0
#15 qemu_system_reset (report=report@entry=false) at vl.c:1818
No locals.
#16 0x00007fe522e4b37b in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4321
        i = <optimized out>
        snapshot = 0
        linux_boot = 0
        icount_option = 0x0
        initrd_filename = 0x0
        kernel_filename = 0x0
        kernel_cmdline = 0x7fe5231343e0 ""
        boot_order = 0x7fe5230ef426 "cad"
        ds = <optimized out>
        cyls = 0
        heads = 0
        secs = 0
        translation = 0
        hda_opts = <optimized out>

        opts = 0x7fe5243f58e0
        machine_opts = <optimized out>
        olist = <optimized out>
        optind = 56
        optarg = 0x7fff384cd7d4 "vfio-pci,host=08:00.0,id=sluo_guest_nic_pf0"
        loadvm = 0x0
        machine = 0x7fe5234b9200 <pc_machine_rhel700>
        cpu_model = 0x7fff384cd38a "Opteron_G5"
        vga_model = 0x7fe523117c7f "cirrus"
        pid_file = 0x0
        incoming = 0x0
        show_vnc_port = 0
        defconfig = <optimized out>
        userconfig = 138
        log_mask = <optimized out>
        log_file = 0x0
        mem_trace = {malloc = 0x7fe522fb08d0 <malloc_and_trace>, realloc = 0x7fe522fb0890 <realloc_and_trace>, 
          free = 0x7fe522fb0850 <free_and_trace>, calloc = 0x0, try_malloc = 0x0, try_realloc = 0x0}
        trace_events = 0x0
        trace_file = 0x0
        __PRETTY_FUNCTION__ = "main"
        args = {machine = 0x7fe5234b9200 <pc_machine_rhel700>, ram_size = 2147483648, 
          boot_device = 0x7fe5230ef426 "cad", kernel_filename = 0x0, kernel_cmdline = 0x7fe5231343e0 "", 
          initrd_filename = 0x0, cpu_model = 0x7fff384cd38a "Opteron_G5"}
(gdb)
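
One detail in frame #0 is worth a closer look: the faulting irq value 0x702d6f6974726976 does not resemble the other pointers in the backtrace (0x7fe5...). Read as eight little-endian bytes, as stored on an x86_64 host, it decodes to printable ASCII. That hints that qemu_set_irq() was handed string data or otherwise stale memory instead of a valid qemu_irq. This is an observation from the backtrace only, not a confirmed root cause. The helper name decode_le_pointer below is hypothetical.

```shell
# Sketch only: print the bytes of a 64-bit hex value in little-endian
# (in-memory) order as text. Meant for values whose bytes are printable ASCII.
# Plain POSIX sh, so the variables below are not function-local.
decode_le_pointer() {
    hex=${1#0x}
    out=
    while [ -n "$hex" ]; do
        rest=${hex%??}                    # everything except the last two hex digits
        byte=${hex#"$rest"}               # last two digits = lowest remaining byte
        oct=$(printf '%03o' "0x$byte")    # hex byte -> three-digit octal
        out="$out$(printf "\\$oct")"      # octal escape -> character
        hex=$rest
    done
    printf '%s\n' "$out"
}

decode_le_pointer 0x702d6f6974726976   # prints: virtio-p
```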

Comment 2 mazhang 2013-12-24 06:17:06 UTC
Got the same problem with an 82576 NIC.

Host:
qemu-kvm-rhev-1.5.3-21.el7.x86_64
kernel-3.10.0-61.el7.x86_64

CLI:
gdb --args /usr/libexec/qemu-kvm \
-M pc \
-cpu Opteron_G1 \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
-no-kvm \
-name rhel7-64 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-monitor stdio \
-qmp tcp:0:6666,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-vga std \
-vnc :0 \
-drive file=/home/rhel7-64.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 \
-chardev socket,id=seabioslog,path=/tmp/seabios,server,nowait \
-device isa-debugcon,chardev=seabioslog,iobase=0x402 \
-device vfio-pci,id=pf,host=01:10.1

Result:
qemu-kvm crashes.

Comment 3 mazhang 2013-12-24 06:18:25 UTC
Created attachment 841083 [details]
qemu-kvm crash log(82576)

Comment 4 Bandan Das 2014-02-24 17:08:33 UTC
*** Bug 1037947 has been marked as a duplicate of this bug. ***

Comment 5 Bandan Das 2014-02-24 17:21:13 UTC
This is very likely a side effect of an inoperable card caused by the ROM behavior described in bug 1037956. A fix for it is being reviewed upstream.

*** This bug has been marked as a duplicate of bug 1037956 ***