Bug 1665244

Summary: libvirt segfaults with vfio-pci hostdev device
Product: Red Hat Enterprise Linux 8
Reporter: Alex Williamson <alex.williamson>
Component: libvirt
Assignee: John Ferlan <jferlan>
Status: CLOSED CURRENTRELEASE
QA Contact: yalzhang <yalzhang>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 8.0
CC: alex.williamson, blc, chhu, jdenemar, jferlan, knoel, mtessun, nanliu, rbalakri, virt-bugs, xuzhang, yafu, yalzhang
Target Milestone: rc
Keywords: Regression
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-4.5.0-18.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1665474
Environment:
Last Closed: 2019-06-14 01:30:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1665474

Description Alex Williamson 2019-01-10 19:02:56 UTC
Description of problem:

With a PCI assigned device included in the domain XML, libvirtd segfaults when the domain is started.  For example:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>


Version-Release number of selected component (if applicable):
libvirt-4.5.0-17.module+el8+2625+db702f9d.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Launch a VM with a vfio-pci assigned device

Actual results:
libvirtd segfaults

Expected results:
The VM starts and libvirtd stays operational

Additional info:

Thread 4 "libvirtd" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f8245695700 (LWP 6556)]
0x00007f8250008646 in virSCSIDeviceGetAdapterId (
    adapter=0x200000000 <error: Cannot access memory at address 0x200000000>, 
    adapter_id=0x7f8245694474) at util/virscsi.c:100
100	    if (STRPREFIX(adapter, "scsi_host") &&
(gdb) bt
#0  0x00007f8250008646 in virSCSIDeviceGetAdapterId (
    adapter=0x200000000 <error: Cannot access memory at address 0x200000000>, 
    adapter_id=0x7f8245694474) at util/virscsi.c:100
#1  0x00007f82500088d2 in virSCSIDeviceGetDevName (
    sysfs_prefix=sysfs_prefix@entry=0x0, adapter=<optimized out>, bus=0, 
    target=0, unit=0) at util/virscsi.c:163
#2  0x00007f82077d933b in qemuGetHostdevPath (
    hostdev=hostdev@entry=0x7f81e8322640) at qemu/qemu_conf.c:1454
#3  0x00007f82077dcdfe in qemuSetUnprivSGIO (dev=dev@entry=0x7f8245694570)
    at qemu/qemu_conf.c:1670
#4  0x00007f82077cae08 in qemuHostdevPrepareSCSIDevices (
    driver=driver@entry=0x7f81e8106c60, name=0x7f81e8321f60 "fedora29", 
    hostdevs=0x7f81e8322290, nhostdevs=1) at qemu/qemu_hostdev.c:287
#5  0x00007f82077cb015 in qemuHostdevPrepareDomainDevices (
    driver=driver@entry=0x7f81e8106c60, def=0x7f81e83218a0, 
    qemuCaps=<optimized out>, flags=3) at qemu/qemu_hostdev.c:356
#6  0x00007f82077e50d5 in qemuProcessPrepareHost (
    driver=driver@entry=0x7f81e8106c60, vm=vm@entry=0x7f81e8323af0, 
    flags=flags@entry=17) at qemu/qemu_process.c:6151
#7  0x00007f82077eb51f in qemuProcessStart (conn=conn@entry=0x7f8230001120, 
    driver=driver@entry=0x7f81e8106c60, vm=vm@entry=0x7f81e8323af0, 
    updatedCPU=updatedCPU@entry=0x0, 
    asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_START, 
    migrateFrom=migrateFrom@entry=0x0, migrateFd=-1, migratePath=0x0, 
    snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=17)
    at qemu/qemu_process.c:6730
#8  0x00007f820784aded in qemuDomainObjStart (conn=0x7f8230001120, 
    driver=driver@entry=0x7f81e8106c60, vm=0x7f81e8323af0, 
    flags=flags@entry=0, asyncJob=QEMU_ASYNC_JOB_START)
    at qemu/qemu_driver.c:7290
#9  0x00007f820784b439 in qemuDomainCreateWithFlags (dom=0x7f823c000b90, 
    flags=0) at qemu/qemu_driver.c:7343
#10 0x00007f82501b13c7 in virDomainCreate (domain=domain@entry=0x7f823c000b90)
    at libvirt-domain.c:6531
#11 0x000055cd8093f33e in remoteDispatchDomainCreate (server=0x55cd80d8eac0, 
    msg=0x55cd80df41f0, args=<optimized out>, rerr=0x7f8245694960, 
    client=0x55cd80df1990) at remote/remote_daemon_dispatch_stubs.h:4434
#12 remoteDispatchDomainCreateHelper (server=0x55cd80d8eac0, 
    client=0x55cd80df1990, msg=0x55cd80df41f0, rerr=0x7f8245694960, 
    args=<optimized out>, ret=0x7f823c000b50)
    at remote/remote_daemon_dispatch_stubs.h:4410
#13 0x00007f82500e9074 in virNetServerProgramDispatchCall (msg=0x55cd80df41f0, 
    client=0x55cd80df1990, server=0x55cd80d8eac0, prog=0x55cd80df8030)
    at rpc/virnetserverprogram.c:437
#14 virNetServerProgramDispatch (prog=0x55cd80df8030, 
    server=server@entry=0x55cd80d8eac0, client=0x55cd80df1990, 
    msg=0x55cd80df41f0) at rpc/virnetserverprogram.c:304
#15 0x00007f82500ef54c in virNetServerProcessMsg (msg=<optimized out>, 
    prog=<optimized out>, client=<optimized out>, srv=0x55cd80d8eac0)
    at rpc/virnetserver.c:143
#16 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x55cd80d8eac0)
    at rpc/virnetserver.c:164
#17 0x00007f825001f4c0 in virThreadPoolWorker (
    opaque=opaque@entry=0x55cd80d99000) at util/virthreadpool.c:167
#18 0x00007f825001e7cc in virThreadHelper (data=<optimized out>)
    at util/virthread.c:206
#19 0x00007f824ce992de in start_thread () from /lib64/libpthread.so.0
#20 0x00007f824cbc9a63 in clone () from /lib64/libc.so.6

Bisecting brew builds shows that this was introduced in libvirt-4.5.0-17.module+el8+2625+db702f9d.

Comment 4 John Ferlan 2019-01-10 23:41:20 UTC
Patches have been posted upstream so that qemuHostdevPrepareSCSIDevices skips non-SCSI hostdevs, which would otherwise end up being passed to qemuSetUnprivSGIO:

https://www.redhat.com/archives/libvir-list/2019-January/msg00292.html

In particular, patch 1 of the series resolves the issue.
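
The filtering idea can be illustrated with a minimal, self-contained sketch. This is not the libvirt patch itself: every type and function name below (struct hostdev, hostdev_is_scsi, prepare_scsi_hostdevs) is a hypothetical, simplified stand-in chosen for illustration. It assumes the hostdev source is stored in a tagged union, so that reading the SCSI member of a PCI device yields a meaningless adapter pointer, as seen in frame #0 of the backtrace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not libvirt's real types. */
enum hostdev_type { HOSTDEV_PCI, HOSTDEV_SCSI, HOSTDEV_MDEV };

struct pci_source  { unsigned int domain, bus, slot, function; };
struct scsi_source { const char *adapter; unsigned int bus, target, unit; };

struct hostdev {
    enum hostdev_type type;      /* tag: tells which union member is valid */
    union {
        struct pci_source pci;
        struct scsi_source scsi;
    } source;
};

/* Returns nonzero only when it is safe to read dev->source.scsi. */
int hostdev_is_scsi(const struct hostdev *dev)
{
    return dev->type == HOSTDEV_SCSI;
}

/*
 * Walk the hostdev list and count the entries that SCSI-only processing
 * would handle. Non-SCSI entries are skipped before any SCSI field is
 * read, which is the filter that was missing in the crashing code path.
 */
size_t prepare_scsi_hostdevs(struct hostdev *const *hostdevs, size_t nhostdevs)
{
    size_t handled = 0;

    assert(hostdevs != NULL || nhostdevs == 0);

    for (size_t i = 0; i < nhostdevs; i++) {
        const struct hostdev *dev = hostdevs[i];

        if (!hostdev_is_scsi(dev))
            continue;            /* PCI, mdev, ...: not ours to touch */

        /* Safe: the tag confirms that the scsi member is the active one. */
        if (dev->source.scsi.adapter != NULL)
            handled++;
    }
    return handled;
}
```

With a list holding one PCI hostdev and one SCSI hostdev, only the SCSI entry is processed; the PCI entry's address fields are never reinterpreted as an adapter name.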

Comment 6 yalzhang@redhat.com 2019-01-11 06:06:45 UTC
I can reproduce the libvirtd crash:
# rpm -q libvirt
libvirt-4.5.0-17.module+el8+2625+db702f9d.x86_64

# pidof libvirtd
13717

# virsh dumpxml rhel1 | grep /hostdev -B5
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>

# virsh start rhel1
error: Disconnected from qemu:///system due to end of file
error: Failed to start domain rhel1
error: End of file while reading data: Input/output error

# pidof libvirtd
13836

Comment 8 Jiri Denemark 2019-01-11 11:45:24 UTC
*** Bug 1665416 has been marked as a duplicate of this bug. ***

Comment 10 John Ferlan 2019-01-11 14:12:11 UTC
Upstream patch was pushed:

commit f30ac207ad96a567ade0d8a49023ade9233b2b72
Author: John Ferlan <jferlan>
Date:   Thu Jan 10 18:05:12 2019 -0500

    qemu: Filter non SCSI hostdevs in qemuHostdevPrepareSCSIDevices
    
    When commit 1d94b3e7 added code to walk the [n]hostdevs list looking
    to add shared hostdevs, it should've filtered any hostdevs that were
    not SCSI hostdev's.
    
    Signed-off-by: John Ferlan <jferlan>
    Reviewed-by: Ján Tomko <jtomko>

$ git describe f30ac207ad96a567ade0d8a49023ade9233b2b72
v5.0.0-rc1-1-gf30ac207ad

Comment 12 Brendan Conoboy 2019-01-14 17:58:32 UTC
This has been reviewed by the leads out of band.  We are marking this as a blocker for 8.0.0 release.

Comment 14 yalzhang@redhat.com 2019-01-17 12:07:51 UTC
Verified this bug on libvirt-4.5.0-18.module+el8+2691+dc742e5d.x86_64

1. Start guest with PCI hostdev device successfully
# virsh dumpxml rhel1 | grep /hostdev -B5
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x10' function='0x5'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>
# pidof libvirtd; virsh start rhel1; pidof libvirtd
30020
Domain rhel1 started

30020

2. Start guest with mdev device successfully
# virsh dumpxml q35 | grep /hostdev -B5
    <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='on'>
      <source>
        <address uuid='1725be71-3eec-45d5-a19e-c2e9a696faf6'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
# pidof libvirtd; virsh start q35;pidof libvirtd
16696
Domain q35 started

16696