Bug 1046337

Summary: running domain with PCI passthrough device is "lost" during libvirtd restart
Product: Red Hat Enterprise Linux 7 Reporter: Laine Stump <laine>
Component: libvirt Assignee: Laine Stump <laine>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0 CC: acathrow, dyuan, honzhang, jmiao, mzhan, xuzhang
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-1.1.1-18.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 10:42:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Laine Stump 2013-12-24 15:41:26 UTC
If there is a running domain with a PCI passthrough device that was assigned using:

   <interface type='hostdev'>

and the device had been defined with "<model type='virtio'/>" (which is meaningless, but allowed), then when libvirtd is restarted, it will fail while loading the running domains' status:

virDomainNetDefParseXML:6763 : internal error: Unknown interface <driver name='vfio'> has been specified

and the domain will thus not be added to the list of running domains (although the qemu process is still running, and the domain is still reachable via its spice/vnc graphics device and/or network connection).

The <driver> name attribute of an interface is interpreted in two different ways depending on the <interface> type: if the interface is type='hostdev', then the driver name describes which backend to use for the hostdev device assignment (vfio or kvm), but if the interface is any emulated type *and* the model type is "virtio", then the driver name can be "vhost" or "qemu", telling which backend qemu should use to communicate with the emulated device.
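
The two interpretations described above can be modeled roughly as follows. This is only an illustrative sketch in Python, not libvirt's parser code; the function name and the handling of non-virtio emulated models are assumptions made for the example.

```python
# Illustrative model only; this is not libvirt's parser code.
HOSTDEV_BACKENDS = {"vfio", "kvm"}    # backends for hostdev device assignment
VIRTIO_BACKENDS = {"vhost", "qemu"}   # backends for emulated virtio devices


def allowed_driver_names(iface_type, model_type):
    """Return the driver names that are meaningful for this interface.

    Hypothetical helper: the name and signature are invented for this sketch.
    """
    if iface_type == "hostdev":
        # Assigned device: the model type is irrelevant, and the driver
        # name selects the assignment backend.
        return HOSTDEV_BACKENDS
    if model_type == "virtio":
        # Emulated virtio device: the driver name selects the qemu backend.
        return VIRTIO_BACKENDS
    # The description does not cover other emulated models; this sketch
    # assumes no driver name applies to them.
    return set()


print(sorted(allowed_driver_names("hostdev", "virtio")))  # ['kvm', 'vfio']
```

Checking the interface type before the model type is what keeps a hostdev interface with a stray virtio model from being validated against the "vhost"/"qemu" list.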

The problem comes when someone has defined an interface like this (which is accepted by the parser as long as no <driver name='xxx'/> is specified):

    <interface type='hostdev'>
       ...
       <model type='virtio'/>
       ...
    </interface>
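
To check whether a domain definition contains such an interface, something like the following Python sketch could be run against domain XML (for example, the output of virsh dumpxml). The function name and the sample domain are made up for illustration; only the element and attribute names come from this report.

```python
import xml.etree.ElementTree as ET


def hostdev_interfaces_with_virtio_model(domain_xml):
    """Return the MAC address (or None) of each <interface type='hostdev'>
    that also carries <model type='virtio'/>.

    Hypothetical helper written for this example.
    """
    root = ET.fromstring(domain_xml)
    hits = []
    for iface in root.iter("interface"):
        if iface.get("type") != "hostdev":
            continue
        model = iface.find("model")
        if model is not None and model.get("type") == "virtio":
            mac = iface.find("mac")
            hits.append(mac.get("address") if mac is not None else None)
    return hits


# Minimal made-up domain, just enough to exercise the check.
SAMPLE = """
<domain type='kvm'>
  <devices>
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:a5:e7:f6'/>
      <model type='virtio'/>
    </interface>
    <interface type='network'>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""

print(hostdev_interfaces_with_virtio_model(SAMPLE))  # ['52:54:00:a5:e7:f6']
```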

When libvirt stores this definition in the domain's status, the driver name is automatically filled in with the backend that libvirt selected, so it stores this in the status:

    <interface type='hostdev'>
       ...
       <driver name='vfio'/>
       ...
       <model type='virtio'/>
       ...
    </interface>

This isn't noticed until the next time libvirtd is restarted: as it reads the status of all domains, it encounters the above interface definition, logs an error, and fails to reload the domain status, so the domain is marked as inactive.

(Until recently, the driver name wasn't automatically filled in if not specified, which is why this bug didn't show up earlier).

Comment 3 Jincheng Miao 2014-01-10 08:31:21 UTC
I can reproduce it in libvirt-1.1.1-17.el7:

add interface with <model type='virtio'/>:
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:a5:e7:f6'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
      </source>
      <model type='virtio'/>
    </interface>

# virsh start r7
Domain r7 started

# virsh list
 Id    Name                           State
----------------------------------------------------
 2     r7                             running

# systemctl restart libvirtd.service

# virsh list
 Id    Name                           State
----------------------------------------------------

# ps -ef | grep qemu
qemu      5930     1 57 20:14 ?        00:00:17 /usr/libexec/qemu-kvm -name r7 -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c80b0546-3b9b-46c7-bba8-ca98693d1f1d -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/r7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/r7.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device vfio-pci,host=03:10.0,id=hostdev0,bus=pci.0,addr=0x7 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
root      6143  4431  0 20:14 pts/5    00:00:00 grep --color=auto qemu

As we can see, the qemu process is running, but libvirt can't list it.
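
The manual comparison above (qemu process present, domain missing from virsh list) could be scripted along these lines. This is a rough Python sketch that only parses text in the formats shown in this comment; the function names are invented, and it assumes the guest name appears as a plain "-name <guest>" argument on the qemu command line.

```python
def names_from_virsh_list(text):
    """Collect domain names from 'virsh list' table output."""
    names = set()
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "2  r7  running"; the header and the dashed
        # separator do not start with a numeric Id and are skipped.
        if len(parts) >= 3 and parts[0].isdigit():
            names.add(parts[1])
    return names


def names_from_ps(text):
    """Collect guest names from 'ps -ef' lines that contain '-name <guest>'."""
    names = set()
    for line in text.splitlines():
        tokens = line.split()
        if "-name" in tokens:
            idx = tokens.index("-name")
            if idx + 1 < len(tokens):
                names.add(tokens[idx + 1])
    return names


def unlisted_guests(ps_text, virsh_text):
    """Guests with a running qemu process that virsh does not list."""
    return names_from_ps(ps_text) - names_from_virsh_list(virsh_text)


PS_SAMPLE = (
    "qemu 5930 1 57 20:14 ? 00:00:17 /usr/libexec/qemu-kvm -name r7 -S\n"
    "root 6143 4431 0 20:14 pts/5 00:00:00 grep --color=auto qemu\n"
)
VIRSH_SAMPLE = (
    " Id    Name                           State\n"
    "----------------------------------------------------\n"
)

print(unlisted_guests(PS_SAMPLE, VIRSH_SAMPLE))  # {'r7'}
```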

In latest libvirt-1.1.1-18.el7:
# virsh start r7
Domain r7 started

# systemctl restart libvirtd.service

# virsh list
 Id    Name                           State
----------------------------------------------------
 10    r7                             running

The domain can be listed, which means the bug is fixed, so I am changing the status to VERIFIED.

Comment 4 Ludek Smid 2014-06-13 10:42:56 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.