Bug 1441605

Summary: qemu crash when attach a hostdev device to the guest with intel-iommu device enabled
Product: Red Hat Enterprise Linux 7
Reporter: yafu <yafu>
Component: qemu-kvm-rhev
Assignee: Bandan Das <bdas>
Status: CLOSED DEFERRED
QA Contact: Pei Zhang <pezhang>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.4
CC: ailan, alex.williamson, chayang, dyuan, jinzhao, juzhang, knoel, peterx, siliu, virt-maint, yafu, yalzhang, yfu, zpeng
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-07-22 20:29:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1622209
Bug Blocks: 1473046
Attachments:
- VM xml file
- vmcore-dmesg.txt on host

Description yafu 2017-04-12 10:25:27 UTC
Description of problem:
qemu crashes when attaching a hostdev device to a guest with the intel-iommu device enabled.

Version-Release number of selected component:
qemu-kvm-rhev-2.8.0-6.el7.x86_64
libvirt-3.2.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1.Enable iommu on the host;

2.Prepare a guest with q35 machine type and enable the intel-iommu device:
...
<os>
    <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
    <boot dev='hd'/>
</os>
...
<iommu model='intel'/>
</devices>
...

3.start the guest:
#virsh start rhel7.3

4.Prepare a hostdev device xml:
#cat pci.xml
 <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
    </hostdev>

5.Attach the hostdev device to the guest:
#virsh attach-device rhel7.3 pci.xml
error: Failed to attach device from VF1.xml
error: internal error: child reported: Kernel does not provide mount namespace: No such file or directory

6.Check the status of the guest:
# virsh domstate rhel7.3 --reason
shut off (crashed)


7.Check the qemu log of the guest:
...
Device at bus pci.8 addr 00.0 requires iommu notifier which is currently not supported by intel-iommu emulation
2017-04-12 03:07:26.093+0000: shutting down, reason=crashed

Actual results:
qemu crashed when attaching a hostdev interface to a guest with the intel-iommu device enabled.


Expected results:
The qemu guest should not crash.

Comment 2 Jingjing Shao 2017-05-19 06:45:22 UTC
Update test result

libvirt-3.2.0-5.el7.x86_64
qemu-kvm-rhev-2.9.0-4.el7.x86_64


# virsh list --all
 Id    Name                           State
----------------------------------------------------
 17    q35-js                         running


# cat vf.xml 
<interface type='hostdev' managed='yes'>
<mac address='02:24:6b:89:bc:e9'/>
<source>
<address type='pci' domain='0x0000' bus='0x86' slot='0x10' function='0x1'/>
</source>
</interface>


# virsh attach-device q35-js  vf.xml 
Device attached successfully


# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     q35-js                         shut off


# virsh domstate q35-js --reason
shut off (crashed)

Comment 6 Pei Zhang 2017-11-22 09:07:44 UTC
Created attachment 1357315 [details]
VM xml file

Hi Jingjing, Yan,

Could you please share more details about your environments and XML info? I cannot reproduce this issue. Thanks.


Below is my testing:

Network cards:X540-AT2
VM XML: see attachment of this Comment.

Versions:
3.10.0-789.el7.x86_64
qemu-kvm-rhev-2.10.0-6.el7.x86_64
libvirt-3.9.0-2.el7.x86_64

Steps:
1. Boot VM with iommu, XML file is attached to this Comment.

2. Hot plug/unplug the PF network device:
# cat rhel7.5_nonrt_nic1.xml
 <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </hostdev>

# virsh attach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device attached successfully

# virsh detach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device detached successfully


3. Repeat step 2 several (20) times; the host, qemu, and the guest all work well.

Comment 7 yafu 2017-11-24 06:35:10 UTC
(In reply to Pei Zhang from comment #6)
> Created attachment 1357315 [details]
> VM xml file
> 
> Hi Jingjing, Yan,
> 
> Could you please share more details about your environments and xml info? As
> I can not reproduce this issue. Thanks.
> 
> 
> Below is my testing:
> 
> Network cards:X540-AT2
> VM XML: see attachment of this Comment.
> 
> Versions:
> 3.10.0-789.el7.x86_64
> qemu-kvm-rhev-2.10.0-6.el7.x86_64
> libvirt-3.9.0-2.el7.x86_64
> 
> Steps:
> 1. Boot VM with iommu, XML file is attached to this Comment.
> 
> 2. Hot plug/unplug the PF network device:
> # cat rhel7.5_nonrt_nic1.xml
>  <hostdev mode='subsystem' type='pci' managed='yes'>
>       <driver name='vfio'/>
>       <source>
>         <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
>       </source>
>       <address type='pci' domain='0x0000' bus='0x04' slot='0x00'
> function='0x0'/>
>     </hostdev>
> 
> # virsh attach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
> Device attached successfully
> 
> # virsh detach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
> Device detached successfully
> 
> 
> 3. Repeat step2 several (20) times, both host , qemu and guest work well.

The guest should enable iommu before attaching a hostdev device:
(1) Edit "/etc/default/grub": append "intel_iommu=on" (or "amd_iommu=on") to the value of "GRUB_CMDLINE_LINUX=......".

(2) Then run the following command to regenerate the grub config:
# grub2-mkconfig -o /boot/grub2/grub.cfg

(3) Reboot the guest.
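
The three steps above can be sketched as a script. This is only a demo: it edits a sample copy in /tmp rather than the real /etc/default/grub (and assumes the stock single-line GRUB_CMDLINE_LINUX="..." format); the grub2-mkconfig and reboot steps are left as comments since they need root and a real guest.

```shell
# Demo copy of /etc/default/grub (assumed stock single-line format).
printf 'GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"\n' > /tmp/grub.demo

# (1) Insert intel_iommu=on before the closing quote, unless already present.
grep -q 'intel_iommu=on' /tmp/grub.demo || \
    sed -i 's/^\(GRUB_CMDLINE_LINUX=".*\)"/\1 intel_iommu=on"/' /tmp/grub.demo

grep '^GRUB_CMDLINE_LINUX' /tmp/grub.demo

# (2) On the real file, regenerate the grub config (UEFI guests write to
#     /boot/efi/EFI/redhat/grub.cfg instead):
#   grub2-mkconfig -o /boot/grub2/grub.cfg
# (3) Reboot the guest.
```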

Comment 8 yalzhang@redhat.com 2018-07-03 10:59:47 UTC
Created attachment 1456199 [details]
vmcore-dmesg.txt on host

I just encountered this failure: starting such a guest with a hostdev interface (VF) or hostdev device (PF) also crashes the host, and it happens 100% of the time with the configuration below:

1. On host, configure L1 guest as below:
# virsh dumpxml q
....
 <os>
    <type arch='x86_64' machine='pc-q35-rhel7.5.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/q_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
    <smm state='on'/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='vmx'/>
  </cpu>
<devices>
...

  <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
...
</devices>
...

2. Start the L1 guest and enable "intel_iommu=on" on the guest kernel cmdline: append "intel_iommu=on" to /etc/default/grub, then run:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

3. Destroy the L1 guest, attach one hostdev interface, and start the guest; the host will crash.
# virsh dumpxml q | grep /interface -B6
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:61:31:46'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x05' slot='0x02' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>

# virsh start q 
Domain q started

The host crashes here; the kdump file vmcore-dmesg.txt is attached.

Comment 9 yalzhang@redhat.com 2018-07-03 11:29:54 UTC
Versions for comment 8:
# rpm -q libvirt qemu-kvm-rhev kernel
libvirt-4.4.0-2.el7.x86_64
qemu-kvm-rhev-2.12.0-6.el7.x86_64
kernel-3.10.0-907.el7.x86_64

Comment 10 Bandan Das 2018-08-07 19:25:13 UTC
(In reply to yalzhang from comment #8)
> Created attachment 1456199 [details]
> vmcore-dmesg.txt on host
> 
> I just encountered this failure, when start such a guest with hostdev
> interface(VF) or hostdev device(PF) will also crash the host, and it 100%
> happen with the configuration as below:
> 
Can you please post the qemu cmd line as well ?

Comment 11 yalzhang@redhat.com 2018-08-10 07:01:15 UTC
Hi Bandan, I have tried several times but only hit the issue in comment 8 once, and I cannot recall what was special about that run. The issue in comment 0 is 100% reproducible, though, and I found it is related to the caching_mode='on' option.

1. Start a VM that has an iommu device without caching_mode='on' together with a hostdev device:
# virsh dumpxml domain2_ovmf
...
<iommu model='intel'/>
...
<interface type='hostdev' managed='yes'>
      <mac address='52:54:00:63:f3:41'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
...
# virsh start domain2_ovmf
Domain domain2_ovmf started

# virsh list 
 Id    Name                           State
----------------------------------------------------
 1     domain2_ovmf                   running

2. After a while, the domain crashes:
# virsh domstate domain2_ovmf --reason
shut off (crashed)

check the log:
# cat  /var/log/libvirt/qemu/domain2_ovmf.log
...
2018-08-10T06:34:39.748618Z qemu-kvm: We need to set caching-mode=1 for intel-iommu to enable device assignment with IOMMU protection.
2018-08-10 06:34:39.994+0000: shutting down, reason=crashed

# grep error /var/log/libvirt/libvirtd.log
2018-08-10 06:34:39.792+0000: 4967: error : qemuMonitorIO:718 : internal error: End of file from qemu monitor

Expected results:
libvirt should report an error when we hotplug a hostdev device while there is an iommu device without caching_mode=on, and libvirt should refuse to start a guest that has both a hostdev device and an iommu device lacking caching_mode=on.
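
A pre-flight check along those lines could look like this sketch (hypothetical helper, not libvirt code): it greps the domain XML for the unsafe combination of a hostdev device and an intel iommu without caching_mode='on'. It is shown against an inline sample so it runs without a host; in real use the XML would come from `virsh dumpxml`.

```shell
# Hypothetical check: flag a domain that combines a hostdev device with an
# intel iommu that lacks caching_mode='on'.
# In real use: xml=$(virsh dumpxml "$dom"); here, an inline sample.
xml="<domain><devices>
  <iommu model='intel'/>
  <interface type='hostdev' managed='yes'/>
</devices></domain>"

check_iommu_hostdev() {
    # $1: domain XML as a string
    if echo "$1" | grep -q "type='hostdev'\|<hostdev" &&
       echo "$1" | grep -q "<iommu model='intel'" &&
       ! echo "$1" | grep -q "caching_mode='on'"; then
        echo "unsafe: hostdev with intel-iommu but caching_mode != on"
        return 1
    fi
    echo "ok"
}

check_iommu_hostdev "$xml" || echo "would refuse to start/attach"
```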

Comment 12 Bandan Das 2018-08-10 16:56:43 UTC
Thank you. I will try to reproduce comment 0 then. Can you please post your qemu command line? :) By command line, I mean the qemu invocation corresponding to your XML; you can just grep for the qemu process while the guest is running.

I am not running with an XML, but I am not able to reproduce this. I think taking a look at your qemu cmd line will help me figure out what I am doing differently.

Comment 13 yalzhang@redhat.com 2018-08-13 01:38:52 UTC
Sorry, the qemu cmd is as below:

# virsh start domain3_sea ; ps aux | grep domain3_sea | grep -v grep|sed 's/-device/\n-device/g' ; sleep 20; virsh domstate domain3_sea --reason
Domain domain3_sea started

qemu     31554 64.0 14.1 1913984 1139304 ?     SLl  09:35   0:00 /usr/libexec/qemu-kvm -name guest=domain3_sea,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-14-domain3_sea/master-key.aes -machine pc-q35-rhel7.6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,kernel_irqchip=split -cpu Nehalem-IBRS -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid d2b26621-7d34-4c98-a406-31b93e5cb0a8 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=27,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on 
-device intel-iommu 
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 
-device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e 
-device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0 
-device pcie-root-port,port=0x11,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x1 
-device pcie-root-port,port=0x12,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x2 
-device pcie-root-port,port=0x13,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x3 
-device pcie-root-port,port=0x14,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x4 
-device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d 
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 
-device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 -drive file=/var/lib/libvirt/images/RHEL-7.6-x86_64-latest.qcow2.1,format=qcow2,if=none,id=drive-virtio-disk0 
-device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 
-device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=29,server,nowait 
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent 
-device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 
-device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on 
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 
-device intel-hda,id=sound0,bus=pci.3,addr=0x1 
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir 
-device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir 
-device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 
-device vfio-pci,host=03:10.6,id=hostdev0,bus=pci.1,addr=0x0 
-device virtio-balloon-pci,id=balloon0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
shut off (crashed)

Comment 15 yalzhang@redhat.com 2018-08-20 02:18:31 UTC
Hi Bandan, I have sent it, please check the mail. Let me know if you can not access it.