Bug 1441605 - qemu crash when attach a hostdev device to the guest with intel-iommu device enabled
Summary: qemu crash when attach a hostdev device to the guest with intel-iommu device...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Bandan Das
QA Contact: Pei Zhang
URL:
Whiteboard:
Depends On: 1622209
Blocks: 1473046
 
Reported: 2017-04-12 10:25 UTC by yafu
Modified: 2019-07-22 20:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-22 20:29:20 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
VM xml file (4.56 KB, text/html)
2017-11-22 09:07 UTC, Pei Zhang
vmcore-dmesg.txt on host (148.90 KB, text/plain)
2018-07-03 10:59 UTC, yalzhang@redhat.com

Description yafu 2017-04-12 10:25:27 UTC
Description of problem:
qemu crashes when a hostdev device is attached to a guest with the intel-iommu device enabled.

Version-Release number of selected component:
qemu-kvm-rhev-2.8.0-6.el7.x86_64
libvirt-3.2.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Enable iommu on the host.

2. Prepare a guest with the q35 machine type and enable the intel-iommu device:
...
<os>
    <type arch='x86_64' machine='pc-q35-rhel7.4.0'>hvm</type>
    <boot dev='hd'/>
</os>
...
<iommu model='intel'/>
</devices>
...

3. Start the guest:
#virsh start rhel7.3

4.Prepare a hostdev device xml:
#cat pci.xml
 <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
    </hostdev>

5.Attach the hostdev device to the guest:
#virsh attach-device rhel7.3 pci.xml
error: Failed to attach device from VF1.xml
error: internal error: child reported: Kernel does not provide mount namespace: No such file or directory

6.Check the status of the guest:
# virsh domstate rhel7.3 --reason
shut off (crashed)


7.Check the qemu log of the guest:
...
Device at bus pci.8 addr 00.0 requires iommu notifier which is currently not supported by intel-iommu emulation
2017-04-12 03:07:26.093+0000: shutting down, reason=crashed

Actual results:
qemu crashes when a hostdev device is attached to the guest with the intel-iommu device enabled.


Expected results:
The qemu guest should not crash.

Comment 2 Jingjing Shao 2017-05-19 06:45:22 UTC
Update test result

libvirt-3.2.0-5.el7.x86_64
qemu-kvm-rhev-2.9.0-4.el7.x86_64


# virsh list --all
 Id    Name                           State
----------------------------------------------------
 17    q35-js                         running


# cat vf.xml 
<interface type='hostdev' managed='yes'>
<mac address='02:24:6b:89:bc:e9'/>
<source>
<address type='pci' domain='0x0000' bus='0x86' slot='0x10' function='0x1'/>
</source>
</interface>


# virsh attach-device q35-js  vf.xml 
Device attached successfully


# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     q35-js                         shut off


# virsh domstate q35-js --reason
shut off (crashed)
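
The state check used in this report (`virsh domstate <domain> --reason`) can be scripted for repeated runs. Below is a minimal Python sketch that only parses the text printed by that command. The helper names `parse_domstate` and `guest_crashed` are made up for illustration, and the "state (reason)" layout is assumed from the output shown in this bug.

```python
import re
from typing import Optional, Tuple


def parse_domstate(output: str) -> Tuple[str, Optional[str]]:
    """Split 'virsh domstate --reason' text into (state, reason).

    Assumes the 'state (reason)' layout seen in this report,
    e.g. 'shut off (crashed)'. The reason is None when absent.
    """
    stripped = output.strip()
    line = stripped.splitlines()[0].strip() if stripped else ""
    match = re.fullmatch(r"(.+?)\s*\((.+)\)", line)
    if match:
        return match.group(1), match.group(2)
    return line, None


def guest_crashed(output: str) -> bool:
    """True when the domain is reported as shut off because it crashed."""
    return parse_domstate(output) == ("shut off", "crashed")
```

In a test loop, the captured stdout of the `virsh domstate` call would be passed to `guest_crashed` after each attach attempt.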

Comment 6 Pei Zhang 2017-11-22 09:07:44 UTC
Created attachment 1357315 [details]
VM xml file

Hi Jingjing, Yan,

Could you please share more details about your environment and XML? I cannot reproduce this issue. Thanks.


Below is my testing:

Network card: X540-AT2
VM XML: see attachment of this Comment.

Versions:
3.10.0-789.el7.x86_64
qemu-kvm-rhev-2.10.0-6.el7.x86_64
libvirt-3.9.0-2.el7.x86_64

Steps:
1. Boot VM with iommu, XML file is attached to this Comment.

2. Hot plug/unplug the PF network device:
# cat rhel7.5_nonrt_nic1.xml
 <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </hostdev>

# virsh attach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device attached successfully

# virsh detach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device detached successfully


3. Repeat step 2 several (20) times; the host, qemu, and the guest all keep working well.

Comment 7 yafu 2017-11-24 06:35:10 UTC
(In reply to Pei Zhang from comment #6)
> Created attachment 1357315 [details]
> VM xml file
> 
> Hi Jingjing, Yan,
> 
> Could you please share more details about your environments and xml info? As
> I can not reproduce this issue. Thanks.
> 
> 
> Below is my testing:
> 
> Network cards:X540-AT2
> VM XML: see attachment of this Comment.
> 
> Versions:
> 3.10.0-789.el7.x86_64
> qemu-kvm-rhev-2.10.0-6.el7.x86_64
> libvirt-3.9.0-2.el7.x86_64
> 
> Steps:
> 1. Boot VM with iommu, XML file is attached to this Comment.
> 
> 2. Hot plug/unplug the PF network device:
> # cat rhel7.5_nonrt_nic1.xml
>  <hostdev mode='subsystem' type='pci' managed='yes'>
>       <driver name='vfio'/>
>       <source>
>         <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
>       </source>
>       <address type='pci' domain='0x0000' bus='0x04' slot='0x00'
> function='0x0'/>
>     </hostdev>
> 
> # virsh attach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
> Device attached successfully
> 
> # virsh detach-device  rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
> Device detached successfully
> 
> 
> 3. Repeat step2 several (20) times, both host , qemu and guest work well.

The guest should enable iommu before attaching a hostdev device:
(1) Edit the file "/etc/default/grub":
Append "intel_iommu=on" or "amd_iommu=on" to the value of
"GRUB_CMDLINE_LINUX=......".

(2) Then run the following command to generate the updated grub file:
# grub2-mkconfig -o /boot/grub2/grub.cfg

(3) Reboot the guest.
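
The edit in step (1) can be sketched in Python for anyone automating the guest setup. This is only an illustration: the helper name `add_kernel_arg` is made up here, it works on the text of the file rather than touching "/etc/default/grub" itself, and it assumes the value of GRUB_CMDLINE_LINUX is a single double-quoted string.

```python
import re


def add_kernel_arg(grub_default: str, arg: str = "intel_iommu=on") -> str:
    """Return the text of /etc/default/grub with `arg` appended to
    GRUB_CMDLINE_LINUX, unless it is already present.

    Assumption: the value is one double-quoted string on one line.
    Whitespace between existing arguments is normalised to single spaces.
    """
    def patch(match):
        args = match.group(2).split()
        if arg not in args:
            args.append(arg)
        return match.group(1) + '"' + " ".join(args) + '"'

    # The '=' right after the name keeps GRUB_CMDLINE_LINUX_DEFAULT untouched.
    return re.sub(
        r'^(GRUB_CMDLINE_LINUX=)"([^"]*)"',
        patch,
        grub_default,
        flags=re.MULTILINE,
    )
```

The helper is idempotent, so running it twice leaves a single copy of the argument. Steps (2) and (3) are still needed afterwards for the change to take effect.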

Comment 8 yalzhang@redhat.com 2018-07-03 10:59:47 UTC
Created attachment 1456199 [details]
vmcore-dmesg.txt on host

I just hit this failure. Starting such a guest with a hostdev interface (VF) or a hostdev device (PF) also crashes the host, and it happens 100% of the time with the configuration below:

1. On host, configure L1 guest as below:
# virsh dumpxml q
....
 <os>
    <type arch='x86_64' machine='pc-q35-rhel7.5.0'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/q_VARS.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
    <smm state='on'/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='require' name='vmx'/>
  </cpu>
<devices>
...

  <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
...
</devices>
...

2. Start the L1 guest and enable "intel_iommu=on" on the guest kernel command line: append "intel_iommu=on" to the kernel arguments in /etc/default/grub, then regenerate the grub configuration:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

3. Destroy the L1 guest, add one hostdev interface, and start the guest; the host crashes.
# virsh dumpxml q | grep /interface -B6
    <interface type='hostdev' managed='yes'>
      <mac address='52:54:00:61:31:46'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x05' slot='0x02' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>

# virsh start q 
Domain q started

The host crashes here; the kdump file vmcore-dmesg.txt is attached.

Comment 9 yalzhang@redhat.com 2018-07-03 11:29:54 UTC
version for comment 8
# rpm -q libvirt qemu-kvm-rhev kernel
libvirt-4.4.0-2.el7.x86_64
qemu-kvm-rhev-2.12.0-6.el7.x86_64
kernel-3.10.0-907.el7.x86_64

Comment 10 Bandan Das 2018-08-07 19:25:13 UTC
(In reply to yalzhang from comment #8)
> Created attachment 1456199 [details]
> vmcore-dmesg.txt on host
> 
> I just encountered this failure, when start such a guest with hostdev
> interface(VF) or hostdev device(PF) will also crash the host, and it 100%
> happen with the configuration as below:
> 
Can you please post the qemu command line as well?

Comment 11 yalzhang@redhat.com 2018-08-10 07:01:15 UTC
Hi Bandan, I have tried several times but hit the issue in comment 8 only once, and I can't recall what was special about that run. The issue in comment 0, however, is 100% reproducible, and I found that it is related to the caching_mode='on' option.

1. Start a VM that has an iommu device without caching_mode='on' together with a hostdev device:
# virsh dumpxml domain2_ovmf
...
<iommu model='intel'/>
...
<interface type='hostdev' managed='yes'>
      <mac address='52:54:00:63:f3:41'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>
...
# virsh start domain2_ovmf
virsh Domain domain2_ovmf started

# virsh list 
 Id    Name                           State
----------------------------------------------------
 1     domain2_ovmf                   running

2. After a while, the domain crashes:
# virsh domstate domain2_ovmf --reason
shut off (crashed)

Check the log:
# cat  /var/log/libvirt/qemu/domain2_ovmf.log
...
2018-08-10T06:34:39.748618Z qemu-kvm: We need to set caching-mode=1 for intel-iommu to enable device assignment with IOMMU protection.
2018-08-10 06:34:39.994+0000: shutting down, reason=crashed

# grep error /var/log/libvirt/libvirtd.log
2018-08-10 06:34:39.792+0000: 4967: error : qemuMonitorIO:718 : internal error: End of file from qemu monitor

Expected results:
libvirt should report an error when we hotplug a hostdev device into a guest whose iommu device lacks caching_mode=on, and libvirt should refuse to start a guest that has a hostdev device and an iommu device lacking caching_mode=on.
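
To make the requested validation concrete, here is a minimal Python sketch of such a check run against a domain XML string. It is not libvirt code: the function name `hostdev_needs_caching_mode` is hypothetical, and the rule it encodes (an intel iommu needs `caching_mode='on'` once a hostdev device or hostdev interface is present) is taken only from the qemu log message quoted above.

```python
import xml.etree.ElementTree as ET


def hostdev_needs_caching_mode(domain_xml: str) -> bool:
    """Return True when the domain combines an assigned host device with
    an intel iommu that does not set caching_mode='on'.

    Hypothetical helper illustrating the check requested in this comment.
    """
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return False

    iommu = devices.find("iommu[@model='intel']")
    if iommu is None:
        return False

    # caching_mode lives on the optional <driver> child of <iommu>.
    driver = iommu.find("driver")
    caching_on = driver is not None and driver.get("caching_mode") == "on"

    # Both <hostdev> devices (PF) and hostdev-type interfaces (VF) count.
    has_hostdev = (
        devices.find("hostdev") is not None
        or devices.find("interface[@type='hostdev']") is not None
    )
    return has_hostdev and not caching_on
```

Feeding it the output of `virsh dumpxml <domain>` before starting the guest would flag the configuration from step 1, while the configuration from comment 8 (with `caching_mode='on'`) would pass.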

Comment 12 Bandan Das 2018-08-10 16:56:43 UTC
Thank you. I will try to reproduce comment 0 then. Can you please post your qemu command line? :) By command line, I mean the qemu invocation corresponding to your XML. You can just grep for the qemu process while the guest is running.

I am not running with an XML, but I am not able to reproduce this. I think taking a look at your qemu command line will help me figure out what I am doing differently.

Comment 13 yalzhang@redhat.com 2018-08-13 01:38:52 UTC
Sorry, the qemu command line is below:

# virsh start domain3_sea ; ps aux | grep domain3_sea | grep -v grep|sed 's/-device/\n-device/g' ; sleep 20; virsh domstate domain3_sea --reason
Domain domain3_sea started

qemu     31554 64.0 14.1 1913984 1139304 ?     SLl  09:35   0:00 /usr/libexec/qemu-kvm -name guest=domain3_sea,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-14-domain3_sea/master-key.aes -machine pc-q35-rhel7.6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,kernel_irqchip=split -cpu Nehalem-IBRS -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid d2b26621-7d34-4c98-a406-31b93e5cb0a8 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=27,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on 
-device intel-iommu 
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 
-device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e 
-device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0 
-device pcie-root-port,port=0x11,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x1 
-device pcie-root-port,port=0x12,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x2 
-device pcie-root-port,port=0x13,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x3 
-device pcie-root-port,port=0x14,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x4 
-device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d 
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 
-device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 -drive file=/var/lib/libvirt/images/RHEL-7.6-x86_64-latest.qcow2.1,format=qcow2,if=none,id=drive-virtio-disk0 
-device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 
-device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=29,server,nowait 
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent 
-device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 
-device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on 
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 
-device intel-hda,id=sound0,bus=pci.3,addr=0x1 
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir 
-device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir 
-device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 
-device vfio-pci,host=03:10.6,id=hostdev0,bus=pci.1,addr=0x0 
-device virtio-balloon-pci,id=balloon0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
shut off (crashed)
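
A command line like the one above can be checked mechanically for the combination discussed in comment 11. The following Python sketch is an illustration only: the helper names `device_args` and `iommu_blocks_vfio` are made up, and treating only the spelling `caching-mode=on` as "enabled" is an assumption (qemu may accept other boolean spellings that this sketch does not handle).

```python
import shlex
from typing import List


def device_args(cmdline: str) -> List[str]:
    """Return the argument that follows each '-device' option."""
    tokens = shlex.split(cmdline)
    return [tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == "-device"]


def iommu_blocks_vfio(cmdline: str) -> bool:
    """True when the command line has an intel-iommu device without
    caching-mode=on together with at least one vfio-pci device."""
    iommu_props = None
    has_vfio = False
    for dev in device_args(cmdline):
        driver, _, rest = dev.partition(",")
        # Turn 'a=1,b=2' into {'a': '1', 'b': '2'}; bare flags are skipped.
        props = dict(p.split("=", 1) for p in rest.split(",") if "=" in p)
        if driver == "intel-iommu":
            iommu_props = props
        elif driver == "vfio-pci":
            has_vfio = True
    if iommu_props is None or not has_vfio:
        return False
    # Assumption: only the 'on' spelling counts as enabled.
    return iommu_props.get("caching-mode") != "on"
```

Applied to the invocation in this comment, which contains `-device intel-iommu` with no properties and `-device vfio-pci,host=03:10.6,...`, the check would return True.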

Comment 15 yalzhang@redhat.com 2018-08-20 02:18:31 UTC
Hi Bandan, I have sent it; please check your mail. Let me know if you cannot access it.

