Bug 1441605
Summary: qemu crash when attach a hostdev device to the guest with intel-iommu device enabled
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: unspecified
Priority: unspecified
Reporter: yafu <yafu>
Assignee: Bandan Das <bdas>
QA Contact: Pei Zhang <pezhang>
CC: ailan, alex.williamson, chayang, dyuan, jinzhao, juzhang, knoel, peterx, siliu, virt-maint, yafu, yalzhang, yfu, zpeng
Target Milestone: rc
Target Release: ---
Last Closed: 2019-07-22 20:29:20 UTC
Type: Bug
Bug Depends On: 1622209
Bug Blocks: 1473046
Description
yafu
2017-04-12 10:25:27 UTC
Update test result
libvirt-3.2.0-5.el7.x86_64
qemu-kvm-rhev-2.9.0-4.el7.x86_64

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 17    q35-js                         running

# cat vf.xml
<interface type='hostdev' managed='yes'>
<mac address='02:24:6b:89:bc:e9'/>
<source>
<address type='pci' domain='0x0000' bus='0x86' slot='0x10' function='0x1'/>
</source>
</interface>

# virsh attach-device q35-js vf.xml
Device attached successfully

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     q35-js                         shut off

# virsh domstate q35-js --reason
shut off (crashed)

Created attachment 1357315 [details]
VM xml file
Hi Jingjing, Yan,
Could you please share more details about your environment and XML info? I cannot reproduce this issue. Thanks.
Below is my testing:
Network cards:X540-AT2
VM XML: see attachment of this Comment.
Versions:
3.10.0-789.el7.x86_64
qemu-kvm-rhev-2.10.0-6.el7.x86_64
libvirt-3.9.0-2.el7.x86_64
Steps:
1. Boot VM with iommu, XML file is attached to this Comment.
2. Hot plug/unplug the PF network device:
# cat rhel7.5_nonrt_nic1.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
</source>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</hostdev>
# virsh attach-device rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device attached successfully
# virsh detach-device rhel7.5_nonrt rhel7.5_nonrt_nic1.xml
Device detached successfully
3. Repeat step 2 several (20) times; host, qemu and guest all work well.
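The repeated hot plug/unplug in steps 2 and 3 can be scripted. The following is only a sketch: the function name `hotplug_cycle`, the default of 20 iterations, and the pause between operations are illustrative assumptions, not part of the original test. It stops as soon as `virsh domstate --reason` no longer reports the domain as running, which is how a qemu crash would show up.

```shell
# Sketch: repeat attach/detach of one device and stop if the domain dies.
# hotplug_cycle DOMAIN DEVICE_XML [COUNT] [PAUSE_SECONDS]
# The function name and the default count/pause are assumptions for illustration.
hotplug_cycle() {
    local domain="$1" xml="$2" count="${3:-20}" pause="${4:-5}"
    local i=1 state
    while [ "$i" -le "$count" ]; do
        virsh attach-device "$domain" "$xml" || return 1
        sleep "$pause"
        virsh detach-device "$domain" "$xml" || return 1
        sleep "$pause"
        # A crashed qemu shows up as "shut off (crashed)" here.
        state=$(virsh domstate "$domain" --reason)
        case "$state" in
            running*) ;;
            *) echo "iteration $i: domain state is '$state'" >&2; return 1 ;;
        esac
        i=$((i + 1))
    done
    echo "completed $count cycles"
}
```

With the names from this comment, it would be called as `hotplug_cycle rhel7.5_nonrt rhel7.5_nonrt_nic1.xml 20`.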
(In reply to Pei Zhang from comment #6)
> Could you please share more details about your environments and xml info? As
> I can not reproduce this issue. Thanks.

The guest should enable iommu before attaching a hostdev device:
(1) Edit file "/etc/default/grub": append "intel_iommu=on" or "amd_iommu=on" to the value of "GRUB_CMDLINE_LINUX=......".
(2) Then run the following command to generate the updated grub file:
# grub2-mkconfig -o /boot/grub2/grub.cfg
(3) Reboot the guest.

Created attachment 1456199 [details]
vmcore-dmesg.txt on host
I just encountered this failure: starting such a guest with a hostdev interface (VF) or hostdev device (PF) also crashes the host, and it happens 100% of the time with the configuration below:
1. On host, configure L1 guest as below:
# virsh dumpxml q
....
<os>
<type arch='x86_64' machine='pc-q35-rhel7.5.0'>hvm</type>
<loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.secboot.fd</loader>
<nvram>/var/lib/libvirt/qemu/nvram/q_VARS.fd</nvram>
</os>
<features>
<acpi/>
<apic/>
<vmport state='off'/>
<smm state='on'/>
<ioapic driver='qemu'/>
</features>
<cpu mode='host-passthrough' check='none'>
<feature policy='require' name='vmx'/>
</cpu>
<devices>
...
<iommu model='intel'>
<driver intremap='on' caching_mode='on' iotlb='on'/>
</iommu>
...
</devices>
...
2. Start the L1 guest and enable "intel_iommu=on" on the guest kernel cmdline: append "intel_iommu=on" to /etc/default/grub, then regenerate the grub config:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
3. Destroy the L1 guest, attach one hostdev interface, and start the guest; the host will crash
# virsh dumpxml q | grep /interface -B6
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:61:31:46'/>
<source>
<address type='pci' domain='0x0000' bus='0x05' slot='0x02' function='0x0'/>
</source>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</interface>
# virsh start q
Domain q started
The host crashes here; the kdump file vmcore-dmesg.txt is attached.
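Both this comment and the earlier one enable the guest IOMMU by editing /etc/default/grub by hand. A minimal sketch of that edit as a shell function follows. The name `append_kernel_arg` is an assumption, and the function expects GRUB_CMDLINE_LINUX to be a single double-quoted line and the argument to contain no "/" characters.

```shell
# Sketch: append a kernel argument to GRUB_CMDLINE_LINUX unless it is already present.
# append_kernel_arg FILE ARG
# Assumes one double-quoted GRUB_CMDLINE_LINUX="..." line and an ARG without "/".
append_kernel_arg() {
    local file="$1" arg="$2"
    # Already present (preceded and followed by a quote or space): nothing to do.
    if grep -q "^GRUB_CMDLINE_LINUX=.*[\" ]${arg}[\" ]" "$file"; then
        return 0
    fi
    # Insert the argument just before the closing quote, via a temporary file.
    sed "s/^\(GRUB_CMDLINE_LINUX=\".*\)\"\$/\1 ${arg}\"/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Inside the guest this would be `append_kernel_arg /etc/default/grub intel_iommu=on`, followed by the grub2-mkconfig command shown in the comments (output to /boot/grub2/grub.cfg, or /boot/efi/EFI/redhat/grub.cfg on the OVMF guest) and a reboot.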
version for comment 8
# rpm -q libvirt qemu-kvm-rhev kernel
libvirt-4.4.0-2.el7.x86_64
qemu-kvm-rhev-2.12.0-6.el7.x86_64
kernel-3.10.0-907.el7.x86_64

(In reply to yalzhang from comment #8)
> Created attachment 1456199 [details]
> vmcore-dmesg.txt on host
>
> I just encountered this failure, when start such a guest with hostdev
> interface(VF) or hostdev device(PF) will also crash the host, and it 100%
> happen with the configuration as below:
>
Can you please post the qemu cmd line as well ?

Hi Bandan,
I have tried several times but only encountered the issue in comment 8 once, and I can't recall what was special. The issue in comment 0, however, is 100% reproducible, and I found it is related to the caching_mode='on' option.

1. Start the vm with an iommu without caching_mode='on', together with a hostdev device
# virsh dumpxml domain2_ovmf
...
<iommu model='intel'/>
...
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:63:f3:41'/>
<source>
<address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x4'/>
</source>
<address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
</interface>
...
# virsh start domain2_ovmf
Domain domain2_ovmf started

# virsh list
 Id    Name                           State
----------------------------------------------------
 1     domain2_ovmf                   running

2. After a while, the domain crashes
# virsh domstate domain2_ovmf --reason
shut off (crashed)

Check the log:
# cat /var/log/libvirt/qemu/domain2_ovmf.log
...
2018-08-10T06:34:39.748618Z qemu-kvm: We need to set caching-mode=1 for intel-iommu to enable device assignment with IOMMU protection.
2018-08-10 06:34:39.994+0000: shutting down, reason=crashed

# grep error /var/log/libvirt/libvirtd.log
2018-08-10 06:34:39.792+0000: 4967: error : qemuMonitorIO:718 : internal error: End of file from qemu monitor

Expected results:
libvirt should report an error when we hotplug a hostdev device while there is an iommu device without caching_mode=on; and libvirt should refuse to start a guest with a hostdev device and an iommu device lacking caching_mode=on.

Thank you. I will try to reproduce comment 0 then. Can you please post your qemu command line ? :)

By command line, I mean the qemu invocation corresponding to your XML. You can just grep for the qemu process when you have the guest running. I am not running with an XML, but I am not able to reproduce this. I think taking a look at your qemu cmd line will help me figure out what I am doing differently.

Sorry, the qemu cmd is as below:
# virsh start domain3_sea ; ps aux | grep domain3_sea | grep -v grep|sed 's/-device/\n-device/g' ; sleep 20; virsh domstate domain3_sea --reason
Domain domain3_sea started
qemu 31554 64.0 14.1 1913984 1139304 ?
SLl 09:35 0:00 /usr/libexec/qemu-kvm -name guest=domain3_sea,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-14-domain3_sea/master-key.aes -machine pc-q35-rhel7.6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,kernel_irqchip=split -cpu Nehalem-IBRS -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid d2b26621-7d34-4c98-a406-31b93e5cb0a8 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=27,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on
-device intel-iommu
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
-device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e
-device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0
-device pcie-root-port,port=0x11,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x1
-device pcie-root-port,port=0x12,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x2
-device pcie-root-port,port=0x13,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x3
-device pcie-root-port,port=0x14,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x4
-device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 -drive file=/var/lib/libvirt/images/RHEL-7.6-x86_64-latest.qcow2.1,format=qcow2,if=none,id=drive-virtio-disk0
-device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0
-device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=29,server,nowait
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
-device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
-device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1
-device intel-hda,id=sound0,bus=pci.3,addr=0x1
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir
-device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir
-device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3
-device vfio-pci,host=03:10.6,id=hostdev0,bus=pci.1,addr=0x0
-device virtio-balloon-pci,id=balloon0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
shut off (crashed)

Hi Bandan, I have sent it, please check the mail. Let me know if you can not access it.
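
Until libvirt rejects this configuration itself, the condition named in the expected results (an intel iommu without caching_mode='on' combined with an assigned device) can be checked by hand before starting a guest. The sketch below is a plain text match on `virsh dumpxml` output, not a libvirt feature. The function name `check_iommu_caching_mode` is an assumption, and it relies on the single-quoted attribute style seen in the XML snippets in this bug.

```shell
# Sketch: read domain XML on stdin and fail when an intel iommu lacks
# caching_mode='on' while a hostdev device or hostdev interface is present.
# Plain grep on the XML text; the function name is an assumption.
check_iommu_caching_mode() {
    local xml
    xml=$(cat)
    # No intel iommu device: nothing to check.
    printf '%s\n' "$xml" | grep -q "<iommu model='intel'" || return 0
    # No assigned device (PF hostdev or VF interface): nothing to check.
    printf '%s\n' "$xml" | grep -Eq "<hostdev |<interface type='hostdev'" || return 0
    # Assigned device plus intel iommu: caching_mode='on' must be set.
    if printf '%s\n' "$xml" | grep -q "caching_mode='on'"; then
        return 0
    fi
    echo "intel iommu without caching_mode='on' and an assigned device: qemu is expected to exit" >&2
    return 1
}
```

For example, `virsh dumpxml domain2_ovmf | check_iommu_caching_mode` should fail for the configuration in the reproducer above. For the hotplug case the device is not yet in the domain XML, so only the iommu part of the check applies there.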