Description of problem:

If caching mode isn't specified with -device intel-iommu and the user attempts to assign a host device, the guest will crash, since QEMU aborts in such cases. As an example:

XML snippet:

...
<iommu model='intel'>
  <driver intremap='on' iotlb='on'/>
</iommu>
...

hostdev XML:

# cat pci.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
  </source>
</hostdev>

Attach the device:

# virsh attach-device rhel7.3 pci.xml

The guest will crash with the following message in the QEMU log:

Device at bus pci.8 addr 00.0 requires iommu notifier which is currently not supported by intel-iommu emulation
2017-04-12 03:07:26.093+0000: shutting down, reason=crashed

It would be helpful if libvirt checked for the absence of caching-mode in the XML and reported a meaningful error to the user when VFIO device assignment is used.
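For comparison, a sketch of a domain XML snippet that avoids the abort, assuming a libvirt version whose <iommu> <driver> element supports the caching_mode attribute:

```xml
<!-- Sketch of a working configuration: caching_mode='on' makes the
     emulated intel-iommu compatible with VFIO device assignment.
     Attribute availability depends on the libvirt version in use. -->
<iommu model='intel'>
  <driver intremap='on' caching_mode='on' iotlb='on'/>
</iommu>
```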
The error message specifically says:

  Device at bus pci.8 addr 00.0 requires iommu notifier which is *currently* not supported by intel-iommu emulation

So it gives the impression that this can be implemented in the future.

To avoid the embarrassment of bug 1433994:
Can this configuration possibly work in the future?
(In reply to Ján Tomko from comment #2)
> The error message specifically says:
> Device at bus pci.8 addr 00.0 requires iommu notifier which is *currently*
> not supported by intel-iommu emulation

This is strange, since at least upstream and RHEL7.6/RHEL8 should post an error like this (which is much clearer, and this error has been there for a long time):

    if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) {
        error_report("We need to set caching-mode=1 for intel-iommu to enable "
                     "device assignment with IOMMU protection.");
        exit(1);
    }

I don't know where that "... currently not supported ..." message comes from...

> So it gives the impression that this can be implemented in the future.
>
> To avoid the embarrassment of bug 1433994:
> Can this configuration possibly work in the future?

As the (latest) error message says, it will never work without caching-mode=on.

Thanks,
Peter
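The check quoted above corresponds to the caching-mode property of the emulated intel-iommu device. As a sketch of the working combination on the QEMU command line (host PCI address and machine options are illustrative placeholders; intremap=on also requires a split irqchip):

```
qemu-system-x86_64 \
    -machine q35,accel=kvm,kernel-irqchip=split \
    -device intel-iommu,intremap=on,caching-mode=on \
    -device vfio-pci,host=0000:5e:00.0
```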
Yes, regarding bug 1441605, I don't think comment 1 is relevant anymore. Comment 11 is, and it shows the actual error message.
Hi Bandan,

Do we need to file a bz against this issue on the qemu component? QEMU crashes when hot-plugging a vfio device without caching-mode=on:

(qemu) device_add vfio-pci,host=0000:5e:00.0,bus=pci.5
We need to set caching-mode=1 for intel-iommu to enable device assignment with IOMMU protection.

In my understanding, vfio needs caching-mode=on, so this is not a valid usage; however, QEMU should not quit, and a warning would be friendlier. What do you think?

Thank you.

Best regards,
Pei
(In reply to Pei Zhang from comment #6)
> Hi Bandan,
>
> Do we need to file a bz against this issue upon qemu component? As qemu
> crash when hot plug a vfio device without caching-mode=on.
>
> (qemu) device_add vfio-pci,host=0000:5e:00.0,bus=pci.5
> We need to set caching-mode=1 for intel-iommu to enable device assignment
> with IOMMU protection.
>
> In my understanding, vfio needs caching-mode=on, so this is not a valid
> usage, however qemu should not quit, perhaps some warning info is more
> friendly. What do you think?

We had bug 1441605 against rhel7 that was closed. I am not sure there's an easy way to return an error message to the monitor; the best option seems to be for libvirt to parse the input and return an error.
(In reply to Bandan Das from comment #7)
[...]
> We had bug 1441605 against rhel7 that was closed. I am not sure there's an
> easy way to return back an error message to the monitor and the best option
> seems to be for libvirt to parse the input and return an error.

Hi Bandan,

After more testing, we found that QEMU behaves inconsistently without "caching-mode=on" across different scenarios, so we filed a new bug to track it:

Bug 1738440 - For intel-iommu, qemu shows conflict behaviors between booting a guest with vfio and hot plugging vfio device

Thank you.

Best regards,
Pei
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.