Bug 1406837
| Summary: | Regression using vfio with mount namespaces enabled | | |
|---|---|---|---|
| Product: | [Community] Virtualization Tools | Reporter: | sL1pKn07 <sl1pkn07> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED NEXTRELEASE | | |
| Severity: | unspecified | Priority: | unspecified |
| Version: | unspecified | CC: | berrange, dyuan, jishao, libvirt-maint, rbalakri, yanqzhan |
| Hardware: | Unspecified | OS: | Unspecified |
| Last Closed: | 2017-01-04 14:43:13 UTC | Type: | Bug |
Description
sL1pKn07
2016-12-21 15:29:49 UTC
Ah, found the root cause. You are trying to assign two PCI devices which fall into the same IOMMU group. So while the /dev/vfio/X entry is created for the first one, trying to do so for the second device fails as the path already exists. What's worse is that this can happen with other combinations of devices, e.g. an RNG or chardev with a /dev/null backend, as /dev/null is created regardless of domain configuration. For instance, the following XML fails too:

    <rng model='virtio'>
      <backend model='random'>/dev/null</backend>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </rng>

Weirdly, if this were hotplug you wouldn't see any error, as EEXIST is correctly handled there.

Patches posted on the upstream list:

https://www.redhat.com/archives/libvir-list/2017-January/msg00073.html

I've just pushed the patch upstream:

    commit 3aae99fe71ccee523bafeb54ebd0338eeed66868
    Author:     Michal Privoznik <mprivozn>
    AuthorDate: Wed Jan 4 13:57:06 2017 +0100
    Commit:     Michal Privoznik <mprivozn>
    CommitDate: Wed Jan 4 15:36:42 2017 +0100

        qemu: Handle EEXIST gracefully in qemuDomainCreateDevice

        https://bugzilla.redhat.com/show_bug.cgi?id=1406837

        Imagine you have a domain configured in such way that you are
        assigning two PCI devices that fall into the same IOMMU group.
        With mount namespace enabled what happens is that for the first
        PCI device corresponding /dev/vfio/X entry is created and when
        the code tries to do the same for the second mknod() fails as
        /dev/vfio/X already exists:

        2016-12-21 14:40:45.648+0000: 24681: error :
        qemuProcessReportLogError:1792 : internal error: Process exited
        prior to exec: libvirt: QEMU Driver error : Failed to make device
        /var/run/libvirt/qemu/windoze.dev//vfio/22: File exists

        Worse, by default there are some devices that are created in the
        namespace regardless of domain configuration (e.g. /dev/null,
        /dev/urandom, etc.). If one of them is set as backend for some
        guest device (e.g. rng, chardev, etc.) it's the same story as
        described above.

        Weirdly, in attach code this is already handled.

        Signed-off-by: Michal Privoznik <mprivozn>

v2.5.0-291-g3aae99fe71 seems to be working now (before edit the file qemu.conf for revert the changes in the namespaces part).

But the directory /var/run/libvirt/qemu/foo.dev/ and the others are empty when the VM is launched.

Is that normal?

Greetings

(In reply to sL1pKn07 from comment #4)
> seems working now (before edit the file qemu.conf for revert the changes in
> the namespaces part)

Cool.

> but the directory /var/run/libvirt/qemu/foo.dev/ and others is empty when
> the VM is launched
>
> is normal?

Yes. That is normal. Those dirs serve as a temporary location to which the real /dev/* mount points are moved. The idea is that we want the /dev/* mount points to be shared with the parent namespace, so that this namespace is transparent to other applications. For instance, qemu creates /dev/pts/NNN for guest consoles. If /dev/pts/ were not preserved, applications would have no way of attaching to the console. It's the same story with the other mount points there. So just before building the new /dev, all /dev/* mount points are moved to the /var/run/libvirt/qemu/foo.* locations, and they are moved back from there right after the /dev building is completed.

Now that I am writing these lines, I *think* those temp dirs can be safely removed once the /dev building is completed. That's out of scope for this bug, but I'll look into it.