Description of problem:

Currently, libvirt always opens PCI config files in write mode, which is a problem for environments where the daemon doesn't have write access to the files. This happens in CNV, where libvirt runs inside a Kubernetes pod that has the host /sys/ path mounted into its containers as read-only. In most cases, opening the file in write mode is not necessary, e.g. reset requests for VFIO registered devices are no-ops. But regardless, because libvirt always opens these files in write mode, kubevirt has to mount the host /sys/devices path into the virt-launcher (libvirt) pod with write access to accommodate libvirt. This decision has security implications, and we would like to avoid granting write access to the /sys/devices subtree to these pods, which are under the control of users.

There is a series of patches already merged in the upstream libvirt tree and released as part of the 5.7.0 release that implements the enhancement request. Those are:

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 14:58:25 2019 +0200

    util: introduce virPCIDeviceConfigOpenInternal

    A thin wrapper to allow creating new functions.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:07:53 2019 +0200

    util: Introduce virPCIDeviceConfigOpenWrite

    Only a handful of function need write access to the PCI config space.
    Create a wrapper function for those so that we can open it read only
    by default.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:11:14 2019 +0200

    util: introduce readonly attribute to virPCIDeviceConfigOpenInternal

    Allow wrappers to open PCI config as read-only.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:14:05 2019 +0200

    util: introduce virPCIDeviceConfigOpenTry

    For callers that only need read-only access and don't want
    an error reported.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

commit e95f9459d3ae875d36df1699d919f0651b840109
Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:17:44 2019 +0200

    util: default to read-only in virPCIDeviceConfigOpen

    All the callers left require virPCIDeviceConfigOpen to be fatal
    and only use read-only access to the config file.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

I've tried libvirt 5.7.0, which includes these patches, with kubevirt SR-IOV attached VMIs that use VFIO for SR-IOV VFs, and it seems to work.

Obviously, we can't just bump the libvirt version in RHEL. So this bug is to ask if we can get these patches backported into the RHEL libvirt version, so that in CNV we could remove the /sys/devices mount from virt-launcher pod containers.
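For readers unfamiliar with the series, the pattern it describes (read-only by default, write access only through a dedicated wrapper, plus a non-fatal variant) can be sketched roughly as follows. This is an illustration in Python, not libvirt's C code: the names open_config, open_config_write and open_config_try are hypothetical stand-ins for the wrappers named in the commit subjects, and the error text only imitates the message seen in the failure case.

```python
import os


def open_config(path, readonly=True, fatal=True):
    """Open a PCI config file; hypothetical analogue of the internal wrapper.

    Returns a file descriptor. If the open fails, raises OSError when
    fatal is True and returns None otherwise.
    """
    # Read-only is the default; O_RDWR is requested only on demand.
    flags = os.O_RDONLY if readonly else os.O_RDWR
    try:
        return os.open(path, flags)
    except OSError as err:
        if fatal:
            message = "Failed to open config space file '%s': %s" % (
                path, os.strerror(err.errno))
            raise OSError(err.errno, message) from err
        return None


def open_config_write(path):
    """For the handful of callers that really need to write (assumed name)."""
    return open_config(path, readonly=False, fatal=True)


def open_config_try(path):
    """Read-only open that reports no error on failure (assumed name)."""
    return open_config(path, readonly=True, fatal=False)
```

With this shape, a read-only /sys only affects callers that go through the write wrapper; every other caller keeps working because it never asks for write access.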
Reproduced this issue on libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64.

Version:
libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64
qemu-kvm-4.1.0-20.module+el8.1.1+5309+6d656f05.x86_64
kernel-4.18.0-147.3.1.el8_1.x86_64

Steps:
1. Configure the number of VFs and the VF driver while /sys is mounted read-write
# mount |grep sysfs
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)

# lspci |grep 82599
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

# echo 3 > /sys/devices/pci0000\:80/0000\:80\:02.0/0000\:82\:00.1/sriov_numvfs
# echo "vfio-pci" > /sys/devices/pci0000\:80/0000\:80\:02.0/0000\:82\:00.1/virtfn0/driver_override

2. Check the VF info through "lspci" and "virsh nodedev-dumpxml"
# lspci |grep 82599
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
82:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
82:10.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

# virsh nodedev-dumpxml pci_0000_82_00_1
<device>
  <name>pci_0000_82_00_1</name>
  <path>/sys/devices/pci0000:80/0000:80:02.0/0000:82:00.1</path>
  <parent>pci_0000_80_02_0</parent>
  ...
    <capability type='virt_functions' maxCount='63'>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x3'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x5'/>
    </capability>
  ...
</device>

3. Detach the VF that will be used for the attach operation
# virsh nodedev-detach pci_0000_82_10_1
Device pci_0000_82_10_1 detached

4. Remount /sys read-only and restart libvirtd
# mount /sys -o remount,ro
# mount | grep sysfs
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)

# systemctl restart libvirtd

5. Check the PF XML through "virsh nodedev-dumpxml" ==> Failed
# virsh nodedev-dumpxml pci_0000_82_00_1
error: Could not find matching device 'pci_0000_82_00_1'
error: Node device not found: no node device with matching name 'pci_0000_82_00_1'

6. Prepare a running VM and the VF XML, then try to attach the VF to the VM ==> Failed
# virsh domstate test811
running

# cat vf.xml
<interface type='hostdev'>
  <source>
    <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
  </source>
  <target dev='test'/>
  <mac address='52:54:00:98:c4:a8'/>
  <model type='virtio'/>
</interface>

# virsh attach-device test811 vf.xml
error: Failed to attach device from vf.xml
error: Failed to open config space file '/sys/bus/pci/devices/0000:82:10.1/config': Read-only file system

Verified this bug on libvirt-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64

7. Update libvirt to libvirt-5.6.0-10 and restart libvirtd.
# yum update libvirt* -y
# rpm -qa libvirt
libvirt-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64

# systemctl restart libvirtd
# mount |grep sysfs
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)

8. Check the PF XML through "virsh nodedev-dumpxml" ==> Succeeded, without the error from step 5
# virsh nodedev-dumpxml pci_0000_82_00_1
<device>
  <name>pci_0000_82_00_1</name>
  <path>/sys/devices/pci0000:80/0000:80:02.0/0000:82:00.1</path>
  <parent>pci_0000_80_02_0</parent>
  ...
    <product id='0x10fb'>82599ES 10-Gigabit SFI/SFP+ Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <capability type='virt_functions' maxCount='63'>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x3'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x5'/>
    </capability>
  ...
</device>

9. Attach the VF to the VM again and check the related info ==> Succeeded, without the error from step 6
# virsh attach-device test811 vf.xml
Device attached successfully

# virsh dumpxml test811 |grep "<interface" -A10
...
    <interface type='hostdev'>
      <mac address='52:54:00:98:c4:a8'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      </source>
      <target dev='test'/>
      <model type='virtio'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>

# virsh console test811
Connected to domain test811
Escape character is ^]

Red Hat Enterprise Linux 8.1 (Ootpa)
Kernel 4.18.0-147.el8.x86_64 on an x86_64

localhost login: root
Password:
Last login: Fri Dec 27 14:28:53 on ttyS0
[root@localhost ~]# lspci |grep Eth
...
07:00.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
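When scripting the steps above, the "mount | grep sysfs" check can be automated with a small helper like the one below. It is a minimal sketch that assumes mount prints lines in the "SOURCE on TARGET type FSTYPE (OPTIONS)" form shown in this report; mount_options and is_read_only are hypothetical helper names, not part of libvirt or util-linux.

```python
import re

# Matches lines such as:
#   sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)
MOUNT_LINE = re.compile(
    r"^(?P<source>\S+) on (?P<target>\S+) type (?P<fstype>\S+) "
    r"\((?P<options>[^)]*)\)$"
)


def mount_options(mount_output, target):
    """Return the option set for `target`, or None if it is not listed.

    If the target appears more than once, the last entry is used.
    """
    options = None
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match and match.group("target") == target:
            options = set(match.group("options").split(","))
    return options


def is_read_only(mount_output, target="/sys"):
    """Tell whether `target` is mounted with the "ro" option."""
    options = mount_options(mount_output, target)
    if options is None:
        raise LookupError("no mount entry found for %s" % target)
    return "ro" in options
```

Feeding it the output captured in step 4 or step 7 should report a read-only /sys, which confirms that the attach in step 9 was really exercised against a read-only sysfs.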
Triggered related auto jobs, and no problems found. Mark this bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0404