Bug 1758330 - open PCI config file in read-only mode when possible
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.1
Assignee: Ján Tomko
QA Contact: jiyan
URL:
Whiteboard:
Depends On:
Blocks: 1758964
 
Reported: 2019-10-03 20:46 UTC by Ihar Hrachyshka
Modified: 2020-02-04 18:29 UTC (History)

Fixed In Version: libvirt-5.6.0-10.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-04 18:28:50 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0404 0 None None None 2020-02-04 18:29:46 UTC

Description Ihar Hrachyshka 2019-10-03 20:46:57 UTC
Description of problem:

Currently, libvirt always opens PCI config files in write mode, which is a problem in environments where the daemon doesn't have write access to the files. This happens in CNV, where libvirt runs inside a Kubernetes pod that has the host /sys/ path mounted into its containers read-only.

In most cases, opening the file in write mode is not necessary; for example, reset requests for VFIO-registered devices are no-ops. But because libvirt always opens these files in write mode, KubeVirt has to mount the host /sys/devices path as writable into the virt-launcher (libvirt) pod to accommodate libvirt. This decision has security implications, and we would like to avoid granting write access to the /sys/devices subtree to pods that are under the control of users.
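
To illustrate why read-only access is usually enough, here is a minimal sketch that reads the vendor and device IDs from a PCI config file opened with O_RDONLY. The helper name pci_read_ids is hypothetical and is not part of libvirt; it only shows that plain reads of config space do not need write permission.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not libvirt code): read the 16-bit vendor and
 * device IDs from the first four bytes of a PCI config space file such
 * as /sys/bus/pci/devices/<address>/config.
 * O_RDONLY is sufficient for reading, so this also works when sysfs is
 * mounted read-only. Returns 0 on success, -1 on failure. */
static int
pci_read_ids(const char *path, uint16_t *vendor, uint16_t *device)
{
    unsigned char buf[4];
    ssize_t n;
    int fd;

    assert(path && vendor && device);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, buf, sizeof(buf), 0);
    close(fd);
    if (n != (ssize_t)sizeof(buf))
        return -1;

    /* PCI config space is little-endian. */
    *vendor = (uint16_t)(buf[0] | (buf[1] << 8));
    *device = (uint16_t)(buf[2] | (buf[3] << 8));
    return 0;
}
```

On the host used in the reproduction steps below, calling this on the PF's config file should yield vendor 0x8086 and device 0x10fb, the same IDs that "virsh nodedev-dumpxml" reports.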

A series of patches that implements this enhancement request has already been merged into the upstream libvirt tree and released as part of libvirt 5.7.0. Those are:

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 14:58:25 2019 +0200

    util: introduce virPCIDeviceConfigOpenInternal

    A thin wrapper to allow creating new functions.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:07:53 2019 +0200

    util: Introduce virPCIDeviceConfigOpenWrite

    Only a handful of function need write access to the PCI config
    space. Create a wrapper function for those so that we can
    open it read only by default.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:11:14 2019 +0200

    util: introduce readonly attribute to virPCIDeviceConfigOpenInternal

    Allow wrappers to open PCI config as read-only.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:14:05 2019 +0200

    util: introduce virPCIDeviceConfigOpenTry

    For callers that only need read-only access and don't want
    an error reported.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

commit e95f9459d3ae875d36df1699d919f0651b840109
Author: Ján Tomko <jtomko>
Date:   Tue Aug 13 15:17:44 2019 +0200

    util: default to read-only in virPCIDeviceConfigOpen

    All the callers left require virPCIDeviceConfigOpen to be fatal
    and only use read-only access to the config file.

    Signed-off-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>
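
The pattern these commits describe (one internal open function with read-only and error-reporting switches, plus thin wrappers) can be sketched roughly as follows. This is an illustration under assumptions, not the libvirt code: the cfg_* names, the parameter order, and the use of stderr for error reporting are all hypothetical stand-ins for the virPCIDeviceConfigOpen* functions and libvirt's own error handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical internal helper: open the config file read-only or
 * read-write, and report a failure only when 'fatal' is set.
 * Returns a file descriptor (caller closes it with close()) or -1. */
static int
cfg_open_internal(const char *path, bool readonly, bool fatal)
{
    int fd;

    assert(path != NULL);

    fd = open(path, readonly ? O_RDONLY : O_RDWR);
    if (fd < 0 && fatal) {
        int saved = errno;
        fprintf(stderr, "Failed to open config space file '%s': %s\n",
                path, strerror(saved));
        errno = saved; /* keep errno intact for the caller */
    }
    return fd;
}

/* Default: read-only access, failure is reported. */
static int
cfg_open(const char *path)
{
    return cfg_open_internal(path, true, true);
}

/* Read-only access for callers that do not want an error reported. */
static int
cfg_open_try(const char *path)
{
    return cfg_open_internal(path, true, false);
}

/* Read-write access for the few callers that modify config space. */
static int
cfg_open_write(const char *path)
{
    return cfg_open_internal(path, false, true);
}
```

With this split, only callers of the write variant can fail with "Read-only file system" when sysfs is mounted read-only; every other caller keeps working.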

I've tried libvirt 5.7.0, which includes these patches, with KubeVirt SR-IOV attached VMIs that use VFIO for SR-IOV VFs, and it seems to work.

Obviously, we can't just bump the libvirt version in RHEL. So this bug asks whether these patches can be backported to the RHEL libvirt version, so that in CNV we can remove the /sys/devices mount from the virt-launcher pod containers.

Comment 5 jiyan 2019-12-27 09:51:20 UTC
Reproduced this issue on libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64.

Version:
libvirt-5.6.0-9.module+el8.1.1+4955+f0b25565.x86_64
qemu-kvm-4.1.0-20.module+el8.1.1+5309+6d656f05.x86_64
kernel-4.18.0-147.3.1.el8_1.x86_64

Steps:
1. Configure the number of VFs and the VF driver while /sys is mounted read-write
# mount |grep sysfs
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)

# lspci |grep 82599
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

# echo 3 > /sys/devices/pci0000\:80/0000\:80\:02.0/0000\:82\:00.1/sriov_numvfs 

# echo "vfio-pci" > /sys/devices/pci0000\:80/0000\:80\:02.0/0000\:82\:00.1/virtfn0/driver_override

2. Check the VF info through "lspci" and "virsh nodedev-dumpxml"
# lspci |grep 82599
82:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
82:10.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
82:10.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
82:10.5 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

# virsh nodedev-dumpxml pci_0000_82_00_1 
<device>
  <name>pci_0000_82_00_1</name>
  <path>/sys/devices/pci0000:80/0000:80:02.0/0000:82:00.1</path>
  <parent>pci_0000_80_02_0</parent>
...
    <capability type='virt_functions' maxCount='63'>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x3'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x5'/>
    </capability>
...
</device>

3. Detach a VF that will be used for the attach operation
# virsh nodedev-detach pci_0000_82_10_1 
Device pci_0000_82_10_1 detached

4. Remount /sys read-only and restart libvirtd
# mount /sys -o remount,ro

# mount | grep sysfs
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)

# systemctl restart libvirtd

5. Check the PF through "virsh nodedev-dumpxml" ==> Failed
# virsh nodedev-dumpxml pci_0000_82_00_1 
error: Could not find matching device 'pci_0000_82_00_1'
error: Node device not found: no node device with matching name 'pci_0000_82_00_1'

6. Prepare a running VM and a VF XML file, then try to attach the VF to the VM ==> Failed
# virsh domstate test811 
running

# cat vf.xml 
<interface type='hostdev'>
  <source>
    <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
  </source>
  <target dev='test'/>
  <mac address='52:54:00:98:c4:a8'/>
  <model type='virtio'/>
</interface>

# virsh attach-device test811 vf.xml 
error: Failed to attach device from vf.xml
error: Failed to open config space file '/sys/bus/pci/devices/0000:82:10.1/config': Read-only file system
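
The failure above depends on /sys being mounted read-only. As a small aside, a program can check for that condition itself with statvfs(); the sketch below assumes a hypothetical helper name, fs_is_readonly, and is not something libvirt is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>

/* Hypothetical helper: returns 1 if the filesystem containing 'path'
 * is mounted read-only, 0 if it is mounted read-write, and -1 if
 * statvfs() fails (for example, when the path does not exist). */
static int
fs_is_readonly(const char *path)
{
    struct statvfs st;

    assert(path != NULL);

    if (statvfs(path, &st) < 0)
        return -1;

    return (st.f_flag & ST_RDONLY) ? 1 : 0;
}
```

After the remount in step 4, fs_is_readonly("/sys") would be expected to return 1, which corresponds to the "ro" flag shown by "mount | grep sysfs".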



Verified this bug on libvirt-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64
7. Update libvirt to libvirt-5.6.0-10 and restart libvirtd.
# yum update libvirt* -y

# rpm -qa libvirt
libvirt-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64

# systemctl restart libvirtd

# mount |grep sysfs
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime,seclabel)

8. Check the PF through "virsh nodedev-dumpxml" ==> Succeeded, without the error seen in step 5
# virsh nodedev-dumpxml pci_0000_82_00_1
<device>
  <name>pci_0000_82_00_1</name>
  <path>/sys/devices/pci0000:80/0000:80:02.0/0000:82:00.1</path>
  <parent>pci_0000_80_02_0</parent>
...
    <product id='0x10fb'>82599ES 10-Gigabit SFI/SFP+ Network Connection</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <capability type='virt_functions' maxCount='63'>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x3'/>
      <address domain='0x0000' bus='0x82' slot='0x10' function='0x5'/>
    </capability>
...
</device>

9. Attach the VF to the VM again and check the related info ==> Succeeded, without the error seen in step 6
# virsh attach-device test811 vf.xml 
Device attached successfully

# virsh dumpxml test811 |grep "<interface" -A10
...
    <interface type='hostdev'>
      <mac address='52:54:00:98:c4:a8'/>
      <driver name='vfio'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x1'/>
      </source>
      <target dev='test'/>
      <model type='virtio'/>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </interface>

# virsh console test811 
Connected to domain test811
Escape character is ^]

Red Hat Enterprise Linux 8.1 (Ootpa)
Kernel 4.18.0-147.el8.x86_64 on an x86_64

localhost login: root
Password: 
Last login: Fri Dec 27 14:28:53 on ttyS0
[root@localhost ~]# lspci |grep Eth
...
07:00.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

Comment 6 jiyan 2020-01-02 08:07:49 UTC
Triggered the related automated jobs; no problems were found.
Marking this bug as verified.

Comment 8 errata-xmlrpc 2020-02-04 18:28:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0404

