Description of problem:
OvS 2.8 by default runs with user openvswitch:openvswitch.
For DPDK, we need to run it as openvswitch:hugetlbfs.
* Fresh installation of ovs2.8 package will create these users and groups
* Need to configure OVS_USER_ID as openvswitch:hugetlbfs in /etc/sysconfig/openvswitch
* Ensure right permissions for /etc/sysconfig/openvswitch
* Ensure right permissions for /var/lib/vhost_sockets with SELinux labels to the new users
* Upgrade - The users will not be modified by default on installing the package during upgrade/update. Need to conclude on what needs to be done.
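A minimal sketch for checking the configured value on a node. The helper name get_ovs_user_id is hypothetical and used only for illustration; the file path and the OVS_USER_ID variable are the ones named above.

```shell
# get_ovs_user_id is a hypothetical helper, not part of the openvswitch package.
# Print the OVS_USER_ID value set in a sysconfig-style file.
get_ovs_user_id() {
    conf_file="${1:-/etc/sysconfig/openvswitch}"
    # Take the last uncommented assignment, drop the key and any quotes.
    sed -n 's/^[[:space:]]*OVS_USER_ID=//p' "$conf_file" | tail -n 1 | tr -d "\"'"
}

# Line expected in /etc/sysconfig/openvswitch for DPDK:
#   OVS_USER_ID="openvswitch:hugetlbfs"
# Usage sketch:
#   [ "$(get_ovs_user_id)" = "openvswitch:hugetlbfs" ] || echo "OVS_USER_ID not set for DPDK"
```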
Updating with the outcome of discussions with Aaron Conole (ovs team), for others' benefit.
* For the RHEL family, since the openvswitch package always has DPDK enabled, the default ownership for ovs will be "openvswitch:hugetlbfs" (irrespective of whether DPDK is enabled at runtime or not)
* Fresh ovs2.8 install and start will ensure that ovs is running with "openvswitch:hugetlbfs" ownership. But during the upgrade, there will not be any changes specific to ownership applied.
* IOMMU group devices are created with group "hugetlbfs" using udev rules, which are applied only on fresh device creation. Not applicable for upgrade.
* Since ovs is not forcing a change of ownership during upgrade, it is up to the user (tripleo) to decide whether it is required or not. If the new ownership is not applied in ovs and qemu, then the ovs service file patches for the vhost permission issue need to be continued during upgrade, like .
* But if we decide to modify the permissions during the upgrade, the following is the list of changes that are required:
useradd -r -d / -s /sbin/nologin -c "Open vSwitch Daemons" openvswitch
usermod -a -G hugetlbfs openvswitch
# the above cmds are not required if this is taken care of in ovs; still under discussion
chown openvswitch:hugetlbfs -R /etc/openvswitch
chown openvswitch:hugetlbfs -R /var/log/openvswitch
chown openvswitch:hugetlbfs -R /var/lib/vhost_sockets
chgrp -R hugetlbfs /dev/vfio
With the above changes, I have verified it manually, along with, of course, the SELinux permission changes for /var/lib/vhost_sockets.
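The manual verification could be sketched roughly as below. The helper name check_owner is hypothetical, and the sketch assumes GNU stat (coreutils) is available, as on RHEL. It only checks owner and group, not SELinux labels.

```shell
# check_owner is a hypothetical helper; it relies on GNU stat (coreutils).
# Compare the owner:group of a path against the expected value.
check_owner() {
    path="$1"
    expected="$2"
    actual=$(stat -c '%U:%G' "$path") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK   $path $actual"
    else
        echo "FAIL $path $actual (expected $expected)"
        return 1
    fi
}

# Usage sketch for the directories changed above:
#   for p in /etc/openvswitch /var/log/openvswitch /var/lib/vhost_sockets; do
#       check_owner "$p" openvswitch:hugetlbfs
#   done
```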
During update/upgrade, changing the permissions requires a lot of workarounds to time it with the reboot without affecting existing VMs. As it is not a requirement to set the new permissions during upgrade and it is not enforced by ovs, we choose to keep the existing permissions (root:qemu) by patching the ovs service files (in post-install.yaml). This avoids adding additional unknowns to the upgrade flow. (If there is a hard requirement to update it, it could be analyzed further.)
Fresh install will have the new permissions set once the related templates are updated. A new environment file will be provided for setting the permissions.
The current vhost socket directory /var/lib/vhost_sockets needs to be created with qemu:hugetlbfs permissions instead of qemu:qemu (puppet-tripleo). Need to check whether SELinux can handle both sets of permissions, qemu:qemu and qemu:hugetlbfs, as both are required (for fresh install and for upgrade/update). Once confirmed, we could retain the same directory; otherwise a new directory will be introduced for queens with the new set of permissions.
With the common group, the directory /var/run/openvswitch (with openvswitch:hugetlbfs permission) can be used by qemu to create vhost socket ports which could be shared with openvswitch. The catch is that the directory should have group (hugetlbfs) write permission.
default directory permissions
(overcloud) [root@overcloud-controller-0 heat-admin]# ll /var/run/openvswitch/ -d
drwxr-xr-x. 2 openvswitch hugetlbfs 260 Dec 18 07:49 /var/run/openvswitch/
modified directory permission and working
[root@overcloud-computeovsdpdk-0 ~]# ll /var/run/openvswitch -d
drwxrwxr-x. 2 openvswitch hugetlbfs 360 Dec 18 09:21 /var/run/openvswitch
And for containers we have to ensure that hugetlbfs gid value is same in host and container.
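A rough sketch of how the gid comparison between host and container might be done. The helper name group_gid is hypothetical. The container name "ovs_container" is a placeholder, and the use of the docker CLI is an assumption about the container runtime.

```shell
# group_gid is a hypothetical helper: print the numeric gid of a group from
# input in /etc/group format (name:password:gid:members), read from stdin.
group_gid() {
    awk -F: -v name="$1" '$1 == name { print $3; exit }'
}

# Usage sketch ("ovs_container" is a placeholder, docker is an assumption):
#   host_gid=$(getent group hugetlbfs | group_gid hugetlbfs)
#   ctr_gid=$(docker exec ovs_container getent group hugetlbfs | group_gid hugetlbfs)
#   [ "$host_gid" = "$ctr_gid" ] || echo "hugetlbfs gid mismatch: $host_gid vs $ctr_gid"
```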
(In reply to Saravanan KR from comment #3)
> With the common group, the directory /var/run/openvswitch (with
> openvswitch:hugetlbfs permission) can be used by qemu to create vhost socket
> ports which could be shared with openvswitch. The catch is that the
> directory should have group (hugetlbfs) write permission.
> default directory permissions
> (overcloud) [root@overcloud-controller-0 heat-admin]# ll
> /var/run/openvswitch/ -d
> drwxr-xr-x. 2 openvswitch hugetlbfs 260 Dec 18 07:49 /var/run/openvswitch/
> modified directory permission and working
> [root@overcloud-computeovsdpdk-0 ~]# ll /var/run/openvswitch -d
> drwxrwxr-x. 2 openvswitch hugetlbfs 360 Dec 18 09:21 /var/run/openvswitch
> And for containers we have to ensure that hugetlbfs gid value is same in
> host and container.
The problem with this approach is that the group write permission does not persist after a reboot. Need to check whether the /var/lib/vhost_sockets directory could solve this problem.
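To confirm whether the group write bit survives a reboot, a check along these lines could be run before and after. The helper name has_group_write is hypothetical, and the sketch assumes GNU stat (coreutils).

```shell
# has_group_write is a hypothetical helper; it relies on GNU stat (coreutils).
# Returns 0 when the path has the group write bit set, 1 otherwise.
has_group_write() {
    mode=$(stat -c '%A' "$1") || return 2
    # %A prints e.g. drwxrwxr-x; the 6th character is the group write bit.
    case "$mode" in
        ?????w*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Usage sketch, run once before and once after the reboot:
#   has_group_write /var/run/openvswitch && echo "group write present" || echo "group write missing"
```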
The current change uses service_config_settings to set the qemu group to hugetlbfs. But it will be applied to all roles having the compute service, which is not the expected behavior, even though it does not cause any side effects.
Raised https://bugs.launchpad.net/tripleo/+bug/1747857 to fix it.
As stated in the mail, since there is a mix and match of DPDK and non-DPDK VMs (SR-IOV and regular) on the same node, it is apt to keep qemu running with "hugetlbfs" for all the VMs in the cluster when DPDK is enabled in one role. The current patch applies it to all the nodes, so no further changes are required.
Verified with the following
[root@computeovsdpdk-0 ~]# ls -ltrd /var/log/openvswitch
drwxr-x---. 2 openvswitch openvswitch 54 Jun 21 10:05 /var/log/openvswitch
[root@computeovsdpdk-0 ~]# ls -ltrd /etc/openvswitch
drwxr-xr-x. 2 openvswitch openvswitch 86 Jun 21 10:05 /etc/openvswitch
[root@computeovsdpdk-0 ~]# ls -ltrd /var/lib/vhost_sockets
drwxr-xr-x. 2 qemu hugetlbfs 6 Jun 21 10:15 /var/lib/vhost_sockets
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.