Red Hat Bugzilla – Bug 1478791
Fixing the permission mismatch for DPDK vhost user ports with openvswitch and qemu
Last modified: 2017-11-20 08:19:55 EST
Description of problem:
Currently, TripleO uses a workaround that modifies the permissions so that ovs runs with the qemu group. This is an intermediate solution.
The actual solution has been worked out by the ovs team in https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333423.html
This is the BZ to track the upstream progress.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Note that this solution has been accepted upstream, and requires that QEMU advertise the sockets with group permissions of +rw, and group ownership of hugetlbfs.
(In reply to Aaron Conole from comment #1)
> Note that this solution has been accepted upstream, and requires that QEMU
> advertise the sockets with group permissions of +rw, and group ownership of
> hugetlbfs.
Could you elaborate on QEMU advertising sockets with the required permissions? Are you expecting a particular format, or an existing one? We need to add the respective teams to continue the discussion.
By advertise, what I mean is to just make sure that the file is group owned by hugetlbfs and has group permissions +rw.
There shouldn't be anything else needed from discretionary access controls.
Mandatory access controls (selinux) are a different matter, and I am working with QE to figure out those issues now.
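For anyone who wants to verify the DAC side on a host, here is a minimal sketch of the check described above (group owned by hugetlbfs, group +rw). The helper names `group_rw_ok` and `socket_dac_ok` are hypothetical and not part of ovs, qemu or libvirt. It does not cover selinux.

```python
import grp
import os
import stat


def group_rw_ok(st_mode: int) -> bool:
    """Return True if the mode bits grant both group read and group write."""
    wanted = stat.S_IRGRP | stat.S_IWGRP
    return (st_mode & wanted) == wanted


def socket_dac_ok(path: str, group_name: str = "hugetlbfs") -> bool:
    """Check that path is a socket, group owned by group_name, with group +rw.

    Hypothetical helper for illustration; group_name defaults to the group
    named in this bug.
    """
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this host, so the check cannot pass.
        return False
    st = os.stat(path)
    return (
        stat.S_ISSOCK(st.st_mode)
        and st.st_gid == gid
        and group_rw_ok(st.st_mode)
    )
```

Calling `socket_dac_ok("/path/to/vhost.sock")` on the compute node, with the real socket path substituted, should return True once the ownership and mode are as described.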
Thanks Aaron for the clarification.
There is an option in qemu.conf to apply a group id to the qemu processes and the files they create. I couldn't find an option to specify the vhost socket file permissions. Adding the libvirt team to confirm whether this "group" option could be set to "hugetlbfs", with "+rw", for DPDK OpenStack deployments.
Libvirt allows setting per-device DAC labels. However, because the implementation is missing, we don't support it for vhostuser netdevs. Ideally, the XML config would look like this:
<source type='unix' path='/tmp/vhost1.sock' mode='server'/>
<seclabel type='static' model='dac' relabel='yes'>
That way the socket would be owned by the correct owner. However, libvirt starts qemu with umask(0x002), and there is no way to specify the mode for files created by libvirt or qemu, either in the domain XML or in a config file.
What is still needed from the OvS side for this? Anything?
From comment #5, it looks like libvirt (qemu) may not be able to change the group ownership of the vhost-user sockets to hugetlbfs. I would prefer that we agree on the way forward across components - ovs, libvirt, nova, tripleo (nfv). Let me know if a call is required to discuss, or we can continue on this BZ itself.
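To make the umask point concrete, here is a small sketch of how a umask masks the requested mode. It assumes that a socket created by bind() requests mode 0777, which is my understanding of the Linux behaviour and is not stated in this bug. The function name `effective_mode` is hypothetical. Note that 0x002 and octal 002 are the same value.

```python
def effective_mode(requested: int, umask: int) -> int:
    """Return the permission bits left after the umask is applied."""
    return requested & ~umask & 0o7777


# Assumption: the socket is created with a requested mode of 0o777.
socket_mode = effective_mode(0o777, 0x002)
print(oct(socket_mode))  # prints 0o775
```

Under that assumption the group rw bits survive the umask, so the open question is the group ownership of the socket rather than its mode.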
Let's have a call to discuss. Please schedule it.
Minutes of the discussion between Saravanan, Aaron, Michal and Karthik on the agreed approach and the next steps:
ovs 2.8 changes the default user and group of the ovs process: it now runs as user openvswitch and group hugetlbfs. All vhost sockets created (in server mode) and opened (in client mode) are expected to have hugetlbfs as the group.
From libvirt's perspective, there are two approaches:
1) Create the domain xml with the specific permission
2) Configure qemu.conf's group value to hugetlbfs to ensure vhost sockets are created with that specific group id. Aaron has commented that QE has tried this standalone and found it to work.
From libvirt's perspective, both changes will have the same effect on the vhost user sockets. Already with OSP12, the qemu user id is changed to 42427 from the system's default value for the kolla containers. So effectively, qemu is running with uid/gid 42427 in OSP12 (which is overridden for DPDK deployments in OSP12 because of the mismatch).
I will try this to see if we can take approach (2) for qemu/libvirt, with ovs running as hugetlbfs. Based on the experiment, we will decide on the approach.
Additionally, Aaron is working on configuring the user and group for ovs from /etc/sysconfig/openvswitch (work in progress?), which could also be worked on if the above approach fails.
On top of that, we have to consider the upgrade scenarios below to see how the change of user and group id in ovs 2.8 and OSP13 should be handled:
a) OSP12 (baremetal) > OSP13 (baremetal)
b) OSP12 (containers) > OSP13 (containers)
c) OSP12 (baremetal) > OSP13 (containers)
d) OSP10 (baremetal) > OSP13 (baremetal)
e) OSP10 (baremetal) > OSP13 (containers)
This bug tracks the removal of the THT change that patches the ovs service. It depends on https://bugzilla.redhat.com/show_bug.cgi?id=1515269