Bug 1565967 - [ovs2.9] permission denied problem with old dpdk vhost interfaces after OSP10 minor update
Summary: [ovs2.9] permission denied problem with old dpdk vhost interfaces after OSP10 minor update
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 10.0 (Newton)
Assignee: Emilien Macchi
QA Contact: Maxim Babushkin
URL:
Whiteboard:
Depends On:
Blocks: 1578511
 
Reported: 2018-04-11 08:05 UTC by zenghui.shi
Modified: 2023-02-22 23:02 UTC
CC List: 13 users

Fixed In Version: openstack-tripleo-heat-templates-5.3.10-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 23:30:46 UTC
Target Upstream Version:
Embargoed:




Links:
OpenStack gerrit 564423 (MERGED): Change ovs RuntimeDirectory Mode (last updated 2020-07-20 02:45:43 UTC)
Red Hat Product Errata RHBA-2018:2101 (last updated 2018-06-27 23:32:36 UTC)

Description zenghui.shi 2018-04-11 08:05:36 UTC
Description of problem:

Ran the following test in a DPDK environment:

Run the OSP10 minor update with ovs updated to the latest 2.9.

Before running the minor update, there are existing DPDK instances with dpdkvhostuser-mode interfaces created in the default /var/run/openvswitch folder.

Since ovs 2.9 supports both dpdkvhostuser and dpdkvhostuserclient modes, an additional vhost folder is created for dpdkvhostuserclient-mode interfaces (/var/lib/vhost_socket).

After the minor update and a reboot, a permission problem hits the old DPDK instances: the previous vhost-user interfaces in /var/run/openvswitch can NOT be accessed by qemu when nova starts the instances. Because ovs runs as user:group openvswitch:hugetlbfs after the reboot, qemu has no write permission on the /var/run/openvswitch folder even though it belongs to the hugetlbfs group.
 
for example:

[root@overcloud-compute-1 ~]# ls -al /var/run/openvswitch/
total 344
drwxr-xr-x.  2 openvswitch hugetlbfs    480 Apr 11 06:12 .
drwxr-xr-x. 44 root        root        1360 Apr 11 06:12 ..
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:09 br-link.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-link.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 db.sock
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovsdb-server.1644.ctl
-rw-r--r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovsdb-server.pid
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovs-vswitchd.1723.ctl
-rw-r--r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovs-vswitchd.pid
-rw-r-----.  1 openvswitch hugetlbfs 208420 Apr 11 06:08 .rte_config
-rw-r--r--.  1 openvswitch hugetlbfs 132608 Apr 11 06:08 .rte_hugepage_info
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:08 .rte_mp_socket
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhu43581aa8-a1
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhu9a715ff7-5e
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhua200d9fe-b0
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhuec8308aa-d9
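
A quick way to confirm the mismatch from a shell (a sketch; it assumes the qemu user is a member of the hugetlbfs group, as on this node):

# ovs-vswitchd runs as openvswitch:hugetlbfs after the reboot
ps -C ovs-vswitchd -o user,group,comm
# qemu only has group access, via hugetlbfs
id qemu
# group has r-x but no w on the folder and on most sockets, so access
# from qemu fails with "permission denied"
stat -c '%A %U:%G %n' /var/run/openvswitch /var/run/openvswitch/vhu*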


The problem can be worked around by adding g+w to the /var/run/openvswitch folder:

# chmod g+w /var/run/openvswitch/ -R

[root@overcloud-compute-1 ~]# ls -al /var/run/openvswitch/
total 344
drwxrwxr-x.  2 openvswitch hugetlbfs    480 Apr 11 06:12 .
drwxr-xr-x. 44 root        root        1360 Apr 11 06:20 ..
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:09 br-link.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-link.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 db.sock
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovsdb-server.1644.ctl
-rw-rw-r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovsdb-server.pid
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovs-vswitchd.1723.ctl
-rw-rw-r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovs-vswitchd.pid
-rw-rw----.  1 openvswitch hugetlbfs 208420 Apr 11 06:08 .rte_config
-rw-rw-r--.  1 openvswitch hugetlbfs 132608 Apr 11 06:08 .rte_hugepage_info
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:08 .rte_mp_socket
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhu43581aa8-a1
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhu9a715ff7-5e
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhua200d9fe-b0
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhuec8308aa-d9


But this is not a good solution: it requires changing the file mode every time ovs is restarted or the node is rebooted, because the /var/run/openvswitch folder is re-created when ovs restarts.

One possible solution, mentioned by skramaja, is that ovs could support a group-write mode so that it changes the /var/run/openvswitch file mode automatically, but I am not sure whether ovs can support this kind of change.

I am reporting this bug against ovs to get an evaluation first; please feel free to change the owner if it can be solved in another way.


[root@overcloud-compute-1 ~]# rpm -qa | grep openvswitch
openvswitch-2.9.0-15.el7fdp.x86_64

Comment 2 Saravanan KR 2018-04-11 10:35:00 UTC
(In reply to zenghui.shi from comment #0)
> 
> But this is not a good solution: it requires changing the file mode every
> time ovs is restarted or the node is rebooted, because the
> /var/run/openvswitch folder is re-created when ovs restarts.
> 
> One possible solution, mentioned by skramaja, is that ovs could support a
> group-write mode so that it changes the /var/run/openvswitch file mode
> automatically, but I am not sure whether ovs can support this kind of
> change.
> 
> I am reporting this bug against ovs to get an evaluation first; please
> feel free to change the owner if it can be solved in another way.

Let's list all the possible solutions so that we can assess the best-suited one:

* Have ovs 2.9 create the vhost sockets with group write access on reboot (after the minor update) on the dpdkvhostuser node (ovs in server mode); this will allow qemu to work via the shared hugetlbfs group.

* Configure "user" as "openvswitch" in qemu.conf in OSP10 during this migration to support the existing VMs. Already validated by zenghui.

* Migrate the existing VMs to other nodes, reboot, and then migrate them back to the same node, assuming that the migration will re-create the vhost sockets in dpdkvhostuserclient mode. There is a known issue with migrating CPU-pinned VMs: the same set of CPUs must be available on the target node.

* Create a service that applies g+w on every reboot for as long as the existing sockets are present (a minimal unit sketch follows below).

We need to evaluate these to come up with the best possible solution.
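
For the last option, a minimal sketch of what such a unit could look like (hypothetical name and path; not an actual deliverable):

# /etc/systemd/system/ovs-vhostuser-sock-perms.service (hypothetical)
[Unit]
Description=Reapply g+w to legacy vhost-user sockets after ovs starts
After=ovs-vswitchd.service
Requires=ovs-vswitchd.service

[Service]
Type=oneshot
# sockets for the existing dpdkvhostuser ports are re-created when
# ovs-vswitchd starts, so one chmod after it has started is enough
ExecStart=/usr/bin/chmod -R g+w /var/run/openvswitch

[Install]
WantedBy=multi-user.target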

Comment 3 Timothy Redaelli 2018-04-12 10:55:38 UTC
What is the difference from bz#1548086?

Comment 4 Saravanan KR 2018-04-12 11:19:57 UTC
(In reply to Timothy Redaelli from comment #3)
> What is the difference with bz#1548086?

bz#1548086 is a RHEL bz to change the default group value in qemu.conf to "hugetlbfs", whereas this bz is for the problem faced during the update of ovs to 2.9 with existing VMs in dpdkvhostuser (ovs as server) mode while the default mode changes to dpdkvhostuserclient (ovs as client).

Comment 5 Aaron Conole 2018-04-12 14:13:50 UTC
I guess one approach is to modify the RuntimeDirectory mode from 0755 to 0775. That will allow the vhost-user server socket files to be writable, which gets past the group permissions. Upstream, server-mode vhost-user is deprecated in OvS, so it is best to migrate the VMs away and back (so that they will use the newer OvS client-mode vhost socket).
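
On a systemd host this would not require patching the shipped unit file; a drop-in should work as well. A sketch, assuming the unit declares RuntimeDirectory=openvswitch (the directory that RuntimeDirectoryMode applies to):

# /etc/systemd/system/ovs-vswitchd.service.d/runtime-mode.conf (hypothetical path)
[Service]
RuntimeDirectoryMode=0775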

Comment 6 Matteo Croce 2018-04-12 17:11:35 UTC
Hi,

Can we change the permission of the socket by setting UMask= in the ovs-vswitchd unit file?
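
For context, the question refers to something like this hypothetical drop-in; sockets are created as 0777 masked by the umask, so UMask=0002 would make new sockets group-writable (see comment 8 below for why this was rejected):

# /etc/systemd/system/ovs-vswitchd.service.d/umask.conf (hypothetical)
[Service]
UMask=0002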

Comment 7 zenghui.shi 2018-04-16 07:09:25 UTC
Can we use either UMask or RuntimeDirectory alone to solve the problem, or shall we use both?

Comment 8 Aaron Conole 2018-04-17 14:08:36 UTC
UMask is probably the wrong change (it was rejected upstream before), since it will impact all files created by the OvS daemon.

This looks like https://bugzilla.redhat.com/show_bug.cgi?id=1515269 still isn't working.

Saravanan?

Comment 9 zenghui.shi 2018-04-18 09:16:08 UTC
I think it's the same issue as described in https://bugzilla.redhat.com/show_bug.cgi?id=1515269#c3.

Maybe we should add the same script that we used in first-boot.yaml, but in the ovs package update path:

ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
if grep -q "RuntimeDirectoryMode=" "$ovs_service_path"; then
    sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' "$ovs_service_path"
else
    echo "RuntimeDirectoryMode=0775" >> "$ovs_service_path"
fi
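
After the unit file is edited in place like this, systemd has to re-read it before a restart picks up the new mode (standard systemd behavior, not part of the quoted script):

# systemctl daemon-reload
# systemctl restart ovs-vswitchd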

Comment 10 Saravanan KR 2018-04-18 09:35:23 UTC
I am validating the migration to ensure that it works fine. If migration solves the issue, let's discuss with PM to conclude how the OSP10.z8 release should be handled.

For the moment, changing the service file would be the last option.

Comment 11 zenghui.shi 2018-04-19 23:12:39 UTC
(In reply to Saravanan KR from comment #10)
> I am validating the migration to ensure that it works fine. If migration
> solves the issue, let's discuss with PM to conclude how the OSP10.z8
> release should be handled.
> 
> For the moment, changing the service file would be the last option.

Hi Saravanan, 

I tested the same; there seems to be a qemu issue during the DPDK instance migration on the source host:

2018-04-19 22:56:04.561+0000: initiating migration
2018-04-19T22:56:04.568554Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2018-04-19T22:56:04.568734Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2018-04-19T22:56:04.568788Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568803Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-04-19T22:56:04.568817Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568829Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-04-19T22:56:04.568842Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568853Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
2018-04-19 22:56:04.786+0000: shutting down, reason=crashed

[root@overcloud-compute-1 ~]# rpm -qa | grep 'openvswitch\|kernel\|qemu'
kernel-devel-3.10.0-862.el7.x86_64
kernel-3.10.0-862.el7.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-common-rhev-2.10.0-21.el7.x86_64
openvswitch-ovn-central-2.6.1-16.git20161206.el7ost.x86_64
erlang-kernel-18.3.4.7-1.el7ost.x86_64
openvswitch-ovn-common-2.6.1-16.git20161206.el7ost.x86_64
libvirt-daemon-driver-qemu-3.9.0-14.el7.x86_64
openstack-neutron-openvswitch-9.4.1-12.el7ost.noarch
openvswitch-2.6.1-16.git20161206.el7ost.x86_64
kernel-tools-libs-3.10.0-862.el7.x86_64
python-openvswitch-2.6.1-16.git20161206.el7ost.noarch
qemu-kvm-rhev-2.10.0-21.el7.x86_64
kernel-headers-3.10.0-862.el7.x86_64
qemu-guest-agent-2.8.0-2.el7.x86_64
kernel-tools-3.10.0-862.el7.x86_64
qemu-img-rhev-2.10.0-21.el7.x86_64
openvswitch-ovn-host-2.6.1-16.git20161206.el7ost.x86_64


Meanwhile, I'm thinking that if there is any migration-related issue with the currently running ovs version, then the minor update process needs two ovs update steps in order to do a successful ovs update:

1) The minor update shall first update to an ovs version that contains the fix for the migration issue, then restart ovs or reboot so that the updated ovs is used to migrate the instances.
2) Once all the migrations succeed, it then updates to the latest ovs 2.9.

This seems impossible, as we cannot provide an ovs repo that contains only the migration fix but not the latest ovs (note: ovs is restarted once the migration fix is applied, so we cannot rely on the latest ovs to provide the fix even though the fix will be contained in the latest ovs).

For example: https://bugzilla.redhat.com/show_bug.cgi?id=1450680

If a customer is running OSP10 with an ovs version < 2.6.1-28, then they need to go through steps 1) and 2) above.

wdyt?

Comment 15 Yariv 2018-06-21 22:44:23 UTC
Once the minor update completed, a reboot was initiated.

VMs are deployed and accessible without any errors.
Logs show that ovs works in client mode.

Comment 17 errata-xmlrpc 2018-06-27 23:30:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2101

