Description of problem:

Tried to run the following test in a dpdk environment: run the OSP10 minor update with ovs updated to the latest 2.9.

Before running the minor update, there are existing dpdk instances with dpdkvhostuser-mode interfaces created in the default /var/run/openvswitch folder. Since ovs 2.9 supports both dpdkvhostuser and dpdkvhostuserclient modes, another vhost folder is created for dpdkvhostuserclient-mode interfaces (/var/lib/vhost_socket).

The permission problem happens with the old dpdk instances after the minor update and reboot: the previous vhost-user interfaces in /var/run/openvswitch can NOT be accessed by qemu when nova starts the instances. Because ovs runs as the openvswitch:hugetlbfs user:group after the reboot, qemu does not have write permission on the /var/run/openvswitch folder even though it belongs to the hugetlbfs group.

For example:

[root@overcloud-compute-1 ~]# ls -al /var/run/openvswitch/
total 344
drwxr-xr-x.  2 openvswitch hugetlbfs    480 Apr 11 06:12 .
drwxr-xr-x. 44 root        root        1360 Apr 11 06:12 ..
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:09 br-link.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-link.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.mgmt
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.snoop
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 db.sock
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovsdb-server.1644.ctl
-rw-r--r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovsdb-server.pid
srwxr-x---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovs-vswitchd.1723.ctl
-rw-r--r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovs-vswitchd.pid
-rw-r-----.  1 openvswitch hugetlbfs 208420 Apr 11 06:08 .rte_config
-rw-r--r--.  1 openvswitch hugetlbfs 132608 Apr 11 06:08 .rte_hugepage_info
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:08 .rte_mp_socket
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhu43581aa8-a1
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhu9a715ff7-5e
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhua200d9fe-b0
srwxr-xr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhuec8308aa-d9

The problem can be solved by adding g+w to the /var/run/openvswitch folder:

# chmod g+w /var/run/openvswitch/ -R

[root@overcloud-compute-1 ~]# ls -al /var/run/openvswitch/
total 344
drwxrwxr-x.  2 openvswitch hugetlbfs    480 Apr 11 06:12 .
drwxr-xr-x. 44 root        root        1360 Apr 11 06:20 ..
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-ex.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-int.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-isolation.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:09 br-link.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-link.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.mgmt
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 br-tun.snoop
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 db.sock
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovsdb-server.1644.ctl
-rw-rw-r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovsdb-server.pid
srwxrwx---.  1 openvswitch hugetlbfs      0 Apr 11 06:08 ovs-vswitchd.1723.ctl
-rw-rw-r--.  1 openvswitch hugetlbfs      5 Apr 11 06:08 ovs-vswitchd.pid
-rw-rw----.  1 openvswitch hugetlbfs 208420 Apr 11 06:08 .rte_config
-rw-rw-r--.  1 openvswitch hugetlbfs 132608 Apr 11 06:08 .rte_hugepage_info
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:08 .rte_mp_socket
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhu43581aa8-a1
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhu9a715ff7-5e
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:09 vhua200d9fe-b0
srwxrwxr-x.  1 openvswitch hugetlbfs      0 Apr 11 06:12 vhuec8308aa-d9

But this is not a good solution, as it requires changing the file mode every time ovs is restarted or the node is rebooted, because the /var/run/openvswitch folder is re-created when ovs restarts.

One possible solution mentioned by skramaja is that ovs could support a group-write mode so that it changes the /var/run/openvswitch file mode automatically, but it is not clear whether ovs can support this kind of change.

Reporting this bug against ovs to get an evaluation first; please feel free to change the owner if it can be solved in another way.

[root@overcloud-compute-1 ~]# rpm -qa | grep openvswitch
openvswitch-2.9.0-15.el7fdp.x86_64

Version-Release number of selected component (if applicable):
openvswitch-2.9.0-15.el7fdp.x86_64
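As a quick way to confirm the failure mode on an affected node before applying any fix, one can check whether the qemu process user can actually write to one of the legacy sockets (connecting to a unix socket requires write permission on the socket file). This is a sketch; it assumes qemu-kvm is launched as the "qemu" user, so adjust to whatever user/group qemu.conf actually configures:

# id qemu                                                      # confirm membership in the hugetlbfs group
# sudo -u qemu test -w /var/run/openvswitch/vhu43581aa8-a1 && echo writable || echo "not writable"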
(In reply to zenghui.shi from comment #0)
> But this is not a good solution, as it requires changing the file mode
> every time ovs is restarted or the node is rebooted, because the
> /var/run/openvswitch folder is re-created when ovs restarts.
>
> One possible solution mentioned by skramaja is that ovs could support a
> group-write mode so that it changes the /var/run/openvswitch file mode
> automatically, but it is not clear whether ovs can support this kind of
> change.
>
> Reporting this bug against ovs to get an evaluation first; please feel free
> to change the owner if it can be solved in another way.

Let's put down all the possible solutions so that we can assess the best-suited one:

* possibility of creating the vhost sockets with group write access in ovs 2.9 on reboot (after the minor update) in dpdkvhostuser mode (ovs as server); this would allow qemu to work with the shared group name hugetlbfs
* configure "user" as "openvswitch" in qemu.conf in OSP10 during this migration to support the existing VMs; already validated by zenghui
* possibility of migrating the existing VMs to other nodes, rebooting, and then migrating them back to the same node, assuming that the migration re-creates the vhost socket in dpdkvhostuserclient mode. There is a known issue with migrating CPU-pinned VMs: the same set of CPUs must be available on the target node.
* create a service that applies g+w on every reboot for as long as the existing sockets are present (see the sketch after this comment)

We need to evaluate these to come up with the best possible solution.
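A rough, unvalidated sketch of the last option: a oneshot unit that re-applies g+w after ovs has started. The unit name is hypothetical, and the ordering is only approximate, since the vhu* sockets are created when the ports are added and may appear after this unit has already run:

# cat > /etc/systemd/system/legacy-vhostuser-perms.service <<'EOF'
[Unit]
Description=Re-apply group write on legacy vhost-user sockets (temporary workaround)
After=openvswitch.service
Requires=openvswitch.service

[Service]
Type=oneshot
# same workaround as the manual chmod in the bug description
ExecStart=/usr/bin/chmod -R g+w /var/run/openvswitch

[Install]
WantedBy=multi-user.target
EOF
# systemctl daemon-reload
# systemctl enable legacy-vhostuser-perms.service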
What is the difference with bz#1548086?
(In reply to Timothy Redaelli from comment #3)
> What is the difference with bz#1548086?

bz#1548086 is a RHEL bz to change the default "group" value in qemu.conf to "hugetlbfs", whereas this bz is about the problem faced during the update of ovs to 2.9 with existing VMs in dpdkvhostuser (ovs as server) mode while the default mode changes to dpdkvhostuserclient (ovs as client).
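For reference, a minimal sketch of the qemu.conf settings the two bugs touch, combining the group default discussed in bz#1548086 with the "user" override from the second option above; treat the exact values as assumptions to be confirmed for the deployment:

# /etc/libvirt/qemu.conf (relevant lines only)
user = "openvswitch"
group = "hugetlbfs"

# systemctl restart libvirtd    # only affects guests started after the restart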
I guess one approach is to modify the RuntimeDirectoryMode from 0755 to 0775. That will allow the vhost-user server socket files to be writable by the group, which gets past the group permission issue. Upstream, server-mode vhost-user is deprecated in OvS, so it's good to migrate the VMs away and back (so that they will use the newer OvS client-mode vhost socket).
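A minimal sketch of that change as a systemd drop-in rather than an edit of the packaged unit file. Assumptions: the unit is ovs-vswitchd.service and it already sets RuntimeDirectory=openvswitch, since RuntimeDirectoryMode= has no effect without a RuntimeDirectory= setting:

# mkdir -p /etc/systemd/system/ovs-vswitchd.service.d
# cat > /etc/systemd/system/ovs-vswitchd.service.d/10-runtime-dir-mode.conf <<'EOF'
[Service]
# re-create /var/run/openvswitch with group write so the hugetlbfs group can use the sockets
RuntimeDirectoryMode=0775
EOF
# systemctl daemon-reload

A restart of ovs-vswitchd is then needed for the new mode to apply; note that this interrupts the DPDK datapath.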
Hi, Can we change the permission of the socket by setting UMask= in the ovs-vswitchd unit file?
Can we use just one of UMask= and RuntimeDirectoryMode= to solve the problem, or should we use both?
UMask is probably the wrong change (and it was rejected upstream before), since it would impact all files created by the OvS daemon. This looks like https://bugzilla.redhat.com/show_bug.cgi?id=1515269 still isn't working. Saravanan?
I think it's the same issue as described in https://bugzilla.redhat.com/show_bug.cgi?id=1515269#c3. Maybe we should add the same snippet we already use in first-boot.yaml, but in the ovs package update path:

+ ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
+ grep -q "RuntimeDirectoryMode=.*" $ovs_service_path
+ if [ "$?" -eq 0 ]; then
+     sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
+ else
+     echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
+ fi
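Either way (editing the packaged unit file as above, or a drop-in as sketched earlier), the new mode only takes effect after a daemon-reload and a restart of ovs-vswitchd, and the result can then be checked; a quick verification sketch using the path from the report:

# systemctl restart openvswitch
# stat -c '%a %U:%G %n' /var/run/openvswitch    # expect mode 775 and openvswitch:hugetlbfs if the change applied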
I am validating the migration to ensure that it works fine. If migration solves the issue, let's discuss with PM to conclude how the OSP10.z8 should be handled.

For the moment, changing the service file would be the last option.
(In reply to Saravanan KR from comment #10)
> I am validating the migration to ensure that it works fine. If migration
> solves the issue, let's discuss with PM to conclude how the OSP10.z8 should
> be handled.
>
> For the moment, changing the service file would be the last option.

Hi Saravanan,

I tested the same; there seems to be a qemu issue during the dpdk instance migration on the source host:

2018-04-19 22:56:04.561+0000: initiating migration
2018-04-19T22:56:04.568554Z qemu-kvm: Failed to read msg header. Read -1 instead of 12. Original request 6.
2018-04-19T22:56:04.568734Z qemu-kvm: vhost_set_log_base failed: Input/output error (5)
2018-04-19T22:56:04.568788Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568803Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-04-19T22:56:04.568817Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568829Z qemu-kvm: vhost_set_vring_addr failed: Invalid argument (22)
2018-04-19T22:56:04.568842Z qemu-kvm: Failed to set msg fds.
2018-04-19T22:56:04.568853Z qemu-kvm: vhost_set_features failed: Invalid argument (22)
2018-04-19 22:56:04.786+0000: shutting down, reason=crashed

[root@overcloud-compute-1 ~]# rpm -qa | grep 'openvswitch\|kernel\|qemu'
kernel-devel-3.10.0-862.el7.x86_64
kernel-3.10.0-862.el7.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-common-rhev-2.10.0-21.el7.x86_64
openvswitch-ovn-central-2.6.1-16.git20161206.el7ost.x86_64
erlang-kernel-18.3.4.7-1.el7ost.x86_64
openvswitch-ovn-common-2.6.1-16.git20161206.el7ost.x86_64
libvirt-daemon-driver-qemu-3.9.0-14.el7.x86_64
openstack-neutron-openvswitch-9.4.1-12.el7ost.noarch
openvswitch-2.6.1-16.git20161206.el7ost.x86_64
kernel-tools-libs-3.10.0-862.el7.x86_64
python-openvswitch-2.6.1-16.git20161206.el7ost.noarch
qemu-kvm-rhev-2.10.0-21.el7.x86_64
kernel-headers-3.10.0-862.el7.x86_64
qemu-guest-agent-2.8.0-2.el7.x86_64
kernel-tools-3.10.0-862.el7.x86_64
qemu-img-rhev-2.10.0-21.el7.x86_64
openvswitch-ovn-host-2.6.1-16.git20161206.el7ost.x86_64

Meanwhile, I'm thinking that if there is any migration-related issue with the currently running ovs version, then a successful ovs update would need two update steps in the minor update process:

1) The minor update first updates to an ovs version that has the fix for the migration issue, then restarts ovs (or reboots) so that the updated ovs is used to migrate the instances.
2) If all the migrations succeed, it then updates to the latest ovs 2.9.

This seems impossible, as we cannot provide a repo with an ovs build that contains only the migration fix but not the latest ovs (note: ovs is restarted once the migration fix is applied, so we cannot rely on the latest ovs to provide the fix even if it is contained in the latest ovs).

For example: https://bugzilla.redhat.com/show_bug.cgi?id=1450680
If a customer is running OSP10 with an ovs version < 2.6.1-28, then they need to go through steps 1) and 2) above.

wdyt?
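A rough way to check on a compute node whether the installed ovs already carries the migration fix referenced above. The 2.6.1-28 threshold is taken from the bz#1450680 discussion, and sort -V is only an approximation of rpm's own version comparison rules, so treat this as a sketch:

installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' openvswitch)
required="2.6.1-28"
# pick the lower of the two according to version sort
oldest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$oldest" = "$installed" ] && [ "$installed" != "$required" ]; then
    echo "openvswitch $installed is older than $required: migration fix likely missing"
else
    echo "openvswitch $installed should already carry the migration fix"
fi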
Once the minor update completed, a reboot was initiated. VMs are deployed and accessed without any errors. The logs show that ovs works in client mode.
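For reference, whether a given port ended up in client mode can also be checked directly on the compute node; the interface name below is illustrative (taken from the earlier listing), and the vhost-server-path option is only present for dpdkvhostuserclient ports:

# ovs-vsctl get Interface vhu9a715ff7-5e type                       # expect dpdkvhostuserclient
# ovs-vsctl get Interface vhu9a715ff7-5e options:vhost-server-path  # expect a path under the new vhost socket directory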
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2101