Bug 1496700

Summary: RHOS 10 DPDK vhost_sockets directory wrong
Product: Red Hat OpenStack
Reporter: Edu Alcaniz <ealcaniz>
Component: puppet-tripleo
Assignee: Karthik Sundaravel <ksundara>
Status: CLOSED NOTABUG
QA Contact: nlevinki <nlevinki>
Severity: high
Docs Contact:
Priority: medium
Version: 10.0 (Newton)
CC: aasmith, aconole, ailan, amuller, apevec, astafeye, atelang, berrange, chrisw, ealcaniz, edannon, ekuris, fbaudin, fleitner, jjoyce, jschluet, kiyyappa, ksundara, lhh, lvrabec, mbabushk, mburns, mgrepl, morazi, mprivozn, nlevinki, nyechiel, oblaut, pablo.iranzo, rhallise, rhos-maint, samccann, sclewis, skramaja, slinaber, srevivo, tvignaud, twilson, vchundur, yrachman
Target Milestone: async
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1447112
Environment:
Last Closed: 2017-10-05 14:18:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1498515

Comment 1 Edu Alcaniz 2017-09-28 07:38:18 UTC
Karthik Sundaravel 2017-09-28 03:22:05 EDT
1. Please check if the file /usr/lib/systemd/system/ovs-vswitchd.service in compute node has
RuntimeDirectoryMode=0775
Group=qemu
UMask=0002
2. Check if the file /usr/share/openvswitch/scripts/ovs-ctl in compute node has 
umask 0002 && start_daemon "$OVS_VSWITCHD_PRIORITY" "$OVS_VSWITCHD_WRAPPER" "$@" ||
in the function do_start_forwarding()

3. Please check if ovs-vsctl show throws any errors on the compute node with DPDK

4. Please add the SOS reports

Please note that this BZ is reported on OSP11.
OSP11 and later work with mode=server, while OSP10 works with mode=client, and the vhostuser socket directories differ between the two cases.
So I think it's appropriate to raise a new BZ.
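
Checks 1 and 2 above can be scripted. The following is only a sketch: the helper name check_ovs_files and its optional path arguments are made up for illustration, and it only looks for the expected lines anywhere in each file (it does not confirm that the umask line sits inside do_start_forwarding()).

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS or TripleO): report whether the
# lines requested in checks 1 and 2 are present on a compute node.
# Both paths default to the locations named in the comment above.
check_ovs_files() {
    service_file=${1:-/usr/lib/systemd/system/ovs-vswitchd.service}
    ctl_file=${2:-/usr/share/openvswitch/scripts/ovs-ctl}
    rc=0

    # Check 1: each setting must appear as a whole line in the unit file.
    for line in "RuntimeDirectoryMode=0775" "Group=qemu" "UMask=0002"; do
        if grep -Fxq -- "$line" "$service_file"; then
            echo "ok: $line"
        else
            echo "MISSING in $service_file: $line"
            rc=1
        fi
    done

    # Check 2: start_daemon for vswitchd must be prefixed with "umask 0002".
    # Single quotes keep $OVS_VSWITCHD_PRIORITY literal; -F disables regex.
    if grep -Fq 'umask 0002 && start_daemon "$OVS_VSWITCHD_PRIORITY"' "$ctl_file"; then
        echo "ok: umask 0002 before start_daemon"
    else
        echo "MISSING in $ctl_file: umask 0002 before start_daemon"
        rc=1
    fi
    return $rc
}
```

Running check_ovs_files with no arguments on the compute node uses the default paths; a non-zero exit status means at least one expected line was not found.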

Comment 5 Karthik Sundaravel 2017-09-28 09:45:42 UTC
I couldn't find the files below in the attached sos reports.

/usr/share/openvswitch/scripts/ovs-ctl
/usr/lib/systemd/system/ovs-vswitchd.service

Are these files missing after upgrading to OVS 2.6?

Comment 10 Karthik Sundaravel 2017-09-28 10:44:33 UTC
The permission-related changes are missing in the ovs-vswitchd.service and ovs-ctl files. It's similar to BZ https://bugzilla.redhat.com/show_bug.cgi?id=1413405.

Was the upgrade from OVS 2.5 to OVS 2.6 done manually?

Comment 11 Edu Alcaniz 2017-09-28 10:58:42 UTC
Per the comments, the upgrade was done using OSPd. Yesterday, in our remote session, we changed the ownership:

/usr/bin/chown -R qemu:qemu /var/run/openvswitch 

but we didn't implement all the steps described:

 Here is the solution to the issue that we found:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here are the steps which we ran:

1) upgrade OVS (openvswitch and python-openvswitch) to 2.6 from fastpath repositories

2) run script to fix permissions (extracted from post-config)

3) modify ovs-vswitchd.service unit file with the following lines:
~~~
[Service]
ExecStartPre=-/usr/bin/chown -R root:qemu /var/run/openvswitch 
ExecStartPre=-/usr/bin/chmod 775 /var/run/openvswitch 
~~~

4) upgrade DPDK to 16.07 from brew (all packages)

5) reboot

6) Since the upgrade to OVS 2.6, we need to run this command to enable DPDK; this is why dpdk0 was failing:
~~~
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
~~~
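
After the reboot, the effect of steps 3 and 6 can be checked with something like the sketch below. The function names are hypothetical, `stat -c` assumes GNU coreutils, and the quote stripping assumes that ovs-vsctl may print the stored value as "true" with quotes.

```shell
#!/bin/sh
# Hypothetical helpers for verifying steps 3 and 6; names are illustrative.

# Succeeds if DIR has the expected group and octal mode.
# Returns 2 if stat itself fails (for example, DIR does not exist).
socket_dir_ok() {
    dir=$1 want_group=$2 want_mode=$3
    actual=$(stat -c '%G %a' "$dir") || return 2
    [ "$actual" = "$want_group $want_mode" ]
}

# Succeeds if VALUE is true once any surrounding double quotes are removed.
is_true() {
    [ "$(printf '%s' "$1" | tr -d '"')" = "true" ]
}

# Succeeds if other_config:dpdk-init is set to true in the OVS database.
# Fails if ovs-vsctl is unavailable or the key is not set.
dpdk_init_enabled() {
    value=$(ovs-vsctl get Open_vSwitch . other_config:dpdk-init 2>/dev/null) || return 1
    is_true "$value"
}
```

On a compute node this could be used as `socket_dir_ok /var/run/openvswitch qemu 775 && dpdk_init_enabled`, which matches the root:qemu ownership and 775 mode set in step 3.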

Comment 12 Karthik Sundaravel 2017-09-28 11:04:55 UTC
The ovs-vswitchd.service and ovs-ctl files attached here do not reflect the changes from steps [2] and [3]. Can you please paste the script used in step [2]?

Comment 13 Yariv 2017-09-28 13:43:23 UTC
Can you try to explain the flow?

Was the initial condition RHOS 10 GA with OVS 2.6?
An upgrade is a major version change, like 10 -> 11.
An update is a minor change, like 10 GA -> 10z1.

Please explain which RHOS version is installed, and which OVS version.
Which update took place: a minor version, or OVS only through FDP?
What is the RHEL version?
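
Most of these questions can be answered from the node itself. A rough sketch, assuming an RPM-based host: the function name is made up, /etc/rhosp-release and the dpdk package name are assumptions, and every command is guarded so a missing file or tool is skipped rather than treated as an error.

```shell
#!/bin/sh
# Hypothetical helper: print the version details asked for above.
collect_versions() {
    # RHEL release string.
    if [ -r /etc/redhat-release ]; then cat /etc/redhat-release; fi
    # RHOS release string (assumed location; skipped if absent).
    if [ -r /etc/rhosp-release ]; then cat /etc/rhosp-release; fi
    # Installed package versions. rpm -q exits non-zero for packages that
    # are not installed, so its status is deliberately ignored.
    if command -v rpm >/dev/null 2>&1; then
        rpm -q openvswitch python-openvswitch dpdk || true
    fi
    # Version reported by the vswitchd binary itself.
    if command -v ovs-vswitchd >/dev/null 2>&1; then
        ovs-vswitchd --version || true
    fi
    return 0
}
```

Pasting the output of collect_versions from the affected compute node into the BZ would cover the RHEL, RHOS and OVS version questions; it does not show which update path was taken.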

Comment 14 Sandra McCann 2017-09-28 15:21:57 UTC
Related to this issue - the documentation had an incorrect 'post-install.yaml' file.

Specifically, this function was missing:

function ovs_permission_fix() {
    ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
    # Set RuntimeDirectoryMode=0775, replacing any existing value.
    if grep -q "RuntimeDirectoryMode=.*" "$ovs_service_path"; then
        sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' "$ovs_service_path"
    else
        echo "RuntimeDirectoryMode=0775" >> "$ovs_service_path"
    fi
    # Append Group=qemu and UMask=0002 unless already present as whole lines.
    if ! grep -Fxq "Group=qemu" "$ovs_service_path"; then
        echo "Group=qemu" >> "$ovs_service_path"
    fi
    if ! grep -Fxq "UMask=0002" "$ovs_service_path"; then
        echo "UMask=0002" >> "$ovs_service_path"
    fi
    ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
    # Prefix the vswitchd start_daemon call in ovs-ctl with "umask 0002".
    if ! grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" "$ovs_ctl_path"; then
        sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\" \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' "$ovs_ctl_path"
    fi
}
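
Note that the function above appends missing settings to the end of the unit file, so they end up in [Service] only when that is the last section in the file. A possible section-aware variant is sketched below; set_service_option is a hypothetical helper, not part of the documented post-install.yaml, and it assumes the unit file already has a [Service] section (if not, nothing is added).

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE inside the [Service] section of a
# systemd unit file. An existing KEY= line in that section is replaced;
# otherwise the setting is added at the end of the section.
set_service_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        # Write the setting once, at the point where [Service] ends.
        function emit() {
            if (in_service && !done) { print key "=" value; done = 1 }
        }
        # A new section header closes the previous section.
        /^\[/ { emit(); in_service = ($0 == "[Service]") }
        # Replace an existing KEY= line inside [Service].
        in_service && index($0, key "=") == 1 {
            if (!done) { print key "=" value; done = 1 }
            next
        }
        { print }
        # Covers the case where [Service] is the last section.
        END { emit() }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For example, `set_service_option "$ovs_service_path" Group qemu` could stand in for the grep/echo pair for Group=qemu. Writing back with cat instead of mv keeps the original file's ownership and mode, and running the helper twice leaves the file unchanged.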



The content has been updated:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/network_functions_virtualization_configuration_guide/#ap-rhosp10-ovs25-rhosp10-ovs26-post-install