Bug 1566412

Summary: When restart openvswitch, ovn-controller is not reprogramming the flows.
Product: Red Hat OpenStack
Reporter: Eran Kuris <ekuris>
Component: openvswitch
Assignee: Aaron Conole <aconole>
Status: CLOSED DUPLICATE
QA Contact: Ofer Blaut <oblaut>
Severity: medium
Priority: medium
Version: 13.0 (Queens)
CC: aconole, amuller, apevec, chrisw, dalvarez, fhallal, lhh, majopela, nusiddiq, rhos-maint, srevivo, tredaelli
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Clone Of:
: 1572238 (view as bug list) Environment:
Last Closed: 2019-07-01 14:46:26 UTC
Type: Bug

Description Eran Kuris 2018-04-12 09:17:45 UTC
Description of problem:
I restarted the openvswitch service on a compute/controller node. The OpenFlow flows were deleted and did not come back, because ovn-controller is not reprogramming them.
When the openvswitch service is restarted, ovn-controller is expected to reprogram the flows.

With no flows programmed, the system is down.
Workaround:
The issue is resolved only when ovn-controller is restarted.

I expected OVN to detect that the service was restarted and create the flows again.

Version-Release number of selected component (if applicable):
osp13 

How reproducible:
100%

Steps to Reproduce:
1. Deploy an OSP13 OVN HA setup (found on a DVR setup).
2. Run systemctl restart openvswitch.service on a compute or controller node.
3. Run ovs-ofctl dump-flows br-int: no flows are listed.
4. Restart the OVN docker container: the flows are reprogrammed.
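
The check in step 3 can be scripted. The following is a minimal sketch, assuming the default output format of ovs-ofctl dump-flows, where a reply header line is followed by one line per flow starting with "cookie=". The function names and the sample text are hypothetical illustrations, not output captured from the affected nodes.

```python
def count_flows(dump_flows_output: str) -> int:
    """Count flow entries in text captured from `ovs-ofctl dump-flows`.

    Assumption: each flow is printed on its own line starting with "cookie=",
    and the reply header line does not start with that prefix.
    """
    return sum(
        1
        for line in dump_flows_output.splitlines()
        if line.strip().startswith("cookie=")
    )


def flows_missing(dump_flows_output: str) -> bool:
    """Return True when the bridge reports no flows at all."""
    return count_flows(dump_flows_output) == 0


# Illustrative samples only (hypothetical, not taken from this bug).
HEALTHY = (
    "NXST_FLOW reply (xid=0x4):\n"
    " cookie=0x0, duration=12.3s, table=0, n_packets=0, n_bytes=0, "
    "priority=100,in_port=1 actions=resubmit(,8)\n"
    " cookie=0x0, duration=12.3s, table=8, n_packets=0, n_bytes=0, "
    "priority=0 actions=drop\n"
)
AFTER_RESTART = "NXST_FLOW reply (xid=0x4):\n"

if __name__ == "__main__":
    print("healthy bridge flows:", count_flows(HEALTHY))
    print("flows missing after restart:", flows_missing(AFTER_RESTART))
```

On a real node, the text would come from running the command in step 3 and capturing its standard output.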

Comment 1 Eran Kuris 2018-04-12 09:24:01 UTC
The bug was found on:
[root@controller-0 ~]# cat /etc/yum.repos.d/latest-installed 
13   -p 2018-04-03.3
[root@controller-0 ~]# rpm -qa |grep ovn 
openvswitch-ovn-central-2.9.0-15.el7fdp.x86_64
openvswitch-ovn-common-2.9.0-15.el7fdp.x86_64
python-networking-ovn-4.0.1-0.20180315174741.a57c70e.el7ost.noarch
openvswitch-ovn-host-2.9.0-15.el7fdp.x86_64
openstack-nova-novncproxy-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
novnc-0.6.1-1.el7ost.noarch
python-networking-ovn-metadata-agent-4.0.1-0.20180315174741.a57c70e.el7ost.noarch
puppet-ovn-12.3.1-0.20180221062110.4b16f7c.el7ost.noarch

Comment 2 Numan Siddique 2018-04-12 10:52:21 UTC
I think this needs to go to openvswitch component.

Comment 3 Numan Siddique 2018-04-12 13:05:48 UTC
I tested in a devstack environment, and after a restart of openvswitch, ovn-controller programs the flows back.

In OSP13, ovn-controller is containerized and openvswitch is a systemd service. The ovn-controller container is started as: docker start --net=host -v /run/openvswitch:/run/openvswitch ...

When the openvswitch service is stopped, it deletes the /run/openvswitch directory, so the runtime files of ovn-controller are deleted as well. When the openvswitch service restarts, the ovn-controller container is probably not able to access /run/openvswitch/db.sock.
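
The suspected mechanism can be illustrated without docker. This is only an analogy, sketched under assumptions: an open directory descriptor stands in for the bind mount taken at container start, a temporary directory stands in for /run/openvswitch, and the function name is hypothetical. It relies on Linux directory-descriptor semantics.

```python
import os
import tempfile


def stale_handle_demo() -> dict:
    """Show that a handle to a deleted directory does not see its recreation."""
    with tempfile.TemporaryDirectory() as base:
        run_dir = os.path.join(base, "openvswitch")
        os.mkdir(run_dir)

        # Stand-in for the bind mount: it pins the directory that exists now.
        held_fd = os.open(run_dir, os.O_RDONLY)
        try:
            # Stand-in for stopping the service (directory deleted) and
            # starting it again (directory and socket recreated).
            os.rmdir(run_dir)
            os.mkdir(run_dir)
            open(os.path.join(run_dir, "db.sock"), "w").close()

            # The held handle still refers to the old, deleted directory.
            same_directory = (
                os.fstat(held_fd).st_ino == os.stat(run_dir).st_ino
            )
            try:
                os.stat("db.sock", dir_fd=held_fd)
                socket_visible = True
            except FileNotFoundError:
                socket_visible = False
            return {
                "same_directory": same_directory,
                "socket_visible_via_old_handle": socket_visible,
            }
        finally:
            os.close(held_fd)


if __name__ == "__main__":
    print(stale_handle_demo())
```

If the container behaves the same way, it keeps looking at the old, empty directory and never finds the new db.sock, which would match the symptom reported here.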

I will investigate it further.

Comment 7 Miguel Angel Ajo 2018-04-24 17:15:43 UTC
Numan, you sent a patch for this, right?

Comment 8 Numan Siddique 2018-04-26 10:01:19 UTC
Miguel, yes. The fix right now is in tripleo: https://github.com/openstack/tripleo-heat-templates/commit/49963bc180139300afc1a968bfab3f2904f2c09d. It is merged in the master and queens branches.

But I think it should be addressed properly in openvswitch. I remember seeing a patch upstream to handle this. I think Aaron is aware of it: https://mail.openvswitch.org/pipermail/ovs-dev/2018-April/346080.html


@Aaron - Do you think it can be fixed in OVS? Basically, we don't want the /run/openvswitch folder to be deleted when the openvswitch service is stopped. In the OpenStack OVN tripleo deployment, ovn-controller is started as a docker service. Earlier, we were mounting the host's /run/openvswitch folder in the ovn-controller container. To fix this issue, we now mount /run. Please let me know if you want more information or context.
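
Why mounting /run helps can be sketched with the same kind of analogy. Here an open descriptor on the parent directory stands in for the /run mount, and a temporary directory stands in for /run itself. The function name is hypothetical, and the sketch assumes Linux directory-descriptor semantics rather than modelling docker.

```python
import os
import tempfile


def parent_handle_demo() -> bool:
    """Return True if a handle on the parent sees the recreated child."""
    with tempfile.TemporaryDirectory() as run_root:
        child = os.path.join(run_root, "openvswitch")
        os.mkdir(child)

        # Stand-in for mounting /run: the handle pins the parent, not the child.
        parent_fd = os.open(run_root, os.O_RDONLY)
        try:
            # Service stop deletes the child; service start recreates it.
            os.rmdir(child)
            os.mkdir(child)
            open(os.path.join(child, "db.sock"), "w").close()

            # The path is resolved through the parent each time, so the
            # new child directory and its socket file are found.
            try:
                os.stat(os.path.join("openvswitch", "db.sock"), dir_fd=parent_fd)
                return True
            except FileNotFoundError:
                return False
        finally:
            os.close(parent_fd)


if __name__ == "__main__":
    print("socket visible through parent handle:", parent_handle_demo())
```

The parent directory itself is never deleted, so a lookup through it always reaches the current child, whereas a handle taken on the child goes stale as soon as the child is removed.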

Comment 18 Aaron Conole 2018-11-05 15:57:37 UTC
Moving back to Assigned until completed.

Comment 19 Numan Siddique 2019-07-01 14:44:43 UTC
The proper solution to this issue is to have a separate runtime directory for OVN.
BZ https://bugzilla.redhat.com/show_bug.cgi?id=1684419 is seen for the same reason.
So I think we can close this BZ in favor of https://bugzilla.redhat.com/show_bug.cgi?id=1684419.

Comment 20 Numan Siddique 2019-07-01 14:46:26 UTC

*** This bug has been marked as a duplicate of bug 1684419 ***