Bug 1461709

Summary: Pod connectivity is broken after upgrade 3.5 to 3.6 with openshift-ovs-networkpolicy plugin
Product: OpenShift Container Platform
Reporter: Yan Du <yadu>
Component: Networking
Assignee: Dan Winship <danw>
Status: CLOSED WORKSFORME
QA Contact: Meng Bo <bmeng>
Severity: medium
Priority: medium
Version: 3.6.0
CC: aos-bugs, bbennett, yadu
Target Milestone: ---
Keywords: Reopened, UpcomingRelease
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2017-07-11 20:31:03 UTC
Type: Bug
Attachments:
Description Flags
openflow none

Description Yan Du 2017-06-15 09:06:42 UTC
Description of problem:
Pod connectivity is broken after upgrade 3.5 to 3.6 with openshift-ovs-networkpolicy plugin


Version-Release number of selected component (if applicable):
upgrade from
openshift v3.5.5.26 
kubernetes v1.5.2+43a9be4 

to 
openshift v3.6.106
kubernetes v1.6.1+5115d708d7


How reproducible:
Always

Steps to Reproduce:
1. Setup an OCP 3.5 env with openshift-ovs-networkpolicy plugin
2. Create some pods in a project (the pods can connect to each other)
3. Upgrade the env to OCP 3.6
4. Check the pod connectivity
5. Create some new pods in a new project
6. Check the pod connectivity again


Actual results:
Pods cannot connect to each other:

# oc get pod -o wide
NAME            READY     STATUS    RESTARTS   AGE       IP            NODE
test-rc-b3p0s   1/1       Running   0          56m       10.128.0.25   ip-172-18-13-27.ec2.internal
test-rc-mmrzs   1/1       Running   0          56m       10.128.0.24   ip-172-18-13-27.ec2.internal
# oc rsh test-rc-b3p0s
/ $ curl --connect-timeout 5 10.128.0.24:8080
curl: (28) Connection timed out after 5001 milliseconds
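
The manual check above can be repeated for every pair of pods in the current project with a small script. This is only a sketch: pod_ips and check_pairs are hypothetical helper names, the column positions are assumed from the `oc get pod -o wide` output shown here, and port 8080 and the 5 second timeout are copied from the curl command above. It uses `oc exec` instead of `oc rsh` so that it can run non-interactively, and it assumes curl is available in the pod image.

```shell
#!/bin/sh
# pod_ips: read `oc get pod -o wide` output on stdin and print "NAME IP"
# for each pod. Assumes the column layout shown in this report:
# NAME READY STATUS RESTARTS AGE IP NODE
pod_ips() {
    awk 'NR > 1 && NF >= 6 { print $1, $6 }'
}

# check_pairs: hypothetical helper that curls every other pod from each pod.
# Port 8080 and the 5 second timeout come from the manual check in the report.
check_pairs() {
    pods=$(oc get pod -o wide | pod_ips)
    echo "$pods" | while read -r src rest; do
        echo "$pods" | while read -r dst ip; do
            [ "$src" = "$dst" ] && continue
            # </dev/null keeps oc from consuming the loop's input
            if oc exec "$src" -- curl -s -o /dev/null \
                    --connect-timeout 5 "$ip:8080" </dev/null; then
                echo "OK   $src -> $dst ($ip)"
            else
                echo "FAIL $src -> $dst ($ip)"
            fi
        done
    done
}
```

Calling check_pairs after logging in with oc and selecting the project prints one OK or FAIL line per ordered pod pair.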


Expected results:
Pods should be able to connect to each other, since the network between projects is flat when no specific rules are defined.


Additional info:
The openflow log is attached.

Comment 1 Yan Du 2017-06-15 09:08:00 UTC
Created attachment 1287964 [details]
openflow

Comment 2 Dan Winship 2017-06-16 20:25:19 UTC
openvswitch gets restarted as part of the upgrade, right? If so, this is a dup of bug 1453113

Comment 3 Ben Bennett 2017-06-19 15:34:23 UTC

*** This bug has been marked as a duplicate of bug 1453113 ***

Comment 4 Yan Du 2017-06-26 09:17:49 UTC
We just finished a round of upgrade testing today with the multi-tenant plugin and did not hit this issue. I think it may be related to the networkpolicy plugin and may not have the same root cause as bug 1453113, so I am moving it back to ASSIGNED in case it gets missed.

Comment 5 Dan Winship 2017-06-26 14:04:06 UTC
After doing an upgrade, when connectivity is broken, can you do:

systemctl status openvswitch
systemctl status atomic-openshift-node

ovs-vsctl show
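
To capture all three outputs in one file that can be attached to the bug, something like the following could be run on the affected node. It is a sketch only: collect_sdn_state and the default file name sdn-state.txt are made up for illustration, and the commands themselves are exactly the ones listed above.

```shell
#!/bin/sh
# collect_sdn_state: hypothetical helper that runs the three commands
# requested above and writes their output and exit status to one file.
# The default file name is an arbitrary choice for this sketch.
collect_sdn_state() {
    outfile=${1:-sdn-state.txt}
    : > "$outfile"    # start with an empty file
    for cmd in "systemctl status openvswitch" \
               "systemctl status atomic-openshift-node" \
               "ovs-vsctl show"; do
        {
            echo "### $cmd"
            $cmd 2>&1    # unquoted on purpose so the string splits into words
            echo "### exit status: $?"
            echo
        } >> "$outfile"
    done
}
```

Running `collect_sdn_state /tmp/sdn-state.txt` as root on the node writes the combined output to that path. A non-zero exit status from `systemctl status` is recorded rather than treated as a failure, since a stopped service is itself useful information here.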

Comment 6 Yan Du 2017-06-29 07:59:55 UTC
Hi, Dan
I tried upgrading OCP v3.5.5.28 to v3.6.126.1 with the networkpolicy plugin, and the issue can no longer be reproduced. I will let you know if I find more clues. Thanks.