Bug 1461709

Summary:

Pod connectivity is broken after upgrading from 3.5 to 3.6 with the openshift-ovs-networkpolicy plugin

Product:

OpenShift Container Platform

Reporter:

Yan Du <yadu>

Component:

Networking

Assignee:

Dan Winship <danw>

Status:

CLOSED WORKSFORME

QA Contact:

Meng Bo <bmeng>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

3.6.0

CC:

aos-bugs, bbennett, yadu

Target Milestone:

---

Keywords:

Reopened, UpcomingRelease

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2017-07-11 20:31:03 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
openflow	none

Description Yan Du 2017-06-15 09:06:42 UTC

Description of problem:
Pod connectivity is broken after upgrading from 3.5 to 3.6 with the openshift-ovs-networkpolicy plugin.


Version-Release number of selected component (if applicable):
upgrade from
openshift v3.5.5.26 
kubernetes v1.5.2+43a9be4 

to 
openshift v3.6.106
kubernetes v1.6.1+5115d708d7


How reproducible:
Always

Steps to Reproduce:
1. Set up an OCP 3.5 environment with the openshift-ovs-networkpolicy plugin
2. Create some pods in a project (at this point the pods can connect to each other)
3. Upgrade the environment to OCP 3.6
4. Check the pod connectivity
5. Create some new pods in a new project
6. Check the pod connectivity again


Actual results:
Pods cannot connect to each other.

# oc get pod -o wide
NAME            READY     STATUS    RESTARTS   AGE       IP            NODE
test-rc-b3p0s   1/1       Running   0          56m       10.128.0.25   ip-172-18-13-27.ec2.internal
test-rc-mmrzs   1/1       Running   0          56m       10.128.0.24   ip-172-18-13-27.ec2.internal
# oc rsh test-rc-b3p0s
/ $ curl --connect-timeout 5 10.128.0.24:8080
curl: (28) Connection timed out after 5001 milliseconds
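
The manual check above can be repeated for every pod in a project with a small helper. This is only a sketch under assumptions that are not part of the report: the function name check_pod_connectivity is made up for illustration, `oc` is already logged in, the pod image ships curl, and the pods listen on port 8080 as in the transcript. It uses `oc exec` rather than `oc rsh` so that it can run non-interactively.

```shell
# check_pod_connectivity PROJECT [PORT]   (hypothetical helper, not from the report)
# Runs curl from every pod in PROJECT to every pod IP in PROJECT (including its own),
# prints one OK/FAIL line per pair, and returns non-zero if any pair failed.
check_pod_connectivity() {
    project=$1
    port=${2:-8080}   # assumption: port 8080, as in the transcript
    failed=0
    # One "name ip" line per pod.
    pods=$(oc get pods -n "$project" \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.podIP}{"\n"}{end}')
    for src in $(printf '%s\n' "$pods" | awk '{print $1}'); do
        for ip in $(printf '%s\n' "$pods" | awk '{print $2}'); do
            if oc exec -n "$project" "$src" -- \
                curl -s -o /dev/null --connect-timeout 5 "$ip:$port"; then
                echo "OK   $src -> $ip:$port"
            else
                echo "FAIL $src -> $ip:$port"
                failed=1
            fi
        done
    done
    return "$failed"
}
```

Running `check_pod_connectivity <project>` before and after the upgrade would show which pod pairs stop working.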


Expected results:
Pods should be able to connect to each other successfully, since the network between projects is flat when no specific rules are defined.
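
The expectation above rests on no rules being defined, and that premise can be checked directly. A minimal sketch, assuming `oc` is logged in with permission to list NetworkPolicy objects; the helper name count_network_policies is hypothetical.

```shell
# count_network_policies PROJECT   (hypothetical helper, not from the report)
# Prints the number of NetworkPolicy objects defined in PROJECT.
count_network_policies() {
    # "-o name" prints one "<resource>/<name>" line per object; count the lines.
    oc get networkpolicy -n "$1" -o name | wc -l | tr -d ' '
}
```

A result of 0 for the projects involved would confirm that no explicit policy objects exist there.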


Additional info:
The OpenFlow log is attached.
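
The report does not say how the attached log was produced. One common way to capture the flow table on a node is sketched below. It assumes the SDN bridge is named br0 and speaks OpenFlow 1.3, that the command is run as root on the node, and the helper name dump_sdn_flows is hypothetical.

```shell
# dump_sdn_flows OUTFILE [BRIDGE]   (hypothetical helper, not from the report)
# Writes the bridge's current OpenFlow rules to OUTFILE.
dump_sdn_flows() {
    outfile=$1
    bridge=${2:-br0}   # assumption: default SDN bridge name
    ovs-ofctl -O OpenFlow13 dump-flows "$bridge" > "$outfile"
}
```

Capturing one dump before the upgrade and one after would allow the two flow tables to be compared with diff.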

Comment 1 Yan Du 2017-06-15 09:08:00 UTC

Created attachment 1287964 [details]
openflow

Comment 2 Dan Winship 2017-06-16 20:25:19 UTC

openvswitch gets restarted as part of the upgrade, right? If so, this is a dup of bug 1453113
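
One way to answer the question above on an upgraded node is to compare when each service last became active. This is a sketch, not a confirmed diagnostic for bug 1453113: it assumes both services run as systemd units under the names used elsewhere in this report and were started during the current boot. The helper names are hypothetical.

```shell
# unit_start_usec UNIT   (hypothetical helper)
# Prints the monotonic time, in microseconds since boot, at which UNIT last became active.
unit_start_usec() {
    systemctl show -p ActiveEnterTimestampMonotonic "$1" | cut -d= -f2
}

# ovs_started_after_node   (hypothetical helper)
# Succeeds if openvswitch became active more recently than atomic-openshift-node.
ovs_started_after_node() {
    ovs=$(unit_start_usec openvswitch)
    node=$(unit_start_usec atomic-openshift-node)
    [ "$ovs" -gt "$node" ]
}
```

For example: `ovs_started_after_node && echo "openvswitch was (re)started after the node service"`.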

Comment 3 Ben Bennett 2017-06-19 15:34:23 UTC


*** This bug has been marked as a duplicate of bug 1453113 ***

Comment 4 Yan Du 2017-06-26 09:17:49 UTC

We just finished a round of upgrade testing today with the multi-tenant plugin, and we did not hit this issue. I think it may be related to the networkpolicy plugin and may not have the same root cause as bug 1453113, so I am moving it back to ASSIGNED in case it gets missed.

Comment 5 Dan Winship 2017-06-26 14:04:06 UTC

After doing an upgrade, when connectivity is broken, can you do:

systemctl status openvswitch
systemctl status atomic-openshift-node

ovs-vsctl show
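
The three commands above can be captured in one pass so that the output is easy to attach to the bug. A minimal sketch, assuming it is run as root on the affected node; the helper name collect_sdn_diagnostics and the output file names are made up for illustration.

```shell
# collect_sdn_diagnostics OUTDIR   (hypothetical helper, not from the report)
# Saves the output of the requested commands into OUTDIR, one file per command.
collect_sdn_diagnostics() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    # "systemctl status" exits non-zero for inactive or failed units;
    # "|| true" keeps going so the output is still collected.
    systemctl status openvswitch           > "$outdir/openvswitch.status" 2>&1 || true
    systemctl status atomic-openshift-node > "$outdir/atomic-openshift-node.status" 2>&1 || true
    ovs-vsctl show                         > "$outdir/ovs-vsctl-show.txt" 2>&1 || true
    echo "diagnostics written to $outdir"
}
```

For example: `collect_sdn_diagnostics /tmp/sdn-diag`.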

Comment 6 Yan Du 2017-06-29 07:59:55 UTC

Hi Dan,
I tried upgrading OCP v3.5.5.28 to v3.6.126.1 with the networkpolicy plugin, and the issue can no longer be reproduced. I will let you know if I find more clues. Thanks.