1461709 – Pod connectivity is broken after upgrade 3.5 to 3.6 with openshift-ovs-networkpolicy plugin

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	3.6.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Dan Winship
QA Contact:	Meng Bo
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-06-15 09:06 UTC by Yan Du
Modified:	2017-07-11 20:31 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-07-11 20:31:03 UTC
Target Upstream Version:
Embargoed:

Attachments
openflow (18.29 KB, text/plain) 2017-06-15 09:08 UTC, Yan Du

Description Yan Du 2017-06-15 09:06:42 UTC

Description of problem:
Pod connectivity is broken after upgrade 3.5 to 3.6 with openshift-ovs-networkpolicy plugin


Version-Release number of selected component (if applicable):
upgrade from
openshift v3.5.5.26 
kubernetes v1.5.2+43a9be4 

to 
openshift v3.6.106
kubernetes v1.6.1+5115d708d7


How reproducible:
Always

Steps to Reproduce:
1. Set up an OCP 3.5 environment with the openshift-ovs-networkpolicy plugin
2. Create some pods in a project (the pods can connect to each other)
3. Upgrade the env to OCP 3.6
4. Check the pod connectivity
5. Create some new pods in a new project
6. Check the pod connectivity again


Actual results:
Pods cannot connect to each other:

# oc get pod -o wide
NAME            READY     STATUS    RESTARTS   AGE       IP            NODE
test-rc-b3p0s   1/1       Running   0          56m       10.128.0.25   ip-172-18-13-27.ec2.internal
test-rc-mmrzs   1/1       Running   0          56m       10.128.0.24   ip-172-18-13-27.ec2.internal
# oc rsh test-rc-b3p0s
/ $ curl --connect-timeout 5 10.128.0.24:8080
curl: (28) Connection timed out after 5001 milliseconds
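
The manual check above could be scripted roughly as follows, to repeat it before and after the upgrade. This is only a sketch under assumptions: `check_pod_connectivity` is a hypothetical helper name, the target port defaults to 8080 as in the transcript, the source pod image is assumed to ship `curl`, and the pod IP is assumed to be the sixth column of `oc get pod -o wide` as shown above.

```shell
# Hypothetical helper: curl every other pod in the current project from one
# source pod and report which targets are reachable.
check_pod_connectivity() {
    src_pod=$1
    port=${2:-8080}   # assumption: pods listen on 8080, as in the transcript
    failed=0

    # Skip the header row; keep pod name (column 1) and pod IP (column 6).
    pods=$(oc get pod -o wide | awk 'NR > 1 { print $1 " " $6 }')

    while read -r name ip; do
        [ -z "$name" ] && continue
        [ "$name" = "$src_pod" ] && continue
        # stdin is redirected so the exec cannot consume the pod list.
        if oc exec "$src_pod" -- curl -s -o /dev/null --connect-timeout 5 "$ip:$port" </dev/null; then
            echo "OK   $src_pod -> $name ($ip)"
        else
            echo "FAIL $src_pod -> $name ($ip)"
            failed=1
        fi
    done <<EOF
$pods
EOF

    # Non-zero when at least one target was unreachable.
    return "$failed"
}
```

For example, `check_pod_connectivity test-rc-b3p0s` would print one OK or FAIL line per other pod in the current project.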


Expected results:
Pods should be able to connect to each other, since the network between projects is flat when no specific rules are defined


Additional info:
The openflow log is attached.

Comment 1 Yan Du 2017-06-15 09:08:00 UTC

Created attachment 1287964
openflow

Comment 2 Dan Winship 2017-06-16 20:25:19 UTC

openvswitch gets restarted as part of the upgrade, right? If so, this is a dup of bug 1453113

Comment 3 Ben Bennett 2017-06-19 15:34:23 UTC


*** This bug has been marked as a duplicate of bug 1453113 ***

Comment 4 Yan Du 2017-06-26 09:17:49 UTC

We just finished a round of upgrade testing today with the multi-tenant plugin and did not hit this issue. I think it may be related to the networkpolicy plugin and may not have the same root cause as bug 1453113, so I am moving it back to ASSIGNED in case it gets missed.

Comment 5 Dan Winship 2017-06-26 14:04:06 UTC

After doing an upgrade, when connectivity is broken, can you do:

systemctl status openvswitch
systemctl status atomic-openshift-node

ovs-vsctl show
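
(Editorial sketch, not part of the original comment.) The three commands above could be captured into a single file to attach to the bug. `collect_sdn_diagnostics` and the default file name `sdn-diagnostics.txt` are hypothetical; only the commands requested above are run, and the function is assumed to be invoked as root on the affected node.

```shell
# Hypothetical helper: run the requested commands on the node and save their
# combined output, each under a "### <command>" header.
collect_sdn_diagnostics() {
    out_file=${1:-sdn-diagnostics.txt}   # assumed default output name
    : > "$out_file"                      # start from an empty file

    for cmd in "systemctl status openvswitch" \
               "systemctl status atomic-openshift-node" \
               "ovs-vsctl show"; do
        {
            echo "### $cmd"
            # $cmd is deliberately unquoted so it splits into command + arguments.
            # A failing command is recorded instead of aborting the collection.
            $cmd 2>&1 || echo "(exit status $?)"
            echo
        } >> "$out_file"
    done
}
```

Running `collect_sdn_diagnostics /tmp/node-diag.txt` right after the upgrade, while connectivity is still broken, would produce one file with all three outputs.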

Comment 6 Yan Du 2017-06-29 07:59:55 UTC

Hi, Dan
I tried upgrading OCP v3.5.5.28 to v3.6.126.1 with the networkpolicy plugin, and the issue can no longer be reproduced. I will let you know if I find more clues. Thanks.
