Bug 1320430

Summary: Existing pods lose network connection after merge network
Product: OpenShift Container Platform Reporter: Meng Bo <bmeng>
Component: Networking Assignee: Ravi Sankar <rpenta>
Status: CLOSED ERRATA QA Contact: Meng Bo <bmeng>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.2.0 CC: aos-bugs, bbennett, tdawson
Target Milestone: --- Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-05-12 16:33:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Meng Bo 2016-03-23 08:20:12 UTC
Description of problem:
Create some pods in different projects, then merge the networks of the projects by running either
oadm pod-network join-projects --to pro1 pro2
or
oadm pod-network make-projects-global pro2

The pods in project pro2 will lose their network connection.


Version-Release number of selected component (if applicable):
openshift v3.2.0.6
kubernetes v1.2.0-36-g4a3f9c5


How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment with the multi-tenant network
2. Create some pods in different projects
3. Try to merge the network of the projects
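
The merge in step 3 can be sketched as a small shell helper. This is only an illustration: the names run, merge_network, and DRY_RUN are hypothetical and not part of any OpenShift tooling. The oadm commands and the project names pro1 and pro2 are taken from the description above.

```shell
# Hypothetical helper: with DRY_RUN=1 the command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# merge_network MODE TARGET PROJECT
#   MODE=join   -> join PROJECT's network to TARGET
#   MODE=global -> make PROJECT global (TARGET is ignored)
merge_network() {
    case "$1" in
        join)   run oadm pod-network join-projects --to "$2" "$3" ;;
        global) run oadm pod-network make-projects-global "$3" ;;
        *)      echo "unknown mode: $1" >&2; return 2 ;;
    esac
}
```

For example, running merge_network join pro1 pro2 with DRY_RUN=1 prints the join-projects command from the description without touching the cluster.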

Actual results:
The pods in the projects that were modified lose their network connection.

Expected results:
Existing pods should keep working correctly after the project network is modified.


Additional info:
Checking the OpenFlow rules on the node shows that the pods' rules were removed and were not re-created.

The rules for the pod with IP 10.129.0.4 in project3 disappeared after project3 was merged into project1.
The rules for the pod with IP 10.129.0.5 in project4 disappeared after project4 was made global.

The new rules should have been regenerated after the project network was modified.

# ovs-ofctl dump-flows br0 -O openflow13 | grep "table=6\|table=7"
 cookie=0x0, duration=1560.735s, table=6, n_packets=2, n_bytes=84, priority=100,arp,arp_tpa=10.129.0.2 actions=output:41
 cookie=0x0, duration=1560.523s, table=6, n_packets=3, n_bytes=126, priority=100,arp,arp_tpa=10.129.0.3 actions=output:42
 cookie=0x0, duration=77043.018s, table=6, n_packets=84, n_bytes=3528, priority=0 actions=output:3
 cookie=0x0, duration=1560.732s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0,nw_dst=10.129.0.2 actions=output:41
 cookie=0x0, duration=1560.730s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0xd,nw_dst=10.129.0.2 actions=output:41
 cookie=0x0, duration=1560.521s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0,nw_dst=10.129.0.3 actions=output:42
 cookie=0x0, duration=1560.519s, table=7, n_packets=8, n_bytes=679, priority=100,ip,reg0=0xc,nw_dst=10.129.0.3 actions=output:42
 cookie=0x0, duration=77043.016s, table=7, n_packets=86, n_bytes=6412, priority=0 actions=output:3
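
A check like the one above can be scripted per pod IP. The sketch below is a rough illustration, not part of openshift-sdn: the function name pod_has_flows is hypothetical, and it assumes that a pod with intact rules has at least one ARP rule in table 6 and one IP rule in table 7, as the surviving pods in the dump do.

```shell
# pod_has_flows IP
# Reads `ovs-ofctl dump-flows` output on stdin. Succeeds only if IP has
# an ARP rule in table 6 and an IP rule in table 7 (assumed rule layout).
pod_has_flows() {
    awk -v ip="$1" '
        # The trailing space makes the match exact (10.129.0.2 != 10.129.0.20).
        index($0, "table=6,") && index($0, "arp_tpa=" ip " ") { arp = 1 }
        index($0, "table=7,") && index($0, "nw_dst=" ip " ")  { dst = 1 }
        END { exit !(arp && dst) }
    '
}
```

On a node this could be used as: ovs-ofctl dump-flows br0 -O openflow13 | pod_has_flows 10.129.0.4 || echo "rules missing". Against the dump above, the check would pass for 10.129.0.2 and 10.129.0.3 and fail for 10.129.0.4 and 10.129.0.5.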

Comment 1 Ravi Sankar 2016-03-24 02:15:07 UTC
Fixed in https://github.com/openshift/openshift-sdn/pull/272

Comment 2 Troy Dawson 2016-03-30 18:58:02 UTC
Should be in atomic-openshift-3.2.0.9-1.git.0.b99af7d.el7, which is now built and in images.

Comment 3 Meng Bo 2016-03-31 09:33:55 UTC
The fix was not included in the latest 3.2.0.9 rpm build.

Assign it back.

Comment 4 Troy Dawson 2016-04-04 16:14:19 UTC
Should be in atomic-openshift-3.2.0.11-1.git.0.6696e29.el7, which is now built and in images.

Comment 5 Ravi Sankar 2016-04-04 18:13:38 UTC
The fix is merged in the openshift-sdn repo but has not yet been synced to the origin repo (https://github.com/openshift/origin/pull/8333)

Comment 6 Troy Dawson 2016-04-04 18:53:57 UTC
I'm moving this back to Assigned.
Once it is in the origin repo, move it to Modified.

Comment 7 Ravi Sankar 2016-04-13 18:16:56 UTC
Merged into origin: https://github.com/openshift/origin/pull/8468

Comment 8 Meng Bo 2016-04-14 03:08:03 UTC
Moving it to MODIFIED; waiting for it to be available in the latest OSE build.

Comment 9 Troy Dawson 2016-04-15 16:37:51 UTC
This should be in atomic-openshift-3.2.0.16-1.git.0.738b760.el7 which has been built and readied for qe.

Comment 10 Meng Bo 2016-04-18 06:59:13 UTC
Checked on AOS build 3.2.0.16; the issue has been fixed.

Closing the bug.

Comment 12 errata-xmlrpc 2016-05-12 16:33:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064