Bug 1389213 - Cannot merge network / make project global via the oadm pod-network tool
Summary: Cannot merge network / make project global via the oadm pod-network tool
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Ravi Sankar
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-27 08:13 UTC by Meng Bo
Modified: 2017-03-08 18:43 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Wrong field passed to UpdatePod.
Consequence: The network namespace was not correctly merged because the string passed was invalid.
Fix: Pass the correct field.
Result: The network namespaces are correctly merged.
Clone Of:
Environment:
Last Closed: 2017-01-18 12:47:02 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Full dump of openflow (12.20 KB, text/plain)
2016-10-27 08:18 UTC, Meng Bo


Links
System ID Private Priority Status Summary Last Updated
Origin (Github) 11679 0 None None None 2016-11-03 14:46:03 UTC
Red Hat Product Errata RHBA-2017:0066 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.4 RPM Release Advisory 2017-01-18 17:23:26 UTC

Description Meng Bo 2016-10-27 08:13:53 UTC
Description of problem:
When merging the network between different projects, the netnamespace is updated in the registry, but the OpenFlow rules on the node are not changed.

Version-Release number of selected component (if applicable):
openshift v3.4.0.16+cc70b72
kubernetes v1.4.0+776c994
ovs-vsctl (Open vSwitch) 2.4.0


How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment with the multitenant network plugin
2. Create some projects
3. Create pods in each project
4. Merge the project networks via the oadm pod-network tool as cluster admin (see the sketch after these steps)
5. Try to access the pod in the project whose network has been merged
6. Check the netnamespaces as cluster admin
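
For steps 4 and 6, a minimal sketch of the commands run as cluster admin (the project names bmengp1/bmengp2 are taken from the additional info below; check oadm pod-network -h for the exact syntax on your build):

# oadm pod-network join-projects --to=bmengp1 bmengp2
# oadm pod-network make-projects-global bmengp1
# oc get netnamespace

join-projects merges bmengp2 into bmengp1's pod network, make-projects-global is the alternative path from the summary, and the final command should show both projects with the same NETID.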

Actual results:
Cannot access the pod in the other project whose network has been merged into the current project.

Expected results:
Should be able to merge project networks or make a project's network global.


Additional info:
# oc get netnamespace
NAME               NETID
bmengp1            13732668
bmengp2            13732668
default            0
kube-system        8358622
logging            1324284
management-infra   13148104
openshift          6940502
openshift-infra    11870335

# ovs-ofctl dump-flows br0 -O openflow13 'table=2'
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=1399.932s, table=2, n_packets=1, n_bytes=42, priority=100,arp,in_port=101,arp_spa=10.1.0.100,arp_sha=36:26:1d:2f:52:98 actions=load:0->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=1399.501s, table=2, n_packets=1, n_bytes=42, priority=100,arp,in_port=102,arp_spa=10.1.0.101,arp_sha=aa:1b:ba:12:18:b0 actions=load:0->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=355.139s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=112,arp_spa=10.1.0.111,arp_sha=82:c3:3e:17:fe:bb actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=354.534s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=113,arp_spa=10.1.0.112,arp_sha=6e:be:33:88:57:89 actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=349.982s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=114,arp_spa=10.1.0.113,arp_sha=ca:22:7b:cd:04:24 actions=load:0xc56116->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=349.200s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=115,arp_spa=10.1.0.114,arp_sha=b2:f6:a9:45:f9:88 actions=load:0xc56116->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=247.751s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=116,arp_spa=10.1.0.115,arp_sha=3e:67:6c:0a:3f:c7 actions=load:0xc56116->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=247.237s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=117,arp_spa=10.1.0.116,arp_sha=c6:27:da:54:ab:b4 actions=load:0xc56116->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=224.263s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=118,arp_spa=10.1.0.117,arp_sha=26:c0:69:54:ee:fa actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=223.640s, table=2, n_packets=0, n_bytes=0, priority=100,arp,in_port=119,arp_spa=10.1.0.118,arp_sha=02:4d:8e:19:c7:0b actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=1399.930s, table=2, n_packets=1969, n_bytes=177506, priority=100,ip,in_port=101,nw_src=10.1.0.100 actions=load:0->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=1399.495s, table=2, n_packets=14375, n_bytes=5402507, priority=100,ip,in_port=102,nw_src=10.1.0.101 actions=load:0->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=355.137s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=112,nw_src=10.1.0.111 actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=354.524s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=113,nw_src=10.1.0.112 actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=349.968s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=114,nw_src=10.1.0.113 actions=load:0xc56116->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=349.171s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=115,nw_src=10.1.0.114 actions=load:0xc56116->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=247.748s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=116,nw_src=10.1.0.115 actions=load:0xc56116->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=247.235s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=117,nw_src=10.1.0.116 actions=load:0xc56116->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=224.231s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=118,nw_src=10.1.0.117 actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=223.638s, table=2, n_packets=0, n_bytes=0, priority=100,ip,in_port=119,nw_src=10.1.0.118 actions=load:0xd18b3c->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=1424.013s, table=2, n_packets=72, n_bytes=23106, priority=0 actions=drop
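
A quick way to relate the reg0 values above to the NETIDs is a hex-to-decimal conversion (a minimal check in any shell):

# printf '%d\n' 0xd18b3c
13732668
# printf '%d\n' 0xc56116
12935446

0xd18b3c is the merged NETID shared by bmengp1/bmengp2, while 0xc56116 matches none of the netnamespaces listed above, suggesting those ports still carry a stale pre-merge VNID.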

Comment 1 Meng Bo 2016-10-27 08:18:12 UTC
Created attachment 1214504 [details]
Full dump of openflow

Comment 2 Ravi Sankar 2016-10-31 23:25:49 UTC
Fixed in https://github.com/openshift/origin/pull/11679

Comment 3 Ben Bennett 2016-11-01 13:03:59 UTC
Can't move to Modified until it has merged.

Comment 4 openshift-github-bot 2016-11-02 22:05:49 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/0f6ac87eab1cc4d8c004c54cb8d581820ea122c8
Bug 1389213 - Fix join/isolate project network

Pass kubeletTypes.ContainerID.ID instead of kubeletTypes.ContainerID.String() to UpdatePod();
otherwise the docker client fails with the error: no such container '://<id>'

Comment 6 zhaozhanqi 2016-11-04 07:02:51 UTC
Tested this issue on:
# oc get netnamespaces
NAME                           NETID
default                        0
kube-system                    6030722
network-diag-global-ns-546o4   0
network-diag-global-ns-uqusd   0
network-diag-ns-3c96g          2817809
network-diag-ns-fybiu          15657557
openshift                      14746615
openshift-infra                12161229
z2                             6894009
zzhao                          6894009
[root@minion1 subdomain]# oc get pod -n zzhao -o json | grep -i ip
                "hostIP": "10.66.140.17",
                "podIP": "10.128.0.30",
[root@minion1 subdomain]# oc get pod -n z2 -o json | grep -i ip
                "hostIP": "10.66.140.17",
                "podIP": "10.128.0.29",
[root@minion1 subdomain]# oc rsh caddy-docker
/srv # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if672: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP 
    link/ether 3e:7b:e7:d6:b5:8f brd ff:ff:ff:ff:ff:ff
    inet 10.128.0.30/23 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3c7b:e7ff:fed6:b58f/64 scope link 
       valid_lft forever preferred_lft forever
/srv # ping 10.128.0.29
PING 10.128.0.29 (10.128.0.29): 56 data bytes
^C
--- 10.128.0.29 ping statistics ---
11 packets transmitted, 0 packets received, 100% packet loss
/srv # 



Checking the openflow:

cookie=0x0, duration=783.261s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0,nw_dst=10.128.0.29 actions=output:30
 cookie=0x0, duration=783.258s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0xd48d17,nw_dst=10.128.0.29 actions=output:30
 cookie=0x0, duration=770.072s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0,nw_dst=10.128.0.30 actions=output:31
 cookie=0x0, duration=770.069s, table=7, n_packets=0, n_bytes=0, priority=100,ip,reg0=0x6931b9,nw_dst=10.128.0.30 actions=output:31
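
Applying the same hex-to-decimal check to these reg0 values (a minimal sketch):

# printf '%d\n' 0x6931b9
6894009
# printf '%d\n' 0xd48d17
13929751

0x6931b9 matches the NETID shared by z2/zzhao, but 0xd48d17 matches none of the listed netnamespaces, which would explain why the ping to 10.128.0.29 still fails.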

Comment 7 zhaozhanqi 2016-11-04 07:04:14 UTC
sorry, forgot to paste the openshift version:

# openshift version
openshift v3.4.0.21+ca4702d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Comment 8 Ravi Sankar 2016-11-07 20:56:22 UTC
@zhaozhanqi 

I did the same experiment but was unable to reproduce the issue in my local environment.
- Created 2 projects, 1 caddy-docker pod in each project, and then tested the join/isolate network functionality. This worked as expected.

Not sure what triggered this issue. Do you have reproduction steps?

Comment 9 Meng Bo 2016-11-10 02:01:59 UTC
Sorry, I did not see the needinfo earlier; the manage-network feature works well on the latest build, v3.4.0.23.

Comment 10 Meng Bo 2016-11-10 02:02:24 UTC
Changing the bug status to VERIFIED.

Comment 12 errata-xmlrpc 2017-01-18 12:47:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066

