Bug 1252386

Summary: [isolation] Cannot reach the kubernetes network from a container when using multi-tenant configuration
Product: OKD
Component: Networking
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Reporter: Meng Bo <bmeng>
Assignee: Ravi Sankar <rpenta>
QA Contact: Meng Bo <bmeng>
CC: dcbw, libra-bugs, rpenta
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Type: Bug
Doc Type: Bug Fix
Last Closed: 2015-09-08 20:14:02 UTC

Description Meng Bo 2015-08-11 09:49:27 UTC
Description of problem:
Set up Origin with the redhat/openshift-ovs-multitenant network plugin, create a pod, and try to reach the kubernetes network (e.g. the master IP) from inside the container.
There is no route to the master/node IPs from the container.


This causes the creation of the router/registry to fail, because the deployer pod cannot talk to the master via the REST API:
[root@master ~]# oc logs router-1-deploy
E0811 09:33:55.395475       1 clientcmd.go:128] Error reading BEARER_TOKEN_FILE "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
E0811 09:33:55.447713       1 clientcmd.go:146] Error reading BEARER_TOKEN_FILE "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
F0811 09:33:58.455219       1 deployer.go:64] couldn't get deployment default/router-1: Get https://10.66.128.57:8443/api/v1/namespaces/default/replicationcontrollers/router-1: dial tcp 10.66.128.57:8443: no route to host
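
For reference, the missing route can be confirmed from any running pod on the affected node. A minimal sketch, reusing the pod name from the "Actual results" section and the master IP above (the container image may not ship every tool, so adjust as needed):

$ oc exec rc-test-9g0r2 -- ping -c 2 10.66.128.57   # fails with "Destination Host Unreachable"
$ oc exec rc-test-9g0r2 -- ip route                 # inspect the pod's routing table, if iproute2 is present in the image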



Version-Release number of selected component (if applicable):
openshift v1.0.4-180-g2f643e3
kubernetes v1.1.0-alpha.0-1155-gb73c53c

How reproducible:
always

Steps to Reproduce:
1. Build openshift-sdn from the multitenant branch:
$ git clone https://github.com/openshift/openshift-sdn -b multitenant
$ cd openshift-sdn
$ make clean build && make install
2. Build the openshift binary from the latest code, for both master and node
3. Configure the master and node to use the redhat/openshift-ovs-multitenant plugin in master-config.conf and node-config.conf (see the sketch after this list)
4. Restart the docker/openshift services
5. Create a pod in the env
6. Get into the container and try to ping the master IP.
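
A minimal sketch of what steps 3-6 look like on the command line. The config file paths, the networkPluginName key, and the service names are assumptions based on a typical Origin 3.x setup; substitute your own paths, pod name, and master IP:

$ grep -i networkPluginName /path/to/master-config /path/to/node-config
  # expect "networkPluginName: redhat/openshift-ovs-multitenant" in both files
$ sudo systemctl restart docker openshift-master openshift-node   # service names vary by install
$ oc create -f pod.json                                           # any simple pod definition
$ oc exec -it <pod-name> -- ping <master-ip>                      # step 6: ping the master from inside the container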

Actual results:
[root@rc-test-9g0r2 /]# ping 10.66.128.57
PING 10.66.128.57 (10.66.128.57) 56(84) bytes of data.
From 10.1.2.2 icmp_seq=1 Destination Host Unreachable
From 10.1.2.2 icmp_seq=2 Destination Host Unreachable


Expected results:
Should be able to reach the master from the container.

Additional info:

Comment 1 Ravi Sankar 2015-08-18 01:48:16 UTC
Instead of the openshift-sdn multitenant branch, I tried this on the latest openshift/origin repo. The openshift-sdn multitenant branch is obsolete now; all the multitenant-related changes have been merged into the origin repo.

I did *not* see any issues in my testing (see the verification sketch after this list):
- Pods/containers that are not part of the default namespace:
  * Unable to talk to pods in other non-default namespaces, as expected.
  * Able to talk to pods in the default namespace.
  * Able to reach the master.
- Pods that are part of the default namespace are able to reach any pod in the cluster.
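
A minimal sketch of how the behavior above can be spot-checked from the CLI. The project and pod names are hypothetical, and the peer pod IPs have to be looked up first (e.g. via oc describe pod):

$ oc new-project proj-a && oc create -f pod.json           # pod-a in a non-default namespace
$ oc new-project proj-b && oc create -f pod.json           # pod-b in another non-default namespace
$ oc exec -n proj-a pod-a -- ping -c 2 <pod-b-ip>          # expected to fail: non-default namespaces are isolated from each other
$ oc exec -n proj-a pod-a -- ping -c 2 <default-ns-pod-ip> # expected to succeed
$ oc exec -n proj-a pod-a -- ping -c 2 <master-ip>         # expected to succeed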

Did you switch from the openshift-ovs-subnet to the openshift-ovs-multitenant network plugin?
If yes, then you have to delete the lbr0 bridge; otherwise the SDN setup won't be performed on the node.
$ sudo systemctl stop openshift-node
$ sudo ip link set lbr0 down
$ sudo brctl delbr lbr0
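
After restarting the node service, the SDN should recreate lbr0. A minimal way to confirm that (a sketch; the service name follows the commands above, and the exact bridge addressing depends on the cluster network, 10.1.x.x in this report):

$ sudo systemctl start openshift-node
$ brctl show lbr0        # the bridge should be listed again once SDN setup completes
$ ip addr show lbr0      # should hold an address on the cluster network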

Let me know if you are still noticing this issue.

Comment 2 Meng Bo 2015-08-18 06:15:38 UTC
Thanks Ravi, 

I must have missed the lbr0 re-creation step in my previous try.

After I deleted the existing lbr0 and let openshift-sdn recreate it, the issue can no longer be reproduced.

Will update our test scenarios for this.

Thanks,