Bug 1304219

Summary: After install, pods created on a node cannot reach the external network through tun0
Product: OpenShift Container Platform Reporter: Wang Haoran <haowang>
Component: Networking Assignee: Ben Bennett <bbennett>
Status: CLOSED WORKSFORME QA Contact: Meng Bo <bmeng>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.1.0 CC: aos-bugs, bbennett, eparis, haowang
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-09 13:33:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Wang Haoran 2016-02-03 03:30:36 UTC
Description of problem:
After installing the OSE environment, one node has a network problem: pods scheduled to this node cannot reach the external network, and cannot reach the service IP either.

Version-Release number of selected component (if applicable):
openshift v3.1.1.6
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
Sometimes

Steps to Reproduce:
1. Install the OSE environment
2. Create a project
3. Create a pod
  oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/pod-for-ping.json
4. Log in to the pod and ping the internal DNS IP and the external DNS IP

[root@openshift-130 ~]# oc rsh hello-pod 
bash-4.3# cat /etc/resolv.conf 
nameserver 172.30.0.1
nameserver 10.66.78.117
search haowang.svc.cluster.local svc.cluster.local cluster.local openstacklocal lab.eng.nay.redhat.com
options ndots:5
bash-4.3# ping 172.30.0.1
PING 172.30.0.1 (172.30.0.1): 56 data bytes
^C
--- 172.30.0.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss

bash-4.3# ping 10.66.78.117
PING 10.66.78.117 (10.66.78.117): 56 data bytes
^C
--- 10.66.78.117 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

Actual results:
Both pings fail with 100% packet loss.

Expected results:
Both pings should succeed.

Additional info:
Pinging the docker-registry pod in the default project succeeds.
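
The manual check in step 4 could be scripted so that it is quick to repeat when the problem shows up again. This is only a rough sketch, not part of the original report: the function names ping_loss and check_target are made up for illustration, and it assumes ping prints an integer packet-loss percentage in its summary line, as in the output above.

```shell
# Print the packet-loss percentage from ping's summary line read on stdin.
# Assumes an integer percentage, e.g. "... 100% packet loss".
ping_loss() {
  sed -n 's/.*[, ]\([0-9][0-9]*\)% packet loss.*/\1/p'
}

# Hypothetical helper: ping one address four times and report the result.
# Treats 100% loss, or no summary line at all, as unreachable.
check_target() {
  loss=$(ping -c 4 "$1" 2>/dev/null | ping_loss)
  if [ "$loss" = "100" ] || [ -z "$loss" ]; then
    echo "$1 unreachable"
    return 1
  fi
  echo "$1 reachable (${loss}% loss)"
}
```

Run inside the pod (after oc rsh hello-pod), "check_target 172.30.0.1" and "check_target 10.66.78.117" would cover the two addresses from /etc/resolv.conf that were pinged by hand above.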

Comment 2 Ben Bennett 2016-02-03 16:33:29 UTC
Can you please download and run:
  https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh

That will gather information about your machine to help with debugging.

Comment 4 Ben Bennett 2016-02-05 21:21:24 UTC
Thanks for the information and access to the machine.  Unfortunately the logs don't go back far enough to see what the error was, and I restarted the atomic-openshift-node service to try to trigger logging... and that fixed it.

But I didn't see anything wrong with the ovs rules or the iptables.  So, if it happens again, can you please run the above script as soon as possible.  Otherwise, I don't know what else to do with this bug.

I'm going to leave it open for now, but if you don't see it again I'll close it.

Comment 5 Wang Haoran 2016-02-06 01:33:59 UTC
Ben, I cannot reproduce it. I know that restarting the node service reconfigures the node network, which is why I had not restarted it and let you debug on the server instead. You can close this now; I will reopen it if I hit the problem again.

Comment 6 Ben Bennett 2016-02-09 13:33:19 UTC
Thanks Wang.