Description of problem:
After installing the OSE environment, one node has a network problem: pods scheduled to this node cannot reach the external network or the service IP.

Version-Release number of selected component (if applicable):
openshift v3.1.1.6
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
Sometimes

Steps to Reproduce:
1. Install the OSE environment.
2. Create a project.
3. Create a pod:
   oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/pod-for-ping.json
4. Log in to the pod and ping the internal DNS IP and the outside DNS IP:

[root@openshift-130 ~]# oc rsh hello-pod
bash-4.3# cat /etc/resolv.conf
nameserver 172.30.0.1
nameserver 10.66.78.117
search haowang.svc.cluster.local svc.cluster.local cluster.local openstacklocal lab.eng.nay.redhat.com
options ndots:5
bash-4.3# ping 172.30.0.1
PING 172.30.0.1 (172.30.0.1): 56 data bytes
^C
--- 172.30.0.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
bash-4.3# ping 10.66.78.117
PING 10.66.78.117 (10.66.78.117): 56 data bytes
^C
--- 10.66.78.117 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

Actual results:

Expected results:
The pings should succeed.

Additional info:
The docker-registry pod in the default project can be pinged successfully.
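The manual ping check in step 4 could be scripted roughly as follows. This is only a sketch, not part of the original report: the function names list_nameservers and check_nameservers are made up for illustration, and it assumes awk and a ping that accepts -c and -W are available in the pod image.

```shell
# Hypothetical helper: print the nameserver addresses from a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: ping each nameserver twice with a short timeout
# and report whether it answered.
check_nameservers() {
    for ns in $(list_nameservers "${1:-/etc/resolv.conf}"); do
        if ping -c 2 -W 2 "$ns" >/dev/null 2>&1; then
            echo "$ns reachable"
        else
            echo "$ns UNREACHABLE"
        fi
    done
}
```

Running check_nameservers inside the affected pod would then print one line per nameserver instead of requiring each ping to be interrupted by hand.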
Can you please download and run: https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh That will gather information about your machine to help with debugging.
Thanks for the information and access to the machine. Unfortunately the logs don't go back far enough to see what the error was, and I restarted the atomic-openshift-node service to try to trigger logging... and that fixed it. But I didn't see anything wrong with the ovs rules or the iptables. So, if it happens again, can you please run the above script as soon as possible. Otherwise, I don't know what else to do with this bug. I'm going to leave it open for now, but if you don't see it again I'll close it.
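Since restarting the node service cleared the problem, it may help to save the node's state before any restart. The following is a rough sketch only and not a replacement for debug.sh: the function name capture_state and the output file names are made up, and the bridge name br0 and the OpenFlow13 protocol option are assumptions about how openshift-sdn sets up the node.

```shell
# Hypothetical helper: save OVS flows, iptables rules and node service logs
# into a directory so they survive a restart of atomic-openshift-node.
capture_state() {
    outdir="${1:-/tmp/node-state}"
    mkdir -p "$outdir"
    # Assumption: the SDN bridge is br0 and speaks OpenFlow 1.3.
    ovs-ofctl -O OpenFlow13 dump-flows br0 > "$outdir/ovs-flows.txt" 2>&1 || true
    iptables-save > "$outdir/iptables-save.txt" 2>&1 || true
    journalctl -u atomic-openshift-node --no-pager > "$outdir/node-journal.txt" 2>&1 || true
    echo "state saved in $outdir"
}
```

Run as root on the affected node, for example capture_state /tmp/node-state, before restarting the service. Each command is allowed to fail so that one missing tool does not prevent the others from being captured.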
Ben, I cannot reproduce it again. I know that restarting the node reconfigures the node network, which is why I did not restart the node service and instead let you debug on the server. You can close it now; I will reopen it if I hit this problem again.
Thanks Wang.