Bug 1388339

Summary: [networking_public_59] 'oadm diagnostics NetworkCheck' shows errors in a normal env
Product: OpenShift Container Platform
Reporter: zhaozhanqi <zzhao>
Component: Networking
Assignee: Ravi Sankar <rpenta>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: zhaozhanqi <zzhao>
Severity: medium
Docs Contact:
Priority: high
Version: 3.4.0
CC: aos-bugs
Target Milestone: ---
Target Release: ---
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-02 06:09:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description zhaozhanqi 2016-10-25 07:07:14 UTC
Description of problem:
Running 'oadm diagnostics NetworkCheck' still shows many errors in a normal environment (1 master + 2 nodes).

1) 

ERROR: [DSvcNet1005 from diagnostic
CheckServiceNetwork@openshift/origin/pkg/diagnostics/networkpod/service.go:78]
              Getting local and nonlocal pods failed. Error: unable to find local node IP

2) 
[Creating remote tar locally failed: command terminated with exit code 2, tar:
/tmp/openshift/nodes/host-8-174-84.host.centralci.eng.rdu2.redhat.com: Cannot chdir: No such file or directory
       tar: Error is not recoverable: exiting now


Version-Release number of selected component (if applicable):
# openshift version
openshift v3.4.0.15+9c963ec
kubernetes v1.4.0+776c994
etcd 3.1.0-alpha.1


How reproducible:
always

Steps to Reproduce:
1. Set up an OCP cluster with version 3.4.0.15 (1 master + 2 nodes).
2. Run 'oadm diagnostics NetworkCheck'.

Actual results:

# oadm diagnostics NetworkCheck
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'

[Note] Running diagnostic: NetworkCheck
       Description: Create a pod on all schedulable nodes and run network diagnostics from the application standpoint
       
ERROR: [DNet2008 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:128]
       [See the errors below in the output from the network diagnostic pod on node
"host-8-174-84.host.centralci.eng.rdu2.redhat.com":
       [Note] Running diagnostic: CheckExternalNetwork
              Description: Check that external network is accessible within a pod
              
       [Note] Running diagnostic: CheckNodeNetwork
              Description: Check that pods in the cluster can access its own node.
              
       ERROR: [DNodeNet1001 from diagnostic CheckNodeNetwork@openshift/origin/pkg/diagnostics/networkpod/node.go:50]
              unable to find local node IP
              
       [Note] Running diagnostic: CheckPodNetwork
              Description: Check pod to pod communication in the cluster. In case of ovs-subnet network plugin, all pods
should be able to communicate with each other and in case of multitenant network plugin, pods in non-global projects
should be isolated and pods in global projects should be able to access any pod in the cluster and vice versa.
              
       ERROR: [DPodNet1003 from diagnostic CheckPodNetwork@openshift/origin/pkg/diagnostics/networkpod/pod.go:68]
              Getting local and nonlocal pods failed. Error: unable to find local node IP
              
       [Note] Running diagnostic: CheckServiceNetwork
              Description: Check pod to service communication in the cluster. In case of ovs-subnet network plugin, all
pods should be able to communicate with all services and in case of multitenant network plugin, services in non-global
projects should be isolated and pods in global projects should be able to access any service in the cluster.
              
       ERROR: [DSvcNet1005 from diagnostic
CheckServiceNetwork@openshift/origin/pkg/diagnostics/networkpod/service.go:78]
              Getting local and nonlocal pods failed. Error: unable to find local node IP
              
       [Note] Running diagnostic: CollectNetworkInfo
              Description: Collect network information in the cluster.
              
       ERROR: [DColNet1001 from diagnostic CollectNetworkInfo@openshift/origin/pkg/diagnostics/networkpod/collect.go:47]
              Fetching local node info failed: unable to find local node IP
              
       [Note] Summary of diagnostics execution (version v3.4.0.15+9c963ec):
       [Note] Errors seen: 4
       , See the errors below in the output from the network diagnostic pod on node
"host-8-174-38.host.centralci.eng.rdu2.redhat.com":
       [Note] Running diagnostic: CheckExternalNetwork
              Description: Check that external network is accessible within a pod
              
       [Note] Running diagnostic: CheckNodeNetwork
              Description: Check that pods in the cluster can access its own node.
              
       ERROR: [DNodeNet1001 from diagnostic CheckNodeNetwork@openshift/origin/pkg/diagnostics/networkpod/node.go:50]
              unable to find local node IP
              
       [Note] Running diagnostic: CheckPodNetwork
              Description: Check pod to pod communication in the cluster. In case of ovs-subnet network plugin, all pods
should be able to communicate with each other and in case of multitenant network plugin, pods in non-global projects
should be isolated and pods in global projects should be able to access any pod in the cluster and vice versa.
              
       ERROR: [DPodNet1003 from diagnostic CheckPodNetwork@openshift/origin/pkg/diagnostics/networkpod/pod.go:68]
              Getting local and nonlocal pods failed. Error: unable to find local node IP
              
       [Note] Running diagnostic: CheckServiceNetwork
              Description: Check pod to service communication in the cluster. In case of ovs-subnet network plugin, all
pods should be able to communicate with all services and in case of multitenant network plugin, services in non-global
projects should be isolated and pods in global projects should be able to access any service in the cluster.
              
       ERROR: [DSvcNet1005 from diagnostic
CheckServiceNetwork@openshift/origin/pkg/diagnostics/networkpod/service.go:78]
              Getting local and nonlocal pods failed. Error: unable to find local node IP
              
       [Note] Running diagnostic: CollectNetworkInfo
              Description: Collect network information in the cluster.
              
       ERROR: [DColNet1001 from diagnostic CollectNetworkInfo@openshift/origin/pkg/diagnostics/networkpod/collect.go:47]
              Fetching local node info failed: unable to find local node IP
              
       [Note] Summary of diagnostics execution (version v3.4.0.15+9c963ec):
       [Note] Errors seen: 4
       ]
       
ERROR: [DNet2011 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:146]
       [Creating remote tar locally failed: command terminated with exit code 2, tar:
/tmp/openshift/nodes/host-8-174-84.host.centralci.eng.rdu2.redhat.com: Cannot chdir: No such file or directory
       tar: Error is not recoverable: exiting now
       , Creating remote tar locally failed: command terminated with exit code 2, tar:
/tmp/openshift/nodes/host-8-174-38.host.centralci.eng.rdu2.redhat.com: Cannot chdir: No such file or directory
       tar: Error is not recoverable: exiting now
       ]
       
Info:  Additional info collected under "/tmp/openshift" for further analysis
[Note] Summary of diagnostics execution (version v3.4.0.15+9c963ec):
[Note] Errors seen: 2
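
The nested output above repeats the same failure ("unable to find local node IP") under several diagnostic IDs and for both nodes. As a reading aid, here is a minimal sketch that tallies the error IDs in captured output. It is not part of oadm; the function name count_error_ids and the shortened sample text are illustrative assumptions, and the pattern only relies on the "ERROR: [<ID> from diagnostic" prefix visible in the output above.

```python
import re
from collections import Counter

# Matches the "ERROR: [<ID> from diagnostic" prefix seen in the output above.
# The ID is captured even when the rest of the header wraps onto the next line.
ERROR_ID = re.compile(r"ERROR:\s+\[(\w+) from diagnostic")


def count_error_ids(output: str) -> Counter:
    """Return how many times each diagnostic error ID appears in the output."""
    return Counter(ERROR_ID.findall(output))


# Shortened sample copied from the report (hypothetical selection of lines).
sample = """\
ERROR: [DNodeNet1001 from diagnostic CheckNodeNetwork@openshift/origin/pkg/diagnostics/networkpod/node.go:50]
       unable to find local node IP
ERROR: [DSvcNet1005 from diagnostic
CheckServiceNetwork@openshift/origin/pkg/diagnostics/networkpod/service.go:78]
       Getting local and nonlocal pods failed. Error: unable to find local node IP
ERROR: [DNodeNet1001 from diagnostic CheckNodeNetwork@openshift/origin/pkg/diagnostics/networkpod/node.go:50]
       unable to find local node IP
"""

for error_id, n in sorted(count_error_ids(sample).items()):
    print(f"{error_id}: {n}")
```

Feeding the full captured output to count_error_ids in place of the sample would show which IDs repeat per node, which may help when comparing runs across environments.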

Expected results:

No errors in a normal environment.

Additional info:

Comment 1 Ravi Sankar 2016-11-02 02:32:35 UTC
Unable to reproduce on a local dind cluster with 1 master and 2 nodes.
Can you elaborate on the reproduction steps?

Comment 2 zhaozhanqi 2016-11-02 06:09:00 UTC
I have not hit this issue again on the latest OCP env.

Marking this bug as verified for now. I will reopen it if it can still be reproduced.