Bug 1388059

Summary: [networking_public_59]Unexpected error shown using 'oadm diagnostics NetworkCheck' when there are 2 pods in rc
Product: OKD Reporter: zhaozhanqi <zzhao>
Component: Networking Assignee: Ravi Sankar <rpenta>
Status: CLOSED CURRENTRELEASE QA Contact: zhaozhanqi <zzhao>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.x CC: aos-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-12-09 21:50:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description zhaozhanqi 2016-10-24 10:45:18 UTC
Description of problem:
When an rc with 2 pods has been created, running 'oadm diagnostics NetworkCheck' shows this error:
ERROR: [DNet2008 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:128]
       Logs for network diagnostic pod on node "ip-172-18-15-181.ec2.internal" failed: container
"network-diag-pod-7azq1" in pod "network-diag-pod-7azq1" is not available

Version-Release number of selected component (if applicable):
# openshift version
openshift v1.4.0-alpha.0+0787d9f-738
kubernetes v1.4.0+776c994
etcd 3.1.0-alpha.1

network plugin: redhat/openshift-ovs-subnet

How reproducible:
always

Steps to Reproduce:
1. Set up an OpenShift cluster (1 master + 2 nodes) with the subnet plugin
2. Create an rc as userA:
   oc create -f  https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/routing/list_for_pods.json
3. Run 'oadm diagnostics NetworkCheck'

Actual results:

# oadm diagnostics NetworkCheck --network-logdir='/tmp/test2'
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/root/.kube/config'

[Note] Running diagnostic: NetworkCheck
       Description: Create a pod on all schedulable nodes and run network diagnostics from the application standpoint
       
Info:  Output from the network diagnostic pod on node "ip-172-18-15-180.ec2.internal":
       [Note] Running diagnostic: CheckExternalNetwork
              Description: Check that external network is accessible within a pod
              
       [Note] Running diagnostic: CheckNodeNetwork
              Description: Check that pods in the cluster can access its own node.
              
       [Note] Running diagnostic: CheckPodNetwork
              Description: Check pod to pod communication in the cluster. In case of ovs-subnet network plugin, all pods
should be able to communicate with each other and in case of multitenant network plugin, pods in non-global projects
should be isolated and pods in global projects should be able to access any pod in the cluster and vice versa.
              
       [Note] Running diagnostic: CheckServiceNetwork
              Description: Check pod to service communication in the cluster. In case of ovs-subnet network plugin, all
pods should be able to communicate with all services and in case of multitenant network plugin, services in non-global
projects should be isolated and pods in global projects should be able to access any service in the cluster.
              
       [Note] Running diagnostic: CollectNetworkInfo
              Description: Collect network information in the cluster.
              
       [Note] Summary of diagnostics execution (version v1.4.0-alpha.0+0787d9f-738):
       [Note] Completed with no errors or warnings seen.
       
ERROR: [DNet2008 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:128]
       Logs for network diagnostic pod on node "ip-172-18-15-181.ec2.internal" failed: container
"network-diag-pod-7azq1" in pod "network-diag-pod-7azq1" is not available
       
Info:  Additional info collected under "/tmp/test2" for further analysis
[Note] Summary of diagnostics execution (version v1.4.0-alpha.0+0787d9f-738):
[Note] Errors seen: 1


Expected results:

This error should not be shown.

Additional info:
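When comparing runs, it can help to pull the ERROR entries and the affected node out of saved diagnostics output. The following is a minimal sketch, not part of the diagnostics tool: parse_errors and the two regular expressions are hypothetical helpers, and the sample text is copied from the "Actual results" output in this report. It assumes the entry layout seen there (an "ERROR: [ID from diagnostic NAME@SOURCE]" header followed by message lines that end at a blank line).

```python
import re

# Sample copied from the "Actual results" output in this report.
SAMPLE = '''[Note] Completed with no errors or warnings seen.

ERROR: [DNet2008 from diagnostic NetworkCheck@openshift/origin/pkg/diagnostics/network/run_pod.go:128]
       Logs for network diagnostic pod on node "ip-172-18-15-181.ec2.internal" failed: container
"network-diag-pod-7azq1" in pod "network-diag-pod-7azq1" is not available

Info:  Additional info collected under "/tmp/test2" for further analysis
[Note] Summary of diagnostics execution (version v1.4.0-alpha.0+0787d9f-738):
[Note] Errors seen: 1
'''

# Assumed header layout, based only on the output shown in this report.
ERROR_HEADER = re.compile(
    r'^ERROR: \[(?P<id>\w+) from diagnostic (?P<diag>\w+)@(?P<src>[^\]]+)\]'
)
NODE = re.compile(r'on node "([^"]+)"')


def parse_errors(text):
    """Return one dict per ERROR entry found in the diagnostics output."""
    errors = []
    current = None
    for line in text.splitlines():
        header = ERROR_HEADER.match(line)
        if header:
            current = {
                "id": header.group("id"),
                "diagnostic": header.group("diag"),
                "source": header.group("src"),
                "message": "",
            }
            errors.append(current)
            continue
        if current is None:
            continue
        # A blank line or a new top-level entry ends the current message.
        if not line.strip() or line.startswith(("Info:", "[Note]", "WARN:")):
            current = None
            continue
        current["message"] = (current["message"] + " " + line.strip()).strip()
    for entry in errors:
        node = NODE.search(entry["message"])
        entry["node"] = node.group(1) if node else None
    return errors


if __name__ == "__main__":
    for entry in parse_errors(SAMPLE):
        print(entry["id"], entry["node"], entry["source"])
```

Run against the output above, this reports a single DNet2008 entry for the node whose diagnostic pod logs could not be fetched, which makes it easier to check whether the same node fails across repeated runs.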

Comment 1 Ravi Sankar 2016-11-02 02:32:19 UTC
Unable to reproduce on local dind cluster with 1 master and 2 nodes.
Can you elaborate on the reproduction steps?

Comment 2 zhaozhanqi 2016-11-02 06:04:54 UTC
Hmm, the issue also cannot be reproduced in the latest OCP env. Maybe it was caused by the CNI plugin over the last few days.

Marking this bug as verified for now. Will reopen it if the issue reproduces again.