Problem description:
When "oadm diagnostics --images='openshift3/ose-${component}:${version}'" is run while the router pod is not running, the checks inside the diag container pass without reporting the missing router pod.

Version-Release number of selected component (if applicable):
openshift v3.2.0.8
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. Log in to the OpenShift master and switch to the default project.
2. Scale the router pod down to 0:
   # oc scale rc router-1 --replicas=0
3. Make sure the router pod is not running and only the registry pod is running:
   # oc get po
   NAME                      READY     STATUS    RESTARTS   AGE
   docker-registry-1-6mboi   1/1       Running   0          45s
4. Run diagnostics on OpenShift via the diag container:
   oadm diagnostics --images='openshift3/ose-${component}:${version}'

Actual Result:
The diag container passed:

[Note] Running diagnostic: DiagnosticPod
       Description: Create a pod to run diagnostics from the application standpoint

Info:  Output from the diagnostic pod (image openshift3/ose-deployer:v3.2.0.8):
       [Note] Running diagnostic: PodCheckAuth
              Description: Check that service account credentials authenticate as expected

       Info:  Service account token successfully authenticated to master
       Info:  Service account token was authenticated by the integrated registry.
       [Note] Running diagnostic: PodCheckDns
              Description: Check that DNS within a pod works as expected

       [Note] Summary of diagnostics execution (version v3.2.0.8):
       [Note] Completed with no errors or warnings seen.

Expected Result:
The diag container should fail and report the missing router pod.

Additional info:
For comparison, this is the error message from the diag container when the registry pod is missing:

ERROR: [DP1014 from diagnostic PodCheckAuth@openshift/origin/pkg/diagnostics/pod/auth.go:173]
       Request to integrated registry timed out; this typically indicates network or SDN problems.
This works for me:

[Note] Running diagnostic: ClusterRouterName
       Description: Check there is a working router

ERROR: [DClu2007 from diagnostic ClusterRouter@openshift/origin/pkg/diagnostics/cluster/router.go:156]
       The "router" DeploymentConfig exists but has no running pods, so it
       is not available. Apps will not be externally accessible via the router.

Can you paste or attach the entire output from your oadm diagnostics run?

openshift v3.2.0.8
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
Right, that particular diagnostic doesn't test for the router, because the router isn't very interesting from the perspective of inside a pod. It tests for the presence of the registry because build pods need to interact with the registry all the time and we would like to know if/why that is not working. ClusterRouterName is the correct diagnostic for looking for a router.
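For a quick manual check alongside the router diagnostic, the pod listing from step 3 of the report can be filtered for a running router pod. The following is only a minimal sketch and not part of oadm diagnostics: the function name has_running_pod and the "router-" name prefix are assumptions made for illustration, and the column layout is assumed to match the "oc get po" output shown in the report.

```shell
# has_running_pod PREFIX
# Hypothetical helper: reads `oc get po` output on stdin and succeeds
# (exit status 0) if at least one pod whose name starts with PREFIX
# has STATUS "Running". Assumes columns: NAME READY STATUS RESTARTS AGE.
has_running_pod() {
  awk -v prefix="$1" '
    # Skip the header row, then match on name prefix and status column.
    $1 != "NAME" && index($1, prefix) == 1 && $3 == "Running" { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example use against a live cluster (requires oc and a logged-in session):
#   oc get po -n default | has_running_pod router- || echo "no running router pod"
```

With the listing from the report, where only docker-registry-1-6mboi is present, the check succeeds for the "docker-registry-" prefix and fails for "router-".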
Created attachment 1142057
full diagnostics log attached
@lmeyer Yes, I verified that ClusterRouterName reports the missing router pod. I now understand that the diag container is not meant to report this. Would you please set the bug status to ON_QA so that I can then close it?

Thanks,
Xia
Set to verified according to my comment in #4
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:1064