Description of problem:
DiagnosticPod tries to run openshift-diagnostics in an ose-deployer container, which fails with:
container_linux.go:296: starting container process caused "exec: \"openshift-diagnostics\": executable file not found in $PATH"

Version-Release number of selected component (if applicable):
openshift: v3.9.0-0.22.0

How reproducible:
Always

Steps to Reproduce:
1. Install the OpenShift client.
2. Log in to a remote OpenShift server as an ordinary user.
3. Run oc adm diagnostics DiagnosticPod --images='registry.example.com/openshift3/ose-${component}:${version}'
4. Check the pod pod-diagnostic-xxx while the diagnostic is running.

Actual results:

3)
$ oc adm diagnostics DiagnosticPod --images='registry.example.com/openshift3/ose-${component}:${version}'
[Note] Determining if client configuration exists for client/cluster diagnostics
Info:  Successfully read a client config file at '/home/anli/.kube/config'

[Note] Running diagnostic: DiagnosticPod
       Description: Create a pod to run diagnostics from the application standpoint

WARN:  [DCli2006 from diagnostic DiagnosticPod@openshift/origin/pkg/oc/admin/diagnostics/diagnostics/client/run_diagnostics_pod.go:157]
       Timed out preparing diagnostic pod logs for streaming, so this diagnostic cannot run.
       It is likely that the image 'registry.example.com/openshift3/ose-deployer:v3.9.0-0.22.0' was not pulled and running yet.
       Last error: (*errors.StatusError[2]) container "pod-diagnostics" in pod "pod-diagnostic-test-267hs" is waiting to start: CreateContainerError

[Note] Summary of diagnostics execution (version v3.9.0-0.22.0):
[Note] Warnings seen: 1

4)
# oc get pods
NAME                             READY     STATUS                 RESTARTS   AGE
mongodb-1-cs4s7                  1/1       Running                0          9h
nodejs-mongodb-example-1-build   0/1       Error                  0          9h
pod-diagnostic-test-267hs        0/1       CreateContainerError   0          18s

[root@qe-anli-criomaster-etcd-nfs-1 ~]# oc describe pod pod-diagnostic-test-267hs
Name:         pod-diagnostic-test-267hs
Namespace:    install-test
Node:         qe-anli-crionode-registry-router-1/10.240.0.9
Start Time:   Tue, 23 Jan 2018 04:32:16 -0500
Labels:       <none>
Annotations:  openshift.io/scc=restricted
Status:       Pending
IP:           10.129.0.55
Containers:
  pod-diagnostics:
    Container ID:
    Image:          registry.reg-aws.openshift.com:443/openshift3/ose-deployer:v3.9.0-0.22.0
    Image ID:
    Port:           <none>
    Command:        openshift-diagnostics diagnostic-pod -l 1
    State:          Waiting
      Reason:       CreateContainerError
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5j7xs (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-5j7xs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5j7xs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason                 Age               From                                         Message
  ----     ------                 ----              ----                                         -------
  Normal   Scheduled              25s               default-scheduler                            Successfully assigned pod-diagnostic-test-267hs to qe-anli-crionode-registry-router-1
  Normal   SuccessfulMountVolume  24s               kubelet, qe-anli-crionode-registry-router-1  MountVolume.SetUp succeeded for volume "default-token-5j7xs"
  Normal   Pulled                 8s (x3 over 24s)  kubelet, qe-anli-crionode-registry-router-1  Container image "registry.reg-aws.openshift.com:443/openshift3/ose-deployer:v3.9.0-0.22.0" already present on machine
  Warning  Failed                 8s (x3 over 22s)  kubelet, qe-anli-crionode-registry-router-1  Error: container create failed: container_linux.go:296: starting container process caused "exec: \"openshift-diagnostics\": executable file not found in $PATH"

Expected results:
DiagnosticPod should pass.
DiagnosticPod should report an error rather than a warning when the container fails with CreateContainerError.

Additional info:
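The missing binary can be confirmed directly against the image itself; for example, assuming docker is available on a node that already has the image pulled, something like the following should show that openshift-diagnostics is not on the image's $PATH:

# docker run --rm --entrypoint /bin/sh \
    registry.reg-aws.openshift.com:443/openshift3/ose-deployer:v3.9.0-0.22.0 \
    -c 'command -v openshift-diagnostics || echo "openshift-diagnostics not found in $PATH"'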
Thanks for investigating this. I'm going to call this a duplicate. The root problem (the missing binary in the image) is being fixed. There's a reasonable argument that the diagnostic should give an error instead of a warning when this happens, but without digging further, all it really knows is that the pod is not running, which could happen for a variety of reasons. I don't really want the diagnostic to diagnose itself (that only expands the complexity, as there is no end to what can go wrong), but I also don't want to declare that the diagnostic has found an error when really it just failed to run at all. Warning is a compromise.

*** This bug has been marked as a duplicate of bug 1534513 ***
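For reference, the specific waiting reason the kubelet reports is exposed on the pod status and can be pulled with a jsonpath query (pod name taken from the report above), so a caller who wants more detail than "the pod is not running" could check it by hand:

$ oc get pod pod-diagnostic-test-267hs -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'
CreateContainerError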