Description of problem (please be as detailed as possible and provide log snippets):
The rook-ceph-tools-external pod is in CreateContainerError state.

Version of all relevant components (if applicable):
openshift installer (4.10.0-0.nightly-2022-02-23-193238)
ocs-registry:4.10.0-167

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Not tried

If this is a regression, please provide more details to justify this:
Yes

Steps to Reproduce:
1. Install an External mode cluster using ocs-ci
2. Check ceph health using the toolbox
3.

Actual results:
The toolbox pod is in CreateContainerError (CCE) state.

Expected results:
The toolbox pod should be in Running state.

Additional info:

$ oc get pod rook-ceph-tools-external-757d7fcdf9-kt5pz
NAME                                        READY   STATUS                 RESTARTS   AGE
rook-ceph-tools-external-757d7fcdf9-kt5pz   0/1     CreateContainerError   0          47m

$ oc describe pod rook-ceph-tools-external-757d7fcdf9-kt5pz
Name:           rook-ceph-tools-external-757d7fcdf9-kt5pz
Namespace:      openshift-storage
Priority:       0
Node:           compute-2/10.1.161.110
Start Time:     Thu, 24 Feb 2022 13:39:43 +0530
Labels:         app=rook-ceph-tools
                pod-template-hash=757d7fcdf9
Annotations:    openshift.io/scc: rook-ceph
Status:         Pending
IP:             10.1.161.110
IPs:
  IP:           10.1.161.110
Controlled By:  ReplicaSet/rook-ceph-tools-external-757d7fcdf9
Containers:
  rook-ceph-tools:
    Container ID:
    Image:         quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:d630f3015092cca2b2a00e7870dbc6d6307b119657572614b07f2a495fc33780
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /tini
    Args:
      -g
      --
      /usr/local/bin/toolbox.sh
    State:          Waiting
      Reason:       CreateContainerError
    Ready:          False
    Restart Count:  0
    Environment:
      ROOK_CEPH_USERNAME:  <set to the key 'ceph-username' in secret 'rook-ceph-mon'>  Optional: false
      ROOK_CEPH_SECRET:    AQA3mT9hRxdpFxAAzYcsIyOYoLZGI+MGawubCg==
    Mounts:
      /dev from dev (rw)
      /etc/ceph from ceph-config (rw)
      /etc/rook from mon-endpoint-volume (rw)
      /lib/modules from libmodules (rw)
      /sys/bus from sysbus (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c9mjc (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:
  sysbus:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/bus
    HostPathType:
  libmodules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  mon-endpoint-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rook-ceph-mon-endpoints
    Optional:  false
  ceph-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-c9mjc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  29m                    default-scheduler  Successfully assigned openshift-storage/rook-ceph-tools-external-757d7fcdf9-kt5pz to compute-2
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:43Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:44Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:58Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:10Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:24Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:37Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:49Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:11:02Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     27m                    kubelet            Error: container create failed: time="2022-02-24T08:11:16Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     27m (x3 over 27m)      kubelet            (combined from similar events): Error: container create failed: time="2022-02-24T08:11:53Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Normal   Pulled     4m15s (x117 over 29m)  kubelet            Container image "quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:d630f3015092cca2b2a00e7870dbc6d6307b119657572614b07f2a495fc33780" already present on machine

job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/3319/console
Subham, we need to make the change for external mode as well.
Mudit, this requires changes in the file that CI uses to deploy the toolbox. I looked at the deployment, and it was using the older one. I communicated the same to Vijay in an offline conversation.
After making the changes in the toolbox deployment, it's working as expected. I will make changes in CI to reflect the same for 4.10 deployments.
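For anyone auditing other toolbox manifests for the same problem, here is a minimal sketch (not the actual CI change) that flags containers whose command still starts with /tini, the binary the kubelet events above report as missing from the image. The function name and the trimmed-down manifest dict are hypothetical; only the Command and Args values are taken from the `oc describe pod` output in this report.

```python
from typing import List


def containers_using_entrypoint(deployment: dict, entrypoint: str = "/tini") -> List[str]:
    """Return the names of containers whose command starts with `entrypoint`."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    matches = []
    for container in pod_spec.get("containers", []):
        command = container.get("command") or []
        if command and command[0] == entrypoint:
            matches.append(container.get("name", "<unnamed>"))
    return matches


# Hypothetical, trimmed-down manifest that mirrors the Command/Args shown
# in the `oc describe pod` output of this report.
old_toolbox = {
    "kind": "Deployment",
    "metadata": {"name": "rook-ceph-tools-external"},
    "spec": {"template": {"spec": {"containers": [
        {
            "name": "rook-ceph-tools",
            "command": ["/tini"],
            "args": ["-g", "--", "/usr/local/bin/toolbox.sh"],
        },
    ]}}},
}

print(containers_using_entrypoint(old_toolbox))  # ['rook-ceph-tools']
```

A manifest loaded from `oc get deployment <name> -o json` could be passed to the same function after `json.loads`; an empty result would suggest the deployment no longer depends on /tini.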
Thanks Subham