Bug 2057964

Summary: [External Mode] rook-ceph-tools pod is in CreateContainerError (CCE) state due to missing tini
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Vijay Avuthu <vavuthu>
Component: ocs-operator
Assignee: Subham Rai <srai>
Status: CLOSED NOTABUG
QA Contact: Elad <ebenahar>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.10
CC: madam, muagarwa, ocs-bugs, odf-bz-bot, sostapov, srai
Target Milestone: ---
Keywords: Automation, Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-02-24 09:31:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vijay Avuthu 2022-02-24 08:57:29 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

rook-ceph-tools-external pod is in CreateContainerError state

Version of all relevant components (if applicable):

openshift installer (4.10.0-0.nightly-2022-02-23-193238)
ocs-registry:4.10.0-167

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?
Not tried

If this is a regression, please provide more details to justify this:
Yes

Steps to Reproduce:
1. Install an external mode cluster using ocs-ci
2. Check ceph health using the toolbox


Actual results:

The toolbox pod is in CreateContainerError (CCE) state

Expected results:

The toolbox pod should be in Running state


Additional info:

$ oc get pod rook-ceph-tools-external-757d7fcdf9-kt5pz
NAME                                        READY   STATUS                 RESTARTS   AGE
rook-ceph-tools-external-757d7fcdf9-kt5pz   0/1     CreateContainerError   0          47m
$
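
The same status check can be scripted against the pod's JSON. The following is a minimal sketch, assuming the output of `oc get pod <name> -o json` has already been loaded into a dict; the helper name `waiting_reasons` and the trimmed sample document are illustrative and not part of ocs-ci.

```python
import json


def waiting_reasons(pod: dict) -> dict:
    """Return {container name: waiting reason} for containers stuck in Waiting."""
    reasons = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        waiting = status.get("state", {}).get("waiting")
        if waiting:
            reasons[status["name"]] = waiting.get("reason", "")
    return reasons


# Trimmed, hand-written stand-in for `oc get pod ... -o json` output.
sample_pod = json.loads("""
{
  "metadata": {"name": "rook-ceph-tools-external-757d7fcdf9-kt5pz"},
  "status": {
    "phase": "Pending",
    "containerStatuses": [
      {
        "name": "rook-ceph-tools",
        "ready": false,
        "state": {"waiting": {"reason": "CreateContainerError"}}
      }
    ]
  }
}
""")

print(waiting_reasons(sample_pod))
```

An empty result means no container is in the Waiting state.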

$ oc describe pod rook-ceph-tools-external-757d7fcdf9-kt5pz
Name:         rook-ceph-tools-external-757d7fcdf9-kt5pz
Namespace:    openshift-storage
Priority:     0
Node:         compute-2/10.1.161.110
Start Time:   Thu, 24 Feb 2022 13:39:43 +0530
Labels:       app=rook-ceph-tools
              pod-template-hash=757d7fcdf9
Annotations:  openshift.io/scc: rook-ceph
Status:       Pending
IP:           10.1.161.110
IPs:
  IP:           10.1.161.110
Controlled By:  ReplicaSet/rook-ceph-tools-external-757d7fcdf9
Containers:
  rook-ceph-tools:
    Container ID:  
    Image:         quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:d630f3015092cca2b2a00e7870dbc6d6307b119657572614b07f2a495fc33780
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /tini
    Args:
      -g
      --
      /usr/local/bin/toolbox.sh
    State:          Waiting
      Reason:       CreateContainerError
    Ready:          False
    Restart Count:  0
    Environment:
      ROOK_CEPH_USERNAME:  <set to the key 'ceph-username' in secret 'rook-ceph-mon'>  Optional: false
      ROOK_CEPH_SECRET:    AQA3mT9hRxdpFxAAzYcsIyOYoLZGI+MGawubCg==
    Mounts:
      /dev from dev (rw)
      /etc/ceph from ceph-config (rw)
      /etc/rook from mon-endpoint-volume (rw)
      /lib/modules from libmodules (rw)
      /sys/bus from sysbus (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c9mjc (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:  
  sysbus:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/bus
    HostPathType:  
  libmodules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  
  mon-endpoint-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rook-ceph-mon-endpoints
    Optional:  false
  ceph-config:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-c9mjc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  29m                    default-scheduler  Successfully assigned openshift-storage/rook-ceph-tools-external-757d7fcdf9-kt5pz to compute-2
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:43Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:44Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     29m                    kubelet            Error: container create failed: time="2022-02-24T08:09:58Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:10Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:24Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:37Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:10:49Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     28m                    kubelet            Error: container create failed: time="2022-02-24T08:11:02Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     27m                    kubelet            Error: container create failed: time="2022-02-24T08:11:16Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Warning  Failed     27m (x3 over 27m)      kubelet            (combined from similar events): Error: container create failed: time="2022-02-24T08:11:53Z" level=error msg="runc create failed: unable to start container process: exec: \"/tini\": stat /tini: no such file or directory"
  Normal   Pulled     4m15s (x117 over 29m)  kubelet            Container image "quay.io/rhceph-dev/odf4-rook-ceph-rhel8-operator@sha256:d630f3015092cca2b2a00e7870dbc6d6307b119657572614b07f2a495fc33780" already present on machine
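
The repeated kubelet events all carry the same root cause: the entrypoint binary is not present in the image. As a rough sketch, the missing path can be pulled out of such a message with a regular expression. The pattern and the helper name `missing_executable` are assumptions based only on the message format shown above, not on any documented runc output format.

```python
import re

# Matches: exec: \"/tini\": stat /tini: no such file or directory
# The backslashes before the quotes are optional, since they depend on how
# the message was escaped when it was logged.
MISSING_EXEC = re.compile(
    r'exec: \\?"(?P<path>[^"\\]+)\\?": stat (?P=path): no such file or directory'
)


def missing_executable(message: str):
    """Return the path runc could not find, or None if the message differs."""
    match = MISSING_EXEC.search(message)
    return match.group("path") if match else None


event_message = (
    r'Error: container create failed: time="2022-02-24T08:09:43Z" level=error '
    r'msg="runc create failed: unable to start container process: '
    r'exec: \"/tini\": stat /tini: no such file or directory"'
)

print(missing_executable(event_message))
```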


>job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/3319/console

Comment 3 Mudit Agarwal 2022-02-24 09:02:01 UTC
Subham, we need to make the change for external mode as well.

Comment 4 Subham Rai 2022-02-24 09:23:49 UTC
Mudit, this requires changes in the file that CI uses to deploy the toolbox. I looked at the deployment, and it was using the older one. I communicated the same to Vijay in an offline conversation.

Comment 5 Vijay Avuthu 2022-02-24 09:31:20 UTC
After making the changes in the toolbox deployment, it is working as expected.

I will make changes in CI to reflect the same for 4.10 deployments.
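
A CI-side guard against this kind of stale toolbox deployment could look like the sketch below. It assumes the deployment manifest has already been parsed into a dict (for example from YAML or from `oc get deployment -o json`). The function name `containers_using` is hypothetical, and the exact replacement command used in the fix is not stated in this bug, so the sketch only detects the old `/tini` entrypoint.

```python
def containers_using(deployment: dict, binary: str = "/tini") -> list:
    """List containers in a Deployment whose command references `binary`."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    return [
        container["name"]
        for container in pod_spec.get("containers", [])
        if binary in container.get("command", [])
    ]


# Hand-written stand-in mirroring the command and args from `oc describe pod`.
old_toolbox = {
    "kind": "Deployment",
    "metadata": {"name": "rook-ceph-tools-external"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "rook-ceph-tools",
                        "command": ["/tini"],
                        "args": ["-g", "--", "/usr/local/bin/toolbox.sh"],
                    }
                ]
            }
        }
    },
}

print(containers_using(old_toolbox))
```

A non-empty result flags a deployment file that still relies on `/tini` and needs updating before it is applied against a 4.10 image.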

Comment 6 Mudit Agarwal 2022-02-24 10:58:35 UTC
Thanks Subham