Bug 2179978

Summary: [ODF 4.12] Missing status-reporter binary causes "report-status-to-provider" pods to remain in CreateContainerError on ODF to ODF cluster on ROSA
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Jilju Joy <jijoy>
Component: build
Assignee: Boris Ranto <branto>
Status: CLOSED ERRATA
QA Contact: Jilju Joy <jijoy>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.12
CC: jopinto, muagarwa, nberry, nigoyal, ocs-bugs, odf-bz-bot, pbalogh, sheggodu, tmuthami
Target Milestone: ---
Keywords: Regression
Target Release: ODF 4.12.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2179976
Environment:
Last Closed: 2023-04-17 22:34:07 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
Embargoed:
Bug Depends On: 2179976    
Bug Blocks:    

Description Jilju Joy 2023-03-20 13:17:58 UTC
+++ This bug was initially created as a clone of Bug #2179976 +++

Description of problem (please be as detailed as possible and provide log snippets):
In an ODF to ODF cluster on ROSA, many "report-status-to-provider" pods on the consumer cluster are in the "CreateContainerError" status.

Error is:

Normal   Pulled          20m                 kubelet            Successfully pulled image "quay.io/rhceph-dev/odf4-ocs-rhel9-operator@sha256:3bef03dfa68d265327cf8f9e74a5fc76b3993027567bc558f4a3c014bca866cc" in 10.363646273s
  Warning  Failed          20m                 kubelet            Error: container create failed: time="2023-03-20T12:15:12Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"
  Warning  Failed          20m                 kubelet            Error: container create failed: time="2023-03-20T12:15:13Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"
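A hypothetical way to confirm that the binary is really absent from the pulled image (not part of the original report; assumes podman is available and the quay.io/rhceph-dev registry is accessible) is to list the expected path inside the image directly:

$ podman run --rm --entrypoint /bin/ls \
    quay.io/rhceph-dev/odf4-ocs-rhel9-operator@sha256:3bef03dfa68d265327cf8f9e74a5fc76b3993027567bc558f4a3c014bca866cc \
    -l /usr/local/bin/

If status-reporter does not appear in that listing, the image content matches the kubelet error above.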
===================================================================

Version of all relevant components (if applicable):
odf-operator.v4.13.0-107.stable
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.30   True        False         173m    Cluster version is 4.11.30
====================================================================

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, this impacts the connection between the provider and consumer clusters.

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Reporting the first occurrence.

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:
The issue is not seen in ODF 4.10 (tested with the deployer) on ROSA.

Steps to Reproduce:
1. Install the ODF to ODF cluster configuration on ROSA.
2. Check the status of the "report-status-to-provider" pods, if present (example commands below).
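A minimal way to perform step 2 on the consumer cluster (hypothetical commands; use the namespace ODF was installed into, which is "openshift-storage" by default but "odf-storage" in this report, see Additional info below):

$ oc get pods -n <storage-namespace> | grep report-status-to-provider
$ oc describe pod <report-status-to-provider-pod-name> -n <storage-namespace>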

Actual results:
Pods "report-status-to-provider" in CreateContainerError state

Expected results:
Pods "report-status-to-provider" (if present) should be in completed or running state on consumer cluster.
=================================================
Additional info:

Used "odf-storage" as the namespace instead of "openshift-storage"

Comment 8 Joy John Pinto 2023-03-30 13:13:21 UTC
Verified with OCP 4.11.0-0.nightly-2023-03-28-010148 and ODF 4.12.2-2.


output from consumer:

$ oc get pods -o wide
NAME                                               READY   STATUS        RESTARTS   AGE     IP             NODE                                        NOMINATED NODE   READINESS GATES
csi-addons-controller-manager-589f665dc4-phskb     2/2     Running       0          3h8m    10.131.0.8     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-96nzz                             2/2     Running       0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-c6kx7                             2/2     Running       0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-jqd5m       5/5     Running       0          137m    10.128.2.17    ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-snnph       5/5     Running       0          137m    10.131.0.14    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-zw99s                             2/2     Running       0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-89rbb                                3/3     Running       0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-j4xzd         6/6     Running       0          137m    10.129.2.27    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-l56z7         6/6     Running       0          137m    10.131.0.12    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-t9j7f                                3/3     Running       0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-xxvdz                                3/3     Running       0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
noobaa-operator-58bfbddd48-ctcx2                   1/1     Running       0          3h9m    10.129.2.19    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-metrics-exporter-55f8579b7b-jlkhx              1/1     Running       0          3h8m    10.129.2.23    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-operator-844bd4b4b4-z64bz                      1/1     Running       0          3h8m    10.131.0.6     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-console-79f8969d6-z59l9                        1/1     Running       0          3h8m    10.131.0.5     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-operator-controller-manager-68f5bc664d-zb46j   2/2     Running       0          3h8m    10.129.2.21    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992900-5txjv           0/1     Completed     0          2m48s   10.129.2.173   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992901-dk6w6           0/1     Completed     0          108s    10.129.2.174   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992902-4r7hg           0/1     Completed     0          48s     10.129.2.175   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-operator-776674fd6-rnxbr                 1/1     Running       0          137m    10.129.2.26    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-tools-555997499-kvl92                    1/1     Running       0          11s     10.129.2.176   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>

The report-status-to-provider pods are in Completed state, which is as expected. Hence, closing the bug.
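For completeness, a hedged way to double-check the verified state (not part of the original comment; the pod name is taken from the listing above) is to confirm that the latest run completed and emitted logs:

$ oc get pod report-status-to-provider-27992902-4r7hg
$ oc logs report-status-to-provider-27992902-4r7hg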

Comment 13 errata-xmlrpc 2023-04-17 22:34:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.12.2 Bug Fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1816