Bug 2179976

Summary: [ODF 4.13] Missing status-reporter binary causes "report-status-to-provider" pods to remain in CreateContainerError on ODF to ODF cluster on ROSA
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: build
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Severity: high
Priority: unspecified
Keywords: Regression
Type: Bug
Status: CLOSED ERRATA
Doc Type: No Doc Update
Target Milestone: ---
Target Release: ODF 4.13.0
Reporter: Jilju Joy <jijoy>
Assignee: Boris Ranto <branto>
QA Contact: Jilju Joy <jijoy>
CC: branto, muagarwa, nberry, nigoyal, ocs-bugs, odf-bz-bot
Clones: 2179978
Bug Blocks: 2179978
Last Closed: 2023-06-21 15:24:39 UTC

Description Jilju Joy 2023-03-20 13:14:42 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
In an ODF to ODF cluster on ROSA, many "report-status-to-provider" pods on the consumer cluster are stuck in the "CreateContainerError" status.

Error is:

Normal   Pulled          20m                 kubelet            Successfully pulled image "quay.io/rhceph-dev/odf4-ocs-rhel9-operator@sha256:3bef03dfa68d265327cf8f9e74a5fc76b3993027567bc558f4a3c014bca866cc" in 10.363646273s
  Warning  Failed          20m                 kubelet            Error: container create failed: time="2023-03-20T12:15:12Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"
  Warning  Failed          20m                 kubelet            Error: container create failed: time="2023-03-20T12:15:13Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"
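
For reference, a minimal sketch of how the failing pods can be inspected on the consumer cluster. The "odf-storage" namespace is taken from the Additional info below, the pod name is a placeholder, and the last command assumes the ocs-operator deployment runs the same odf4-ocs-rhel9-operator image referenced in the events above:

$ oc get pods -n odf-storage | grep report-status-to-provider
$ oc describe pod -n odf-storage report-status-to-provider-<id>    # shows the "no such file or directory" events above
$ oc debug -n odf-storage deployment/ocs-operator -- ls -l /usr/local/bin/status-reporter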
===================================================================

Version of all relevant components (if applicable):
odf-operator.v4.13.0-107.stable
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.30   True        False         173m    Cluster version is 4.11.30
====================================================================

Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?
Yes, this impacts the connection between the provider and consumer clusters.

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Reporting the first occurrence.

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:
The issue is not seen in ODF 4.10 (tested with the deployer) on ROSA.

Steps to Reproduce:
1. Install the ODF to ODF cluster configuration on ROSA.
2. Check the status of the "report-status-to-provider" pods (if present); see the sketch below.
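
A minimal sketch of the check in step 2, assuming the "odf-storage" namespace noted under Additional info:

$ oc get pods -n odf-storage | grep report-status-to-provider
$ oc get events -n odf-storage --field-selector reason=Failed      # surfaces the "runc create failed" kubelet events, if any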

Actual results:
Pods "report-status-to-provider" in CreateContainerError state

Expected results:
Pods "report-status-to-provider" (if present) should be in completed or running state on consumer cluster.
=================================================
Additional info:

Used "odf-storage" as the namespace instead of "openshift-storage"

Comment 2 Jilju Joy 2023-03-24 06:02:48 UTC
The issue was not seen with the odf-operator.v4.13.0-109.stable build.
Tested in the same configuration given in comment #0.

output from consumer:

$ oc get pods -o wide
NAME                                               READY   STATUS        RESTARTS   AGE     IP             NODE                                        NOMINATED NODE   READINESS GATES
csi-addons-controller-manager-589f665dc4-phskb     2/2     Running       0          3h8m    10.131.0.8     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-96nzz                             2/2     Running       0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-c6kx7                             2/2     Running       0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-jqd5m       5/5     Running       0          137m    10.128.2.17    ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-snnph       5/5     Running       0          137m    10.131.0.14    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-zw99s                             2/2     Running       0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-89rbb                                3/3     Running       0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-j4xzd         6/6     Running       0          137m    10.129.2.27    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-l56z7         6/6     Running       0          137m    10.131.0.12    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-t9j7f                                3/3     Running       0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-xxvdz                                3/3     Running       0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
noobaa-operator-58bfbddd48-ctcx2                   1/1     Running       0          3h9m    10.129.2.19    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-metrics-exporter-55f8579b7b-jlkhx              1/1     Running       0          3h8m    10.129.2.23    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-operator-844bd4b4b4-z64bz                      1/1     Running       0          3h8m    10.131.0.6     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-console-79f8969d6-z59l9                        1/1     Running       0          3h8m    10.131.0.5     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-operator-controller-manager-68f5bc664d-zb46j   2/2     Running       0          3h8m    10.129.2.21    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992900-5txjv           0/1     Completed     0          2m48s   10.129.2.173   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992901-dk6w6           0/1     Completed     0          108s    10.129.2.174   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992902-4r7hg           0/1     Completed     0          48s     10.129.2.175   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-operator-776674fd6-rnxbr                 1/1     Running       0          137m    10.129.2.26    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-tools-555997499-kvl92                    1/1     Running       0          11s     10.129.2.176   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>

The report-status-to-provider pods are in the Completed state. This is expected [the expected state is Completed or Running when the report-status-to-provider pods are present].

OCP version is 4.11.31.
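
A hedged sketch of the same verification, assuming the report-status-to-provider pods are created by a CronJob of the same name (the minute-style numeric suffixes in the pod names suggest this) and that the "odf-storage" namespace from comment #0 is used:

$ oc get cronjob -n odf-storage report-status-to-provider
$ oc get pods -n odf-storage | grep report-status-to-provider      # each job pod should end up Completed (or Running), not CreateContainerError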

Comment 13 errata-xmlrpc 2023-06-21 15:24:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742