Description of problem (please be as detailed as possible and provide log snippets):

In an ODF to ODF cluster on ROSA, many "report-status-to-provider" pods on the consumer cluster are in the "CreateContainerError" state. The kubelet events are:

  Normal   Pulled   20m   kubelet   Successfully pulled image "quay.io/rhceph-dev/odf4-ocs-rhel9-operator@sha256:3bef03dfa68d265327cf8f9e74a5fc76b3993027567bc558f4a3c014bca866cc" in 10.363646273s
  Warning  Failed   20m   kubelet   Error: container create failed: time="2023-03-20T12:15:12Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"
  Warning  Failed   20m   kubelet   Error: container create failed: time="2023-03-20T12:15:13Z" level=error msg="runc create failed: unable to start container process: exec: \"/usr/local/bin/status-reporter\": stat /usr/local/bin/status-reporter: no such file or directory"

===================================================================
Version of all relevant components (if applicable):
odf-operator.v4.13.0-107.stable

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.30   True        False         173m    Cluster version is 4.11.30
====================================================================
Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, this impacts the connection between the provider and consumer clusters.

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Reporting the first occurrence.

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:
The issue is not seen in ODF 4.10 (tested with the deployer) on ROSA.

Steps to Reproduce:
1. Install the ODF to ODF cluster configuration on ROSA.
2. Check the status of the "report-status-to-provider" pods (if present).

Actual results:
"report-status-to-provider" pods are in the CreateContainerError state.

Expected results:
"report-status-to-provider" pods (if present) should be in the Completed or Running state on the consumer cluster.

=================================================
Additional info:
Used "odf-storage" as the namespace instead of "openshift-storage".
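Step 2 of the reproducer can be sketched as a shell check (an illustration, not from the report: it assumes `oc` is logged in to the consumer cluster and uses the `odf-storage` namespace mentioned above; the sample data is abbreviated from this report's output):

```shell
# Hypothetical check against a live consumer cluster:
#   oc get pods -n odf-storage | grep report-status-to-provider

# The same filter, exercised on captured "oc get pods" output, prints the
# names of pods stuck in CreateContainerError (column 3 is STATUS):
sample_output='NAME                                       READY   STATUS                 RESTARTS   AGE
report-status-to-provider-27992900-5txjv   0/1     CreateContainerError   0          2m48s
csi-rbdplugin-t9j7f                        3/3     Running                0          137m'

printf '%s\n' "$sample_output" | awk '$3 == "CreateContainerError" {print $1}'
```

An empty result (or only Completed/Running rows) would indicate the expected healthy state.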
The issue is not seen with the odf-operator.v4.13.0-109.stable build, tested in the same configuration as in comment #0. Output from the consumer cluster:

$ oc get pods -o wide
NAME                                               READY   STATUS      RESTARTS   AGE     IP             NODE                                        NOMINATED NODE   READINESS GATES
csi-addons-controller-manager-589f665dc4-phskb     2/2     Running     0          3h8m    10.131.0.8     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-96nzz                             2/2     Running     0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-c6kx7                             2/2     Running     0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-jqd5m       5/5     Running     0          137m    10.128.2.17    ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-provisioner-97d6d5556-snnph       5/5     Running     0          137m    10.131.0.14    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-zw99s                             2/2     Running     0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-89rbb                                3/3     Running     0          137m    10.0.17.113    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-j4xzd         6/6     Running     0          137m    10.129.2.27    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-56c9db9899-l56z7         6/6     Running     0          137m    10.131.0.12    ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-t9j7f                                3/3     Running     0          137m    10.0.20.236    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-xxvdz                                3/3     Running     0          137m    10.0.13.90     ip-10-0-13-90.us-east-2.compute.internal    <none>           <none>
noobaa-operator-58bfbddd48-ctcx2                   1/1     Running     0          3h9m    10.129.2.19    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-metrics-exporter-55f8579b7b-jlkhx              1/1     Running     0          3h8m    10.129.2.23    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
ocs-operator-844bd4b4b4-z64bz                      1/1     Running     0          3h8m    10.131.0.6     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-console-79f8969d6-z59l9                        1/1     Running     0          3h8m    10.131.0.5     ip-10-0-17-113.us-east-2.compute.internal   <none>           <none>
odf-operator-controller-manager-68f5bc664d-zb46j   2/2     Running     0          3h8m    10.129.2.21    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992900-5txjv           0/1     Completed   0          2m48s   10.129.2.173   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992901-dk6w6           0/1     Completed   0          108s    10.129.2.174   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
report-status-to-provider-27992902-4r7hg           0/1     Completed   0          48s     10.129.2.175   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-operator-776674fd6-rnxbr                 1/1     Running     0          137m    10.129.2.26    ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>
rook-ceph-tools-555997499-kvl92                    1/1     Running     0          11s     10.129.2.176   ip-10-0-20-236.us-east-2.compute.internal   <none>           <none>

The report-status-to-provider pods are in the Completed state. This is expected [the expected state is Completed or Running if the report-status-to-provider pod is present]. OCP version is 4.11.31.
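The fix can also be sanity-checked at the image level (a sketch, not part of the original report: it assumes podman is available locally and the operator image is pullable) by checking for the binary that the failing container tried to exec. The missing path can likewise be recovered mechanically from the kubelet event captured in comment #0:

```shell
# Hypothetical image-level check (assumes podman is installed and the image
# digest from comment #0 is pullable):
#   podman run --rm --entrypoint /bin/sh \
#     quay.io/rhceph-dev/odf4-ocs-rhel9-operator@sha256:3bef03dfa68d265327cf8f9e74a5fc76b3993027567bc558f4a3c014bca866cc \
#     -c 'stat /usr/local/bin/status-reporter'

# Extracting the missing path from the runc error captured in comment #0:
event='runc create failed: unable to start container process: exec: "/usr/local/bin/status-reporter": stat /usr/local/bin/status-reporter: no such file or directory'
printf '%s\n' "$event" | sed -n 's/.*stat \([^:]*\):.*/\1/p'
```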
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:3742