>> we verify the product bug via regression runs and then clone it to MS and verify the actual working of the fix once we get ODF 4.11 in our Managed Services clusters

Yes, that should be the approach for core product fixes; it should not cause any regression. For the actual testing you can open a MS BZ.
Verified in version:
ODF 4.11.1-8
OCP 4.10.30
ocs-osd-deployer.v2.0.5

Followed steps 1, 2, and 3 given below.

1. All the report-status-to-provider pods in consumer clusters should be in Running or Completed state, as they are created from a CronJob.

The "report-status-to-provider" pods are in Completed state:

$ oc get pods -o wide | grep report-status-to-provider
report-status-to-provider-27716284-gtw7x   0/1   Completed   0   3m7s   10.131.0.52   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716285-562t9   0/1   Completed   0   2m7s   10.131.0.53   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716286-js66g   0/1   Completed   0   67s    10.131.0.54   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716287-c8f26   0/1   Completed   0   7s     10.131.0.55   ip-10-0-157-29.ec2.internal   <none>   <none>

2. Check the status of the StorageConsumer CR. It should have a lastHeartbeat field with a timestamp, and this timestamp should get updated every minute.

The "lastHeartbeat" value is changing every minute.
$ oc get storageconsumer -o yaml | grep lastHeartbeat
    lastHeartbeat: "2022-09-12T10:08:06Z"

$ oc get storageconsumer -o yaml | grep lastHeartbeat
    lastHeartbeat: "2022-09-12T10:09:08Z"

$ oc get storageconsumer -o yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1alpha1
  kind: StorageConsumer
  metadata:
    annotations:
      ocs.openshift.io/provider-onboarding-ticket: |
        eyJpZCI6ImNkNjU0MDlhLTA2NDgtNGJlOS1hMjViLTk3ODJhMDlmMDZkNCIsImV4cGlyYXRpb25EYXRlIjoiMTY2MzEzNjc3MiJ9.iI/NdR4ZwI9wLU0U1TvzaMDqKgYQ0rMqV2AX78Z0wD66JUAwGatFLJ8gTkFD2ey5N0B/5pmNqdXVCsvmOseYnPER60C9SmfGXnm+Y9GgOlyxWjbQkNggTMAG59yWYj5rE0jvRZxyyZCu/O/0iiqOIp11pExRfHOHoYZRPPAjnzRUsNgon5U5Qs27LBzuy8qSQsTpY+I46Q0Mpwh5b4xxOEGq8tMwpjPXUT4p90MWKMVzfuELKPPjCf3eXPok4qO1hXLOtsa8y4zYg5MSmEqBr63rVcJd3+jQjrtSa+rb5VtfnBx254k+FJGR6j+MQGqeDTWVbh8zdCRgQyj9+VrC4bBuAZF+1wA/OKzaAzcT8oDkhssxhhNkVYNCqgFdL3KZlN2VSGcFDjJ+Ww+8Z9ObRbGZeI2gy3IqIAVCEgJtOqR7bGs/e+/uSTxJJ115XxOPl4FUFPnPCkbKnJaa/jJvqkx7p956Dg+DA8TJBLFPktnNnaTJJR2qNBx+TSNzJOkq5pPnVS/NT3CxicV/2nSHpdfjHEfKMf1cY+LEtbKUAt55wrT0b2uf5fhNS0teDtT+Y189OeBrM2HTs4NmH3Tk+Fa+BVKkanoX0wRL/NsVtMy8vYruNSbsfUCEZJ62QGPDt7WKshzqz+BvfdbxcsqPDpV8sLfEAmDxin70Rb9nQGI=
    creationTimestamp: "2022-09-12T06:27:59Z"
    finalizers:
    - storagesconsumer.ocs.openshift.io
    generation: 2
    name: storageconsumer-18595625-7c68-49c0-a411-88bf84b23b60
    namespace: openshift-storage
    resourceVersion: "613508"
    uid: ea0978f7-03cf-480f-a2ca-cbd74d24f8fe
  spec:
    capacity: 1Pi
    enable: true
  status:
    cephResources:
    - kind: CephClient
      name: 370e476009884d204effbc012fb6b36d
      status: Ready
    grantedCapacity: 1Pi
    lastHeartbeat: "2022-09-12T10:09:08Z"
    state: Ready
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

3. Update the storageProviderEndpoint to some random value and check the lastHeartbeat field; it should not update anymore.

$ rosa edit addon ocs-consumer-qe -c jijoy-s12-c1
? Storage Provider API Endpoint: 10.0.102.2:31222
(Added a random value for the endpoint, keeping everything else at the current value. 10.0.102.2 does not exist.)

The value changed in the consumer cluster:

$ oc get storagecluster -o yaml | grep storageProviderEndpoint
    storageProviderEndpoint: 10.0.102.2:31222

The value of the lastHeartbeat field in the StorageConsumer CR is not changing:

$ date && oc get storageconsumer -o yaml | grep lastHeartbeat && sleep 300 && oc get storageconsumer -o yaml | grep lastHeartbeat
Mon Sep 12 04:11:58 PM IST 2022
    lastHeartbeat: "2022-09-12T10:29:07Z"
    lastHeartbeat: "2022-09-12T10:29:07Z"

$ oc get storageconsumer -o yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1alpha1
  kind: StorageConsumer
  metadata:
    annotations:
      ocs.openshift.io/provider-onboarding-ticket: |
        eyJpZCI6ImNkNjU0MDlhLTA2NDgtNGJlOS1hMjViLTk3ODJhMDlmMDZkNCIsImV4cGlyYXRpb25EYXRlIjoiMTY2MzEzNjc3MiJ9.iI/NdR4ZwI9wLU0U1TvzaMDqKgYQ0rMqV2AX78Z0wD66JUAwGatFLJ8gTkFD2ey5N0B/5pmNqdXVCsvmOseYnPER60C9SmfGXnm+Y9GgOlyxWjbQkNggTMAG59yWYj5rE0jvRZxyyZCu/O/0iiqOIp11pExRfHOHoYZRPPAjnzRUsNgon5U5Qs27LBzuy8qSQsTpY+I46Q0Mpwh5b4xxOEGq8tMwpjPXUT4p90MWKMVzfuELKPPjCf3eXPok4qO1hXLOtsa8y4zYg5MSmEqBr63rVcJd3+jQjrtSa+rb5VtfnBx254k+FJGR6j+MQGqeDTWVbh8zdCRgQyj9+VrC4bBuAZF+1wA/OKzaAzcT8oDkhssxhhNkVYNCqgFdL3KZlN2VSGcFDjJ+Ww+8Z9ObRbGZeI2gy3IqIAVCEgJtOqR7bGs/e+/uSTxJJ115XxOPl4FUFPnPCkbKnJaa/jJvqkx7p956Dg+DA8TJBLFPktnNnaTJJR2qNBx+TSNzJOkq5pPnVS/NT3CxicV/2nSHpdfjHEfKMf1cY+LEtbKUAt55wrT0b2uf5fhNS0teDtT+Y189OeBrM2HTs4NmH3Tk+Fa+BVKkanoX0wRL/NsVtMy8vYruNSbsfUCEZJ62QGPDt7WKshzqz+BvfdbxcsqPDpV8sLfEAmDxin70Rb9nQGI=
    creationTimestamp: "2022-09-12T06:27:59Z"
    finalizers:
    - storagesconsumer.ocs.openshift.io
    generation: 2
    name: storageconsumer-18595625-7c68-49c0-a411-88bf84b23b60
    namespace: openshift-storage
    resourceVersion: "651745"
    uid: ea0978f7-03cf-480f-a2ca-cbd74d24f8fe
  spec:
    capacity: 1Pi
    enable: true
  status:
    cephResources:
    - kind: CephClient
      name: 370e476009884d204effbc012fb6b36d
      status: Ready
    grantedCapacity: 1Pi
    lastHeartbeat: "2022-09-12T10:29:07Z"
    state: Ready
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

The "report-status-to-provider" pods on the consumer cluster are no longer in the correct state. This is expected with a wrong value of storageProviderEndpoint.

$ oc get pods -o wide | grep report-status-to-provider
report-status-to-provider-27716307-g2w45   0/1   Completed          0             17m    10.129.2.68   ip-10-0-173-22.ec2.internal   <none>   <none>
report-status-to-provider-27716308-v99ft   0/1   Completed          0             16m    10.131.0.66   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716309-66wf4   0/1   Completed          0             15m    10.131.0.67   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716318-gv6fg   0/1   CrashLoopBackOff   5 (109s ago)  6m7s   10.129.2.74   ip-10-0-173-22.ec2.internal   <none>   <none>
report-status-to-provider-27716319-6tgtc   0/1   CrashLoopBackOff   5 (43s ago)   5m7s   10.131.0.74   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716320-fz9jg   1/1   Running            5 (90s ago)   4m7s   10.129.2.75   ip-10-0-173-22.ec2.internal   <none>   <none>
report-status-to-provider-27716321-2jnrt   0/1   CrashLoopBackOff   4 (26s ago)   3m7s   10.131.0.75   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716322-xndcm   0/1   CrashLoopBackOff   3 (23s ago)   2m7s   10.131.0.76   ip-10-0-157-29.ec2.internal   <none>   <none>
report-status-to-provider-27716323-rgdz4   0/1   Error              2 (37s ago)   67s    10.129.2.76   ip-10-0-173-22.ec2.internal   <none>   <none>
report-status-to-provider-27716324-xtrlx   1/1   Running            0             7s     10.129.2.77   ip-10-0-173-22.ec2.internal   <none>   <none>
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.11.1 Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6525