Description of problem (please be as detailed as possible and provide log snippets):
------------------------------------------------------------------------
OCS storagecluster is in Progressing state and a few of the noobaa pods are missing in the latest OCS 4.7 build.
Observed the same issue in both Internal (build email - ocs-ci) and Internal-attached (manually deployed on AWS I3 via UI) deployments.

$ oc get csv -A
NAMESPACE           NAME                         DISPLAY                       VERSION        REPLACES   PHASE
openshift-storage   ocs-operator.v4.7.0-223.ci   OpenShift Container Storage   4.7.0-223.ci              Succeeded

$ oc get storagecluster -A
NAMESPACE           NAME                 AGE     PHASE         EXTERNAL   CREATED AT             VERSION
openshift-storage   ocs-storagecluster   6h28m   Progressing              2021-01-05T07:49:08Z   4.8.0

It is seen that the noobaa db and endpoint pods are missing:

$ oc get pods -o wide -A | grep noobaa
openshift-storage   noobaa-core-0                     1/1   Running   0   6h29m   10.129.2.36   ip-10-0-222-187.us-east-2.compute.internal   <none>   <none>
openshift-storage   noobaa-operator-9f697d45c-9l8cq   1/1   Running   0   6h44m   10.129.2.24   ip-10-0-222-187.us-east-2.compute.internal   <none>   <none>

Version of all relevant components (if applicable):
===================================================
OCP = 4.7.0-0.nightly-2021-01-04-215816
OCS = ocs-operator.v4.7.0-223.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
------------------------------------------------------------------
Yes, the deployment of OCS is failing.

Is there any workaround available to the best of your knowledge?
----------------------------------------------
Not sure

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
------------------------------------------
3

Is this issue reproducible?
------------------------------
Yes

Can this issue be reproduced from the UI?
-----------------------------------
Yes, we tested both ocs-ci based deployments [1] and manual deployment on AWS I3 via UI [2].

If this is a regression, please provide more details to justify this:
====================================================================
Yes

Steps to Reproduce:
======================
1. Install OCP 4.7
2. Deploy OCS build 4.7.0-223 either via ocs-ci or manually from the UI
3. Check the storagecluster phase and the noobaa pods in the openshift-storage namespace (see the triage sketch below)
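The commands below are not from the original run; they are a minimal triage sketch for step 3, assuming the standard oc client, the openshift-storage namespace, the default NooBaa CR name "noobaa", and the noobaa-operator/ocs-operator deployment names implied by the pod listings, to see why the noobaa db and endpoint pods never appear:

$ oc get noobaa -n openshift-storage
$ oc describe noobaa noobaa -n openshift-storage               # NooBaa CR phase and conditions
$ oc logs deployment/noobaa-operator -n openshift-storage --tail=200
$ oc logs deployment/ocs-operator -n openshift-storage --tail=200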
Actual results:
---------------------
Storagecluster is stuck in Progressing state and some noobaa pods (db and endpoint) are not created.

Expected results:
=====================
Storagecluster creation should succeed.

Build emails and run details
================================
Email - A new OCS 4.7 build is available: ocs-registry:4.7.0-223.ci
[1] - OCS CI run: https://storage-jenkins-csb-ceph.cloud.paas.psi.redhat.com/job/ocs-ci/188/
[2] - Manual install on AWS I3 - LSO from UI: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/16013/

Additional info:
======================
Outputs from LSO cluster

$ oc describe storagecluster ocs-storagecluster -n openshift-storage
Status:
  Conditions:
    Last Heartbeat Time:   2021-01-05T14:18:34Z
    Last Transition Time:  2021-01-05T07:49:10Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-01-05T07:49:16Z
    Last Transition Time:  2021-01-05T07:49:11Z
    Message:               CephCluster resource is not reporting status
    Reason:                CephClusterStatus
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-01-05T14:18:34Z
    Last Transition Time:  2021-01-05T07:49:11Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-01-05T07:49:10Z
    Last Transition Time:  2021-01-05T07:49:08Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-01-05T07:50:40Z
    Last Transition Time:  2021-01-05T07:49:11Z
    Message:               CephCluster is creating: Cluster is creating
    Reason:                ClusterStateCreating
    Status:                False
    Type:                  Upgradeable

Pod list from a previous build where the db and endpoint pods were created
===========================================================================
Run ID = https://storage-jenkins-csb-ceph.cloud.paas.psi.redhat.com/job/ocs-ci/187/

NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.7.0-222.ci   OpenShift Container Storage   4.7.0-222.ci              Succeeded

NAME   READY   STATUS   RESTARTS   AGE   IP   NODE   NOMINATED NODE   READINESS GATES
csi-cephfsplugin-kkrmk   3/3   Running   0   37m   10.0.142.40   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
csi-cephfsplugin-lnh56   3/3   Running   0   37m   10.0.166.255   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
csi-cephfsplugin-provisioner-5c5b96fb84-ttbbp   6/6   Running   0   37m   10.128.2.17   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
csi-cephfsplugin-provisioner-5c5b96fb84-xgblr   6/6   Running   0   37m   10.131.0.33   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
csi-cephfsplugin-wmvz4   3/3   Running   0   37m   10.0.234.224   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
csi-rbdplugin-2fj66   3/3   Running   0   37m   10.0.142.40   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
csi-rbdplugin-p48bq   3/3   Running   0   37m   10.0.166.255   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
csi-rbdplugin-pkp2l   3/3   Running   0   37m   10.0.234.224   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
csi-rbdplugin-provisioner-55c8b8c747-5kq9d   6/6   Running   0   37m   10.128.2.16   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
csi-rbdplugin-provisioner-55c8b8c747-m5klz   6/6   Running   0   37m   10.129.2.14   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
noobaa-core-0   1/1   Running   0   34m   10.131.0.36   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
noobaa-db-pg-0   1/1   Running   0   34m   10.129.2.20   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
noobaa-endpoint-5f4665cb7c-nldm2   1/1   Running   0   32m   10.129.2.23   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
noobaa-endpoint-5f4665cb7c-tktnw   1/1   Running   0   16m   10.128.2.26   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
noobaa-operator-f7cf9598d-j74p8   1/1   Running   0   38m   10.128.2.14   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
ocs-metrics-exporter-8cd4d6857-m29kc   1/1   Running   0   38m   10.131.0.31   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
ocs-operator-869fc6777c-4b4mj   1/1   Running   0   37m   10.131.0.32   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-crashcollector-ip-10-0-142-40-65c46c9f49-tjlhj   1/1   Running   0   35m   10.129.2.17   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-crashcollector-ip-10-0-166-255-75c48f9cfc-br2hr   1/1   Running   0   36m   10.131.0.37   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-crashcollector-ip-10-0-234-224-694b5b7dc5-nhtjw   1/1   Running   0   35m   10.128.2.19   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-5b944987wtkq2   1/1   Running   0   34m   10.128.2.22   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-5b6c5955zr8rr   1/1   Running   0   34m   10.131.0.38   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-mgr-a-5668bd7756-g6dpc   1/1   Running   0   35m   10.129.2.16   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-mon-a-6cfdf59957-9l7dv   1/1   Running   0   36m   10.131.0.34   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-mon-b-5bbf66dc8-gsxqx   1/1   Running   0   35m   10.129.2.15   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-mon-c-85c47768ff-65lpd   1/1   Running   0   35m   10.128.2.18   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
rook-ceph-operator-5bd49bb764-2xvbp   1/1   Running   0   38m   10.129.2.13   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-0-58d796f464-458f7   1/1   Running   0   34m   10.131.0.39   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-1-5df77df6cf-5wt4f   1/1   Running   0   34m   10.128.2.21   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-2-848b5d8858-ck7k9   1/1   Running   0   34m   10.129.2.19   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-prepare-ocs-deviceset-0-data-0-ll2l8-7md26   0/1   Completed   0   35m   10.129.2.18   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-prepare-ocs-deviceset-1-data-0-62mjc-4x9ws   0/1   Completed   0   35m   10.131.0.35   ip-10-0-166-255.us-west-1.compute.internal   <none>   <none>
rook-ceph-osd-prepare-ocs-deviceset-2-data-0-pzwqz-k6zsr   0/1   Completed   0   34m   10.128.2.20   ip-10-0-234-224.us-west-1.compute.internal   <none>   <none>
rook-ceph-tools-8575486ffd-p77ks   1/1   Running   0   33m   10.0.142.40   ip-10-0-142-40.us-west-1.compute.internal   <none>   <none>
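Not part of the original report - for comparison, a quick check on the failing 4.7.0-223 cluster of whether the owning workloads for the missing pods were created at all (in the healthy 4.7.0-222 run above, the noobaa-db-pg-0 name suggests a StatefulSet owner and noobaa-endpoint a Deployment owner), plus recent noobaa-related events:

$ oc get statefulset,deployment -n openshift-storage | grep noobaa
$ oc get events -n openshift-storage --sort-by=.lastTimestamp | grep -i noobaa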
@Umanga, can you please check https://bugzilla.redhat.com/show_bug.cgi?id=1912894#c5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041