Description of problem (please be as detailed as possible and provide log snippets):
storagecluster is stuck in the Progressing state.

Version of all relevant components (if applicable):
odf-operator.v4.9.0-120.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
1/1

Can this issue reproduce from the UI?
Not tried

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Install OCS using ocs-ci (PR 4647)
2. Check storagecluster status
3.

Actual results:
$ oc get storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   44m   Progressing              2021-09-01T07:40:26Z   4.9.0

Expected results:
storagecluster should be in the Ready phase.

Additional info:

> All CSVs are in the Succeeded phase

$ oc get csv
NAME                            DISPLAY                       VERSION        REPLACES   PHASE
noobaa-operator.v4.9.0-120.ci   NooBaa Operator               4.9.0-120.ci              Succeeded
ocs-operator.v4.9.0-120.ci      OpenShift Container Storage   4.9.0-120.ci              Succeeded
odf-operator.v4.9.0-120.ci      OpenShift Data Foundation     4.9.0-120.ci              Succeeded

> storagecluster status

$ oc get storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   47m   Progressing              2021-09-01T07:40:26Z   4.9.0

> describe of the storagecluster

$ oc describe storagecluster ocs-storagecluster
Name:       ocs-storagecluster
Namespace:  openshift-storage
Status:
  Conditions:
    Last Heartbeat Time:   2021-09-01T08:28:29Z
    Last Transition Time:  2021-09-01T07:45:41Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-09-01T07:40:26Z
    Last Transition Time:  2021-09-01T07:40:26Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-09-01T08:28:29Z
    Last Transition Time:  2021-09-01T07:40:26Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-09-01T07:40:26Z
    Last Transition Time:  2021-09-01T07:40:26Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-09-01T07:40:26Z
    Last Transition Time:  2021-09-01T07:40:26Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                Unknown
    Type:                  Upgradeable
  Failure Domain:      zone
  Failure Domain Key:  topology.kubernetes.io/zone

> noobaa operator log

time="2021-09-01T08:19:59Z" level=warning msg="⏳ Temporary Error: not enough available replicas in endpoint deployment" sys=openshift-storage/noobaa
time="2021-09-01T08:19:59Z" level=info msg="UpdateStatus: Done generation 1" sys=openshift-storage/noobaa

> job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5656/console
must gather logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthuaws2-pr4647/vavuthuaws2-pr4647_20210901T061730/logs/failed_testcase_ocs_logs_1630478606/test_deployment_ocs_logs/
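For triage, the condition that keeps the phase at Progressing can be pulled out of the cluster's JSON status (in practice from `oc get storagecluster ocs-storagecluster -n openshift-storage -o json`). A minimal sketch; the embedded sample data mirrors the describe output above rather than querying a live cluster:

```python
import json

# Sample status mirroring the `oc describe storagecluster` output above;
# on a real cluster this would be the .status field of
#   oc get storagecluster ocs-storagecluster -n openshift-storage -o json
status = json.loads("""
{
  "phase": "Progressing",
  "conditions": [
    {"type": "ReconcileComplete", "status": "True",
     "reason": "ReconcileCompleted", "message": "Reconcile completed successfully"},
    {"type": "Available", "status": "False",
     "reason": "Init", "message": "Initializing StorageCluster"},
    {"type": "Progressing", "status": "True",
     "reason": "NoobaaInitializing",
     "message": "Waiting on Nooba instance to finish initialization"}
  ]
}
""")

def blocking_conditions(status):
    """Return the conditions that explain a non-Ready phase:
    Available != True, or Degraded/Progressing == True."""
    bad = []
    for c in status.get("conditions", []):
        if (c["type"] == "Available" and c["status"] != "True") or \
           (c["type"] in ("Degraded", "Progressing") and c["status"] == "True"):
            bad.append((c["type"], c["reason"], c["message"]))
    return bad

for ctype, reason, message in blocking_conditions(status):
    print(f"{ctype}: {reason} - {message}")
```

On the data above this surfaces `Progressing: NoobaaInitializing`, which matches the "Waiting on Nooba instance to finish initialization" message in the describe output.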
Vijay, is this reproducible in the latest build?
I thought I had another reproduction here, as I saw the cluster stuck in the Progressing state, but after looking at the logs it appears to be a different issue, so I opened a new bug: https://bugzilla.redhat.com/show_bug.cgi?id=2002220
Update:
=======

> Tried with the latest build (4.9.0-129.ci) and it failed in the same state

$ oc get storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   84m   Progressing              2021-09-08T10:34:17Z   4.9.0

$ oc describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  storagesystem.odf.openshift.io/watched-by: storagesystem-odf
              uninstall.ocs.openshift.io/cleanup-policy: delete
              uninstall.ocs.openshift.io/mode: graceful
API Version:  ocs.openshift.io/v1
Status:
  Conditions:
    Last Heartbeat Time:   2021-09-08T11:58:28Z
    Last Transition Time:  2021-09-08T10:38:48Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-09-08T11:58:28Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                Unknown
    Type:                  Upgradeable

> describe of the noobaa endpoint pod

$ oc describe pod noobaa-endpoint-dcc9c5d9d-tm8wj
Name:         noobaa-endpoint-dcc9c5d9d-tm8wj
Namespace:    openshift-storage
Priority:     0
Node:         <none>
Labels:       app=noobaa
              noobaa-s3=noobaa
              pod-template-hash=dcc9c5d9d
Annotations:  openshift.io/scc: noobaa-endpoint
Status:       Pending
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---  ----               -------
  Warning  FailedScheduling  80m  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  78m  default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

> Deployment is done with lowered resource requirements

> StorageCluster manifest applied by ocs-ci:

10:34:12 - MainThread - ocs_ci.utility.templating - INFO -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  resources:
    mds:
      Limits: null
      Requests: null
    mgr:
      Limits: null
      Requests: null
    mon:
      Limits: null
      Requests: null
    noobaa-core:
      Limits: null
      Requests: null
    noobaa-db:
      Limits: null
      Requests: null
    noobaa-endpoint:
      limits:
        cpu: 1
        memory: 500Mi
      requests:
        cpu: 1
        memory: 500Mi
    rgw:
      Limits: null
      Requests: null
  storageDeviceSets:
  - count: 1
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: gp2
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: true
    replica: 3
    resources:
      Limits: null
      Requests: null

> As part of bug https://bugzilla.redhat.com/show_bug.cgi?id=1885313, we set explicit values for the noobaa endpoint instead of empty objects.

> Not sure what changed recently, but deployment with the above StorageCluster values used to be successful.

Job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5852/console

must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthu-bz027/vavuthu-bz027_20210908T091752/logs/failed_testcase_ocs_logs_1631093767/test_deployment_ocs_logs/
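The "Insufficient cpu" events mean none of the three workers had a full 1 CPU of unreserved allocatable capacity for the noobaa-endpoint request above. A rough back-of-envelope check of that scheduler decision; the node sizes and already-reserved amounts below are hypothetical illustration values, not taken from the must-gather:

```python
# Scheduler-style fit check in millicores: a pod's CPU request must fit in
# the node's allocatable CPU minus the requests already placed on it.
def fits(node_allocatable_mcpu, reserved_mcpu, request_mcpu):
    return node_allocatable_mcpu - reserved_mcpu >= request_mcpu

endpoint_request_mcpu = 1000  # cpu: 1 from the noobaa-endpoint spec above

# Hypothetical worker: ~3500m allocatable with 2800m already requested by
# other OCS pods; only 700m remains, so the 1000m request cannot schedule.
print(fits(3500, 2800, endpoint_request_mcpu))  # False -> pod stays Pending

# A smaller request (e.g. 500m) would fit the same node.
print(fits(3500, 2800, 500))  # True
```

This is only an illustration of why the pod stays Pending with a `cpu: 1` request; the actual allocatable/reserved numbers would need to come from `oc describe node` on the affected workers.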