Bug 2000027
| Summary: | [AWS]: [odf-operator.v4.9.0-120.ci] storagecluster is in Progressing state | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Vijay Avuthu <vavuthu> |
| Component: | Multi-Cloud Object Gateway | Assignee: | Jacky Albo <jalbo> |
| Status: | CLOSED NOTABUG | QA Contact: | Raz Tamir <ratamir> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.9 | CC: | ebenahar, etamir, kramdoss, madam, muagarwa, nbecker, ocs-bugs, odf-bz-bot, pbalogh, sostapov |
| Target Milestone: | --- | Keywords: | Automation |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-05 07:13:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Vijay Avuthu
2021-09-01 08:33:51 UTC
Vijay, is this reproducible in the latest build? I thought that I had another reproduce here, as I saw the cluster was stuck in the Progressing state, but after looking at the logs it looks like a different issue, so I opened a new bug: https://bugzilla.redhat.com/show_bug.cgi?id=2002220

Update:
=======

> Tried with the latest build, 4.9.0-129.ci, and it failed in the same state:

```
$ oc get storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   84m   Progressing              2021-09-08T10:34:17Z   4.9.0
```

```
$ oc describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  storagesystem.odf.openshift.io/watched-by: storagesystem-odf
              uninstall.ocs.openshift.io/cleanup-policy: delete
              uninstall.ocs.openshift.io/mode: graceful
API Version:  ocs.openshift.io/v1
Status:
  Conditions:
    Last Heartbeat Time:   2021-09-08T11:58:28Z
    Last Transition Time:  2021-09-08T10:38:48Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-09-08T11:58:28Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-09-08T10:34:17Z
    Last Transition Time:  2021-09-08T10:34:17Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                Unknown
    Type:                  Upgradeable
```

> describe of the noobaa-endpoint pod:

```
$ oc describe pod noobaa-endpoint-dcc9c5d9d-tm8wj
Name:         noobaa-endpoint-dcc9c5d9d-tm8wj
Namespace:    openshift-storage
Priority:     0
Node:         <none>
Labels:       app=noobaa
              noobaa-s3=noobaa
              pod-template-hash=dcc9c5d9d
Annotations:  openshift.io/scc: noobaa-endpoint
Status:       Pending
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  80m   default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  78m   default-scheduler  0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
```

> The deployment is done with lowered resource requirements; the StorageCluster passed by ocs-ci:

```
10:34:12 - MainThread - ocs_ci.utility.templating - INFO -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  resources:
    mds:
      Limits: null
      Requests: null
    mgr:
      Limits: null
      Requests: null
    mon:
      Limits: null
      Requests: null
    noobaa-core:
      Limits: null
      Requests: null
    noobaa-db:
      Limits: null
      Requests: null
    noobaa-endpoint:
      limits:
        cpu: 1
        memory: 500Mi
      requests:
        cpu: 1
        memory: 500Mi
    rgw:
      Limits: null
      Requests: null
  storageDeviceSets:
  - count: 1
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: gp2
        volumeMode: Block
    name: ocs-deviceset
    placement: {}
    portable: true
    replica: 3
    resources:
      Limits: null
      Requests: null
```

> As part of bug https://bugzilla.redhat.com/show_bug.cgi?id=1885313, we have explicit values for noobaa-endpoint instead of empty objects.

> Not sure what changed recently, but with the above StorageCluster values the deployment used to be successful.
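For reference, one way to confirm the "Insufficient cpu" part of the scheduling failure is to compare the workers' allocatable CPU with what is already requested on them. This is a minimal diagnostic sketch with plain oc commands, not something from the original report; it assumes the workers carry the default node-role.kubernetes.io/worker label, and the pod name is the one from the describe output above:

```
# Allocatable CPU on each worker (assumes the default worker label).
oc get nodes -l node-role.kubernetes.io/worker \
  -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu

# Per-worker "Allocated resources" summary: how much CPU is already requested
# by scheduled pods. The gap to allocatable is what is left for the 1-CPU
# noobaa-endpoint request from the StorageCluster spec above.
for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== $node"
  oc describe "$node" | grep -A 8 'Allocated resources:'
done

# The pending pod's own requests, for comparison.
oc -n openshift-storage get pod noobaa-endpoint-dcc9c5d9d-tm8wj \
  -o jsonpath='{.spec.containers[*].resources.requests}{"\n"}'
```

If no worker has 1 CPU of headroom left, the endpoint pod stays Pending and the StorageCluster keeps reporting the NoobaaInitializing/Progressing condition shown above.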
Job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5852/console

Must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthu-bz027/vavuthu-bz027_20210908T091752/logs/failed_testcase_ocs_logs_1631093767/test_deployment_ocs_logs/