Bug 1998065 - storagecluster is in progressing state in v4.9.0-115.ci
Summary: storagecluster is in progressing state in v4.9.0-115.ci
Keywords:
Status: CLOSED DUPLICATE of bug 1996033
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Nimrod Becker
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-08-26 11:47 UTC by Vijay Avuthu
Modified: 2023-08-09 16:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-27 15:58:35 UTC
Embargoed:


Description Vijay Avuthu 2021-08-26 11:47:27 UTC
Description of problem:

storagecluster is in progressing state in v4.9.0-115.ci

Version-Release number of selected component (if applicable):

odf-operator.v4.9.0-115.ci
openshift installer (4.9.0-0.nightly-2021-08-26-040328)

How reproducible:
2/2

Steps to Reproduce:
1. Install OCS using ocs-ci (with https://github.com/red-hat-storage/ocs-ci/pull/4647)
2. Check the storagecluster state

Actual results:

$ oc get storagecluster
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   46m   Progressing              2021-08-26T10:16:25Z   4.9.0
$


Expected results:

storagecluster should be in succeeded state


Additional info:

$ oc describe storagecluster ocs-storagecluster
Name:         ocs-storagecluster
Namespace:    openshift-storage
Labels:       <none>
Annotations:  storagesystem.odf.openshift.io/watched-by: storagesystem-odf
              uninstall.ocs.openshift.io/cleanup-policy: delete
              uninstall.ocs.openshift.io/mode: graceful
API Version:  ocs.openshift.io/v1
Kind:         StorageCluster
Metadata:


Status:
  Conditions:
    Last Heartbeat Time:   2021-08-26T11:06:44Z
    Last Transition Time:  2021-08-26T10:16:54Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-08-26T10:16:27Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               CephCluster resource is not reporting status
    Reason:                CephClusterStatus
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-08-26T11:06:44Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-08-26T10:16:25Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-08-26T10:20:40Z
    Last Transition Time:  2021-08-26T10:16:27Z
    Message:               CephCluster is creating: Processing OSD 2 on PVC "ocs-deviceset-0-data-0q44s8"
    Reason:                ClusterStateCreating
    Status:                False
    Type:                  Upgradeable
  Failure Domain:          rack


Job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5510/console

must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthuodfx-pr4647/vavuthuodfx-pr4647_20210826T091946/logs/failed_testcase_ocs_logs_1629970584/test_deployment_ocs_logs/

Comment 4 Jose A. Rivera 2021-08-27 13:25:01 UTC
First step in troubleshooting: check the StorageCluster Conditions in the must-gather output:

  Conditions:
    Last Heartbeat Time:   2021-08-26T11:23:44Z
    Last Transition Time:  2021-08-26T10:16:54Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2021-08-26T10:16:27Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               CephCluster resource is not reporting status
    Reason:                CephClusterStatus
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2021-08-26T11:23:44Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               Waiting on Nooba instance to finish initialization
    Reason:                NoobaaInitializing
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2021-08-26T10:16:25Z
    Last Transition Time:  2021-08-26T10:16:25Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2021-08-26T10:20:40Z
    Last Transition Time:  2021-08-26T10:16:27Z
    Message:               CephCluster is creating: Processing OSD 2 on PVC "ocs-deviceset-0-data-0q44s8"
    Reason:                ClusterStateCreating
    Status:                False
    Type:                  Upgradeable
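
A minimal Python sketch of this check, assuming the Conditions block has been saved as plain text from `oc describe storagecluster` or from the must-gather output. The function names and the 30-minute lag threshold are illustrative assumptions, not part of ocs-operator. A heartbeat that lags far behind the newest one is only a hint that a condition may not have been refreshed recently.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_conditions(text):
    """Parse a describe-style 'Conditions:' block into a list of dicts.

    Assumes each condition ends with its 'Type:' line, as in the
    output quoted in this bug.
    """
    conditions, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # blank lines and the 'Conditions:' header
        current[key] = value
        if key == "Type":
            conditions.append(current)
            current = {}
    return conditions


def possibly_stale(conditions, max_lag=timedelta(minutes=30)):
    """Return the types whose heartbeat lags the newest one by more than max_lag.

    The threshold is an assumed heuristic, not an operator setting.
    """
    beats = {
        c["Type"]: datetime.strptime(c["Last Heartbeat Time"], TIME_FORMAT)
        for c in conditions
    }
    newest = max(beats.values())
    return [ctype for ctype, beat in beats.items() if newest - beat > max_lag]
```

Fed the five conditions above, `possibly_stale` would list Available, Degraded and Upgradeable, since their heartbeats stopped more than an hour before the newest one at 11:23:44Z.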

CephCluster looks fine, so that may be stale state. Looking at noobaa-operator logs:

time="2021-08-26T11:27:06Z" level=info msg="Will connect to RGW at \"https://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:443\"" sys=openshift-storage/noobaa
time="2021-08-26T11:27:06Z" level=info msg="creating bucket nb.1629977226666.apps.vavuthuodfx-pr464.qe.rh-ocs.com" sys=openshift-storage/noobaa
time="2021-08-26T11:27:06Z" level=error msg="got error when trying to create bucket nb.1629977226666.apps.vavuthuodfx-pr464.qe.rh-ocs.com. error: RequestError: send request failed\ncaused by: Put \"https://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:443/nb.1629977226666.apps.vavuthuodfx-pr464.qe.rh-ocs.com\": x509: certificate signed by unknown authority" sys=openshift-storage/noobaa
time="2021-08-26T11:27:06Z" level=info msg="SetPhase: temporary error during phase \"Configuring\"" sys=openshift-storage/noobaa
time="2021-08-26T11:27:06Z" level=warning msg="⏳ Temporary Error: RequestError: send request failed\ncaused by: Put \"https://rook-ceph-rgw-ocs-storagecluster-cephobjectstore.openshift-storage.svc:443/nb.1629977226666.apps.vavuthuodfx-pr464.qe.rh-ocs.com\": x509: certificate signed by unknown authority" sys=openshift-storage/noobaa

Offhand looks like a cert issue with NooBaa. Nimrod PTAL on Sunday.
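
For scanning a longer noobaa-operator log for the same symptom, here is a rough Python sketch. It assumes the `key=value` text format seen in the lines above. The helper names are hypothetical, and the unescaping is deliberately minimal: it only handles `\"` inside quoted values.

```python
import re

# key=value pairs where the value is either a double-quoted string
# (with backslash escapes) or a bare run of non-space characters.
_FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

X509_NEEDLE = "x509: certificate signed by unknown authority"


def parse_log_line(line):
    """Split one log line into a dict of its fields."""
    fields = {}
    for key, raw in _FIELD.findall(line):
        if raw.startswith('"'):
            raw = raw[1:-1].replace('\\"', '"')  # minimal unescaping
        fields[key] = raw
    return fields


def untrusted_cert_errors(lines):
    """Return parsed error/warning entries that mention the x509 failure."""
    hits = []
    for line in lines:
        fields = parse_log_line(line)
        if fields.get("level") in ("error", "warning") and X509_NEEDLE in fields.get("msg", ""):
            hits.append(fields)
    return hits
```

Run over the five lines quoted above, this should return the `level=error` and `level=warning` entries, both pointing at the RGW endpoint.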

Comment 5 umanga 2021-08-27 15:58:35 UTC

*** This bug has been marked as a duplicate of bug 1996033 ***

