Bug 1937604 - [Deployment blocker] ocs-operator.v4.8.0-292.ci is stuck in the Installing phase
Summary: [Deployment blocker] ocs-operator.v4.8.0-292.ci is stuck in the Installing phase
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: OCS 4.8.0
Assignee: Liran Mauda
QA Contact: Vijay Avuthu
URL:
Whiteboard:
Duplicates: 1938548
Depends On:
Blocks:
 
Reported: 2021-03-11 06:23 UTC by Vijay Avuthu
Modified: 2021-08-03 18:15 UTC
CC List: 7 users

Fixed In Version: 4.8.0-300.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-03 18:15:14 UTC
Embargoed:




Links:
Red Hat Product Errata RHBA-2021:3003 (Last Updated: 2021-08-03 18:15:46 UTC)

Description Vijay Avuthu 2021-03-11 06:23:39 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

ocs-operator.v4.8.0-292.ci is stuck in the Installing phase.


Version of all relevant components (if applicable):

openshift installer (4.8.0-0.nightly-2021-03-10-142839)
ocs-registry:4.8.0-292.ci


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Not able to install OCS.


Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
N/A

If this is a regression, please provide more details to justify this:
This is the first build in OCS 4.8 (the previous build had a CRUSH map issue).

Steps to Reproduce:
1. Install OCS using ocs-ci.
2. Verify that the ocs-operator CSV reaches the Succeeded phase (see the sketch below).
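
A minimal sketch of step 2, assuming the openshift-storage namespace used elsewhere in this report (the polling loop itself is illustrative, not part of ocs-ci):

# Poll the CSV phase; a healthy install ends in Succeeded
$ while true; do
>   phase=$(oc get csv ocs-operator.v4.8.0-292.ci -n openshift-storage -o jsonpath='{.status.phase}')
>   echo "phase: ${phase}"
>   [ "${phase}" = "Succeeded" ] && break
>   sleep 30
> done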


Actual results:

The operator is stuck in the Installing phase:
$ oc get csv
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.8.0-292.ci   OpenShift Container Storage   4.8.0-292.ci              Installing
$ 



Expected results:
The operator should reach the Succeeded phase.


Additional info:

Jenkins job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/1191/consoleFull

must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthu4-te/vavuthu4-te_20210311T035835/logs/failed_testcase_ocs_logs_1615435333/test_deployment_ocs_logs/

Comment 3 Vijay Avuthu 2021-03-11 06:39:49 UTC
> Events from the operator CSV

$ oc get csv -n openshift-storage
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.8.0-292.ci   OpenShift Container Storage   4.8.0-292.ci              Installing
$

$ oc describe csv ocs-operator.v4.8.0-292.ci


  Type     Reason               Age                   From                        Message
  ----     ------               ----                  ----                        -------
  Normal   RequirementsUnknown  95m (x2 over 95m)     operator-lifecycle-manager  requirements not yet checked
  Normal   RequirementsNotMet   95m (x2 over 95m)     operator-lifecycle-manager  one or more requirements couldn't be found
  Normal   InstallWaiting       94m (x2 over 94m)     operator-lifecycle-manager  installing: waiting for deployment rook-ceph-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
  Normal   InstallSucceeded     94m                   operator-lifecycle-manager  install strategy completed with no errors
  Warning  ComponentUnhealthy   94m                   operator-lifecycle-manager  installing: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
  Normal   InstallSucceeded     94m (x3 over 95m)     operator-lifecycle-manager  waiting for install components to report healthy
  Normal   InstallWaiting       94m (x2 over 95m)     operator-lifecycle-manager  installing: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
  Normal   NeedsReinstall       89m (x4 over 94m)     operator-lifecycle-manager  installing: waiting for deployment ocs-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
  Normal   AllRequirementsMet   89m (x7 over 95m)     operator-lifecycle-manager  all requirements found, attempting install
  Warning  InstallCheckFailed   2m52s (x32 over 89m)  operator-lifecycle-manager  install timeout
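
The sequence above is a standard OLM install-timeout loop: the install strategy completes, but the ocs-operator deployment never reports an available replica, so OLM keeps retrying (NeedsReinstall) until it gives up with InstallCheckFailed. A generic next step (these commands are illustrative, not taken from the report) is to look at the deployment and its pod directly:

$ oc get deployment ocs-operator -n openshift-storage
$ oc describe deployment ocs-operator -n openshift-storage
# the name=ocs-operator label selector is an assumption about the pod labels
$ oc get pods -n openshift-storage -l name=ocs-operator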

> noobaa-core is in CrashLoopBackOff (CLBO) state


$ oc get pods -n openshift-storage
NAME                                                              READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-provisioner-68dd8867d9-7kxvg                     6/6     Running            0          99m
csi-cephfsplugin-provisioner-68dd8867d9-h5zsb                     6/6     Running            0          99m
csi-cephfsplugin-rh8l4                                            3/3     Running            0          99m
csi-cephfsplugin-rxmwx                                            3/3     Running            0          99m
csi-cephfsplugin-xjlwk                                            3/3     Running            0          99m
csi-rbdplugin-8lhjt                                               3/3     Running            0          99m
csi-rbdplugin-ckpql                                               3/3     Running            0          99m
csi-rbdplugin-provisioner-6cc7489f9d-c858d                        6/6     Running            0          99m
csi-rbdplugin-provisioner-6cc7489f9d-zqfn2                        6/6     Running            0          99m
csi-rbdplugin-zvthz                                               3/3     Running            0          99m
noobaa-core-0                                                     0/1     CrashLoopBackOff   23         96m
noobaa-db-pg-0                                                    1/1     Running            0          96m
noobaa-operator-7c588b4c8d-622f8                                  1/1     Running            0          100m
ocs-metrics-exporter-54d98c964b-jr7c4                             1/1     Running            0          100m
ocs-operator-7678bb976d-v7lps                                     0/1     Running            0          100m
rook-ceph-crashcollector-compute-0-7b9d48779c-vwq5h               1/1     Running            0          96m
rook-ceph-crashcollector-compute-1-7988c5cc86-qbwj6               1/1     Running            0          96m
rook-ceph-crashcollector-compute-2-78f9648d87-d5b4p               1/1     Running            0          96m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-bc8968d9rd9jd   2/2     Running            0          96m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-76bf9669tcjj2   2/2     Running            0          96m
rook-ceph-mgr-a-b6585d58b-ngqw6                                   2/2     Running            0          97m
rook-ceph-mon-a-596987df67-gznl2                                  2/2     Running            0          98m
rook-ceph-mon-b-7978c58f85-ck89v                                  2/2     Running            0          98m
rook-ceph-mon-c-744b7f795-2bbzr                                   2/2     Running            0          97m
rook-ceph-operator-6b4b5c6996-gqgf4                               1/1     Running            0          100m
rook-ceph-osd-0-f88f6c7-7cnvl                                     2/2     Running            0          96m
rook-ceph-osd-1-85d4db94bf-5nfgx                                  2/2     Running            0          96m
rook-ceph-osd-2-5859c47456-7pv4g                                  2/2     Running            0          96m
rook-ceph-osd-prepare-ocs-deviceset-0-data-0f9mqk-t7sg9           0/1     Completed          0          97m
rook-ceph-osd-prepare-ocs-deviceset-1-data-0j9kcj-qktzp           0/1     Completed          0          97m
rook-ceph-osd-prepare-ocs-deviceset-2-data-0xwb29-v5trn           0/1     Completed          0          97m
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-777d76cvtc4p   2/2     Running            0          95m
rook-ceph-tools-64945f5fdd-mrzp7                                  1/1     Running            0          96m
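
For the CrashLoopBackOff pod, the log of the previous (crashed) container attempt is usually the most telling; a generic check, not a command from the original report:

# --previous shows the log of the last crashed container rather than the current restart
$ oc logs noobaa-core-0 -n openshift-storage --previous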
 

> noobaa operator logs show "SecretOp not found"

time="2021-03-11T06:34:55Z" level=error msg="Could not connect to system Connect(): SecretOp not found"
time="2021-03-11T06:35:55Z" level=info msg="Update event detected for ocs-storagecluster-cephcluster (openshift-storage), queuing Reconcile"
time="2021-03-11T06:35:55Z" level=info msg="✅ Exists:  \"ocs-storagecluster-cephcluster\"\n"
time="2021-03-11T06:35:55Z" level=info msg="✅ Exists: NooBaa \"noobaa\"\n"
time="2021-03-11T06:35:55Z" level=info msg="✅ Exists: Service \"noobaa-mgmt\"\n"
time="2021-03-11T06:35:55Z" level=info msg="❌ Not Found: Secret \"noobaa-operator\"\n"
time="2021-03-11T06:35:55Z" level=error msg="Could not connect to system Connect(): SecretOp not found"
time="2021-03-11T06:36:40Z" level=error msg="RPC: closing connection (0xc000342640) &{RPC:0xc00029a140 Address:wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s}"
time="2021-03-11T06:36:40Z" level=warning msg="RPC: RemoveConnection wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ current=0xc000342640 conn=0xc000342640"
time="2021-03-11T06:36:40Z" level=error msg="RPC: Reconnect - got error: failed to websocket dial: failed to send handshake request: Get \"https://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/\": dial tcp 172.30.137.91:443: connect: connection timed out"
time="2021-03-11T06:36:40Z" level=error msg="⚠️  RPC: auth.read_auth() Call failed: RPC: connection (0xc000342640) already closed &{RPC:0xc00029a140 Address:wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ State:closed WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:3s}"
time="2021-03-11T06:36:40Z" level=info msg="SetPhase: temporary error during phase \"Connecting\"" sys=openshift-storage/noobaa
time="2021-03-11T06:36:40Z" level=warning msg="⏳ Temporary Error: RPC: connection (0xc000342640) already closed &{RPC:0xc00029a140 Address:wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ State:closed WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:3s}" sys=openshift-storage/noobaa
time="2021-03-11T06:36:40Z" level=warning msg="RPC: GetConnection creating connection to wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ 0xc001822000"
time="2021-03-11T06:36:40Z" level=info msg="RPC: Reconnect (0xc001822000) delay &{RPC:0xc00029a140 Address:wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:3s}"

Comment 6 Michael Adam 2021-03-12 09:28:51 UTC
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/vavuthu4-te/vavuthu4-te_20210311T035835/logs/failed_testcase_ocs_logs_1615435333/test_deployment_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-4079262a73e61da0dae74bedbb268aa2b548ada66e9f9c6dff40ddc9a9dcaff7/noobaa/logs/openshift-storage/noobaa-core-0.log

has this:
~~~~~~~~~~~~~~~~~~~~~
Error: expected to get at least mongo/postgres/agent and uid
/noobaa_init_files//noobaa_init.sh: line 38: cd: /root/node_modules/noobaa-core/: No such file or directory
internal/modules/cjs/loader.js:883
  throw err;
  ^

Error: Cannot find module '/src/upgrade/upgrade_manager.js'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:880:15)
    at Function.Module._load (internal/modules/cjs/loader.js:725:27)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
    at internal/main/run_main_module.js:17:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}
upgrade_manager failed with exit code 1
noobaa_init failed with exit code 1. aborting
~~~~~~~~~~~~~~~~~~
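
Both missing paths in this traceback live inside the noobaa-core container image, which points at a broken image layout in this build rather than a cluster-side problem. A hedged way to confirm, assuming a debug copy of the pod stays up long enough to run a command:

# list the directories noobaa_init.sh expects; on a good image both should exist
$ oc debug pod/noobaa-core-0 -n openshift-storage -- ls -la /root/node_modules/noobaa-core /src/upgrade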

Comment 9 Liran Mauda 2021-03-14 13:53:43 UTC
*** Bug 1938548 has been marked as a duplicate of this bug. ***

Comment 12 Vijay Avuthu 2021-03-16 18:07:06 UTC
Verified with build: ocs-operator.v4.8.0-300.ci

Deployment job succeeded and all pods are in the Running state.

Job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/1299/console

pods:

NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-fqrmp                                            3/3     Running     0          7h37m
csi-cephfsplugin-gs4kd                                            3/3     Running     0          7h37m
csi-cephfsplugin-nns6d                                            3/3     Running     0          7h37m
csi-cephfsplugin-provisioner-6d65459f9b-hrc6h                     6/6     Running     3          7h37m
csi-cephfsplugin-provisioner-6d65459f9b-vqfzz                     6/6     Running     17         7h37m
csi-rbdplugin-fhzgd                                               3/3     Running     0          7h37m
csi-rbdplugin-n9t89                                               3/3     Running     0          7h37m
csi-rbdplugin-provisioner-f468f84bc-bqhjd                         6/6     Running     4          7h37m
csi-rbdplugin-provisioner-f468f84bc-s6tw9                         6/6     Running     20         7h37m
csi-rbdplugin-qxw6n                                               3/3     Running     0          7h37m
noobaa-core-0                                                     1/1     Running     0          7h33m
noobaa-db-pg-0                                                    1/1     Running     0          7h33m
noobaa-endpoint-675655b5f7-5kntn                                  1/1     Running     0          7h32m
noobaa-operator-757ff6f99d-kc6qj                                  1/1     Running     7          7h37m
ocs-metrics-exporter-6959754dc-zt2nf                              1/1     Running     0          7h37m
ocs-operator-56bc6cc8d8-c88ng                                     1/1     Running     4          7h37m
rook-ceph-crashcollector-compute-0-b55548-7jrcg                   1/1     Running     0          7h34m
rook-ceph-crashcollector-compute-1-75dc76664c-c6zzw               1/1     Running     0          7h33m
rook-ceph-crashcollector-compute-2-58f945c758-xtj4r               1/1     Running     0          7h33m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-f6bf74ff62h4l   2/2     Running     0          7h33m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-66d6bc97b8g8p   2/2     Running     0          7h33m
rook-ceph-mgr-a-6f4c889f6-kcdmn                                   2/2     Running     0          7h35m
rook-ceph-mon-a-6f94fc4d77-9lknc                                  2/2     Running     0          7h36m
rook-ceph-mon-b-78fc7cdb45-r6d2f                                  2/2     Running     0          7h35m
rook-ceph-mon-c-66c9bc8cc7-prrdz                                  2/2     Running     0          7h35m
rook-ceph-operator-fdff96468-9m8kl                                1/1     Running     0          7h37m
rook-ceph-osd-0-64c86b855b-db5jp                                  2/2     Running     0          7h34m
rook-ceph-osd-1-7568844884-cvcsf                                  2/2     Running     0          7h33m
rook-ceph-osd-2-58dc957b9f-r9ltk                                  2/2     Running     0          7h33m
rook-ceph-osd-prepare-ocs-deviceset-0-data-0jqmcv-lllzt           0/1     Completed   0          7h34m
rook-ceph-osd-prepare-ocs-deviceset-1-data-08j7f5-zjgrp           0/1     Completed   0          7h34m
rook-ceph-osd-prepare-ocs-deviceset-2-data-0kccsn-tk7bv           0/1     Completed   0          7h34m
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7ff5bc9ncw9d   2/2     Running     0          7h33m
rook-ceph-tools-6b8d89bc5b-n7lgw                                  1/1     Running     0          7h33m

Comment 17 errata-xmlrpc 2021-08-03 18:15:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3003

