Bug 1994261

Summary: odf-operator.v4.9.0-91.ci fails to install with odf-console
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Vijay Avuthu <vavuthu>
Component: odf-operator Assignee: Bipul Adhikari <badhikar>
Status: CLOSED ERRATA QA Contact: Raz Tamir <ratamir>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 4.9 CC: badhikar, branto, jijoy, jrivera, kramdoss, madam, muagarwa, nigoyal, ocs-bugs, odf-bz-bot, sostapov
Target Milestone: --- Keywords: Automation, TestBlocker
Target Release: ODF 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: v4.9.0-101.ci Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-12-13 17:44:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vijay Avuthu 2021-08-17 07:27:48 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

odf-operator.v4.9.0-91.ci fails to install with odf-console

Version of all relevant components (if applicable):
odf-operator.v4.9.0-91.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Not able to install OCS using odf-operator

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
1/1

Can this issue be reproduced from the UI?
Not tried

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCS using odf-operator
2. Check whether odf-operator is installed


Actual results:

$ oc get csv
NAME                        DISPLAY                       VERSION       REPLACES   PHASE
ocs-operator.v4.9.0-91.ci   OpenShift Container Storage   4.9.0-91.ci              Succeeded
odf-operator.v4.9.0-91.ci   OpenShift Data Foundation     4.9.0-91.ci              Installing



Expected results:

odf-operator should be in Succeeded phase
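
For automation, the expected-result check can be done on saved `oc get csv` output. This is a minimal sketch, not part of the original report: the function name `csvs_not_succeeded` is hypothetical, and it assumes the default table layout shown above, where the first column is the CSV name and the last column is the phase.

```python
def csvs_not_succeeded(oc_get_csv_output: str) -> dict:
    """Map CSV name -> phase for every CSV whose phase is not 'Succeeded'.

    Assumption: input is the default `oc get csv` table, with NAME as the
    first whitespace-separated token and PHASE as the last one. A row with
    an empty PHASE column would be misread, so treat this as a sketch.
    """
    lines = [ln for ln in oc_get_csv_output.splitlines() if ln.strip()]
    pending = {}
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        name, phase = fields[0], fields[-1]
        if phase != "Succeeded":
            pending[name] = phase
    return pending


# Output copied from "Actual results" above.
csv_sample = """\
NAME                        DISPLAY                       VERSION       REPLACES   PHASE
ocs-operator.v4.9.0-91.ci   OpenShift Container Storage   4.9.0-91.ci              Succeeded
odf-operator.v4.9.0-91.ci   OpenShift Data Foundation     4.9.0-91.ci              Installing
"""
print(csvs_not_succeeded(csv_sample))
```

An empty result would mean every listed CSV reached Succeeded.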


Additional info:


> csv status

$ oc get csv
NAME                        DISPLAY                       VERSION       REPLACES   PHASE
ocs-operator.v4.9.0-91.ci   OpenShift Container Storage   4.9.0-91.ci              Succeeded
odf-operator.v4.9.0-91.ci   OpenShift Data Foundation     4.9.0-91.ci              Installing


> $ oc describe csv odf-operator.v4.9.0-91.ci
Name:         odf-operator.v4.9.0-91.ci
Namespace:    openshift-storage
Labels:       olm.api.62e2d1ee37777c10=provided
              operators.coreos.com/odf-operator.openshift-storage=


    Last Transition Time:  2021-08-17T06:58:09Z
    Last Update Time:      2021-08-17T06:58:09Z
    Message:               installing: waiting for deployment odf-operator-controller-manager to become ready: deployment "odf-operator-controller-manager" not available: Deployment does not have minimum availability.
    Phase:                 Pending
    Reason:                NeedsReinstall
    Last Transition Time:  2021-08-17T06:58:09Z
    Last Update Time:      2021-08-17T06:58:09Z
    Message:               all requirements found, attempting install
    Phase:                 InstallReady
    Reason:                AllRequirementsMet
    Last Transition Time:  2021-08-17T06:58:09Z
    Last Update Time:      2021-08-17T06:58:09Z
    Message:               waiting for install components to report healthy
    Phase:                 Installing
    Reason:                InstallSucceeded
    Last Transition Time:  2021-08-17T06:58:09Z
    Last Update Time:      2021-08-17T06:58:09Z
    Message:               installing: waiting for deployment odf-operator-controller-manager to become ready: deployment "odf-operator-controller-manager" not available: Deployment does not have minimum availability.
    Phase:                 Installing
    Reason:                InstallWaiting
    Last Transition Time:  2021-08-17T06:59:49Z
    Last Update Time:      2021-08-17T06:59:49Z
    Message:               install failed: deployment odf-console not ready before timeout: deployment "odf-console" exceeded its progress deadline
    Phase:                 Failed
    Reason:                InstallCheckFailed


Events:
  Type     Reason               Age                   From                        Message
  ----     ------               ----                  ----                        -------
  Normal   RequirementsUnknown  20m                   operator-lifecycle-manager  requirements not yet checked
  Normal   InstallSucceeded     15m (x3 over 20m)     operator-lifecycle-manager  waiting for install components to report healthy
  Normal   InstallWaiting       12m (x6 over 20m)     operator-lifecycle-manager  installing: waiting for deployment odf-operator-controller-manager to become ready: deployment "odf-operator-controller-manager" not available: Deployment does not have minimum availability.
  Normal   InstallWaiting       11m (x8 over 20m)     operator-lifecycle-manager  installing: waiting for deployment odf-console to become ready: deployment "odf-console" not available: Deployment does not have minimum availability.
  Normal   AllRequirementsMet   10m (x4 over 20m)     operator-lifecycle-manager  all requirements found, attempting install
  Warning  InstallCheckFailed   10m (x2 over 15m)     operator-lifecycle-manager  install timeout
  Normal   NeedsReinstall       10m (x3 over 15m)     operator-lifecycle-manager  installing: waiting for deployment odf-console to become ready: deployment "odf-console" not available: Deployment does not have minimum availability.
  Normal   NeedsReinstall       5m2s (x3 over 8m42s)  operator-lifecycle-manager  installing: waiting for deployment odf-operator-controller-manager to become ready: deployment "odf-operator-controller-manager" not available: Deployment does not have minimum availability.

> pod status

$ oc get pods
NAME                                               READY   STATUS              RESTARTS      AGE
noobaa-operator-5c66fffc54-xp5wc                   1/1     Running             0             26m
ocs-metrics-exporter-5d9c9cdc6d-4chqg              1/1     Running             0             26m
ocs-operator-66f84d6945-v7tzf                      1/1     Running             0             26m
odf-console-7f545f5485-sbftc                       0/2     ContainerCreating   0             26m
odf-operator-controller-manager-564bf8774f-p862z   1/2     CrashLoopBackOff    6 (81s ago)   26m
rook-ceph-operator-544c679545-hx4jq                1/1     Running             0             26m
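
The two unhealthy pods can also be picked out mechanically from saved `oc get pods` output. The sketch below is illustrative only: `pods_not_ready` is a hypothetical helper, and it assumes the default column order (NAME, READY, STATUS, ...). It would also flag pods in Completed state, whose READY count is normally 0/N.

```python
def pods_not_ready(oc_get_pods_output: str) -> list:
    """Return (name, ready, status) for pods whose READY count is below total.

    Assumption: default `oc get pods` table, so the first three
    whitespace-separated tokens are NAME, READY (as "x/y") and STATUS.
    """
    rows = [ln.split() for ln in oc_get_pods_output.splitlines() if ln.strip()]
    not_ready = []
    for fields in rows[1:]:  # skip the header row
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, total = ready.split("/")
        if int(ready_count) != int(total):
            not_ready.append((name, ready, status))
    return not_ready


# Rows copied from the pod status listing above.
pods_sample = """\
NAME                                               READY   STATUS              RESTARTS      AGE
ocs-operator-66f84d6945-v7tzf                      1/1     Running             0             26m
odf-console-7f545f5485-sbftc                       0/2     ContainerCreating   0             26m
odf-operator-controller-manager-564bf8774f-p862z   1/2     CrashLoopBackOff    6 (81s ago)   26m
"""
for pod in pods_not_ready(pods_sample):
    print(pod)
```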


$ oc describe pod odf-console-7f545f5485-sbftc
Name:           odf-console-7f545f5485-sbftc
Namespace:      openshift-storage
Priority:       0
Node:           compute-1/10.1.160.55
Start Time:     Tue, 17 Aug 2021 12:12:29 +0530
Labels:         app=odf-console
                pod-template-hash=7f545f5485


Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    25m                  default-scheduler  Successfully assigned openshift-storage/odf-console-7f545f5485-sbftc to compute-1
  Warning  FailedMount  17m (x12 over 25m)   kubelet            MountVolume.SetUp failed for volume "odf-console-serving-cert" : secret "odf-console-serving-cert" not found
  Warning  FailedMount  15m (x13 over 25m)   kubelet            MountVolume.SetUp failed for volume "ibm-console-serving-cert" : secret "ibm-console-serving-cert" not found
  Warning  FailedMount  9m52s (x2 over 23m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[ibm-console-serving-cert odf-console-serving-cert], unattached volumes=[ibm-console-serving-cert odf-console-serving-cert kube-api-access-qmwfb]: timed out waiting for the condition
  Warning  FailedMount  5m20s (x5 over 21m)  kubelet            Unable to attach or mount volumes: unmounted volumes=[odf-console-serving-cert ibm-console-serving-cert], unattached volumes=[odf-console-serving-cert kube-api-access-qmwfb ibm-console-serving-cert]: timed out waiting for the condition
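
The FailedMount events above name the secrets the kubelet could not find. A small sketch for pulling those names out of saved `oc describe pod` output follows. `missing_secrets` is a hypothetical helper, and the regular expression only matches the message wording seen in these events.

```python
import re


def missing_secrets(describe_output: str) -> list:
    """Return the sorted, de-duplicated secret names reported as not found.

    Assumption: the kubelet message has the form
    `secret "<name>" not found`, as in the events quoted in this report.
    """
    names = re.findall(r'secret "([^"]+)" not found', describe_output)
    return sorted(set(names))


# Two event lines copied from the odf-console pod description above.
events_sample = '''\
  Warning  FailedMount  17m (x12 over 25m)   kubelet            MountVolume.SetUp failed for volume "odf-console-serving-cert" : secret "odf-console-serving-cert" not found
  Warning  FailedMount  15m (x13 over 25m)   kubelet            MountVolume.SetUp failed for volume "ibm-console-serving-cert" : secret "ibm-console-serving-cert" not found
'''
print(missing_secrets(events_sample))
```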


> odf-operator-controller-manager status

$ oc describe pod odf-operator-controller-manager-564bf8774f-p862z
Name:         odf-operator-controller-manager-564bf8774f-p862z
Namespace:    openshift-storage
Priority:     0
Node:         compute-2/10.1.160.36
Start Time:   Tue, 17 Aug 2021 12:12:29 +0530
Labels:       control-plane=controller-manager
              pod-template-hash=564bf8774f


Readiness:  http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      OCS_CSV_NAME:                              <set to the key 'OCS_CSV_NAME' of config map 'odf-operator-manager-config'>                              Optional: false
      IBM_SUBSCRIPTION_NAME:                     <set to the key 'IBM_SUBSCRIPTION_NAME' of config map 'odf-operator-manager-config'>                     Optional: false
      IBM_SUBSCRIPTION_PACKAGE:                  <set to the key 'IBM_SUBSCRIPTION_PACKAGE' of config map 'odf-operator-manager-config'>                  Optional: false
      IBM_SUBSCRIPTION_CHANNEL:                  <set to the key 'IBM_SUBSCRIPTION_CHANNEL' of config map 'odf-operator-manager-config'>                  Optional: false
      IBM_SUBSCRIPTION_STARTINGCSV:              <set to the key 'IBM_SUBSCRIPTION_STARTINGCSV' of config map 'odf-operator-manager-config'>              Optional: false
      IBM_SUBSCRIPTION_CATALOGSOURCE:            <set to the key 'IBM_SUBSCRIPTION_CATALOGSOURCE' of config map 'odf-operator-manager-config'>            Optional: false
      IBM_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE:  <set to the key 'IBM_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE' of config map 'odf-operator-manager-config'>  Optional: false
      OPERATOR_CONDITION_NAME:                   odf-operator.v4.9.0-91.ci
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-l2vd7 (ro)


Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       27m                default-scheduler  Successfully assigned openshift-storage/odf-operator-controller-manager-564bf8774f-p862z to compute-2
  Normal   AddedInterface  27m                multus             Add eth0 [10.128.2.14/23] from openshift-sdn
  Normal   Pulling         27m                kubelet            Pulling image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0"
  Normal   Pulling         27m                kubelet            Pulling image "quay.io/rhceph-dev/odf-operator@sha256:a18f308ce4ac09fbd5ea6cf193bba7876dd85a9bd3f6ac721584eac6ad8540fa"
  Normal   Pulled          27m                kubelet            Successfully pulled image "registry.redhat.io/openshift4/ose-kube-rbac-proxy:v4.7.0" in 11.591876072s
  Normal   Created         27m                kubelet            Created container kube-rbac-proxy
  Normal   Started         27m                kubelet            Started container kube-rbac-proxy
  Normal   Pulled          27m                kubelet            Successfully pulled image "quay.io/rhceph-dev/odf-operator@sha256:a18f308ce4ac09fbd5ea6cf193bba7876dd85a9bd3f6ac721584eac6ad8540fa" in 2.376329278s
  Warning  Unhealthy       24m (x2 over 25m)  kubelet            Liveness probe failed: Get "http://10.128.2.14:8081/healthz": dial tcp 10.128.2.14:8081: connect: connection refused
  Warning  ProbeError      24m (x2 over 25m)  kubelet            Liveness probe error: Get "http://10.128.2.14:8081/healthz": dial tcp 10.128.2.14:8081: connect: connection refused
body:
  Normal   Created     24m (x2 over 27m)     kubelet  Created container manager
  Normal   Pulled      24m                   kubelet  Container image "quay.io/rhceph-dev/odf-operator@sha256:a18f308ce4ac09fbd5ea6cf193bba7876dd85a9bd3f6ac721584eac6ad8540fa" already present on machine
  Normal   Started     24m (x2 over 27m)     kubelet  Started container manager
  Warning  Unhealthy   22m (x5 over 25m)     kubelet  Readiness probe failed: Get "http://10.128.2.14:8081/readyz": dial tcp 10.128.2.14:8081: connect: connection refused
  Warning  BackOff     7m10s (x20 over 21m)  kubelet  Back-off restarting failed container
  Warning  ProbeError  2m13s (x26 over 25m)  kubelet  Readiness probe error: Get "http://10.128.2.14:8081/readyz": dial tcp 10.128.2.14:8081: connect: connection refused
body:

> $ oc logs olm-operator-657ccf864b-9gz6t -n openshift-operator-lifecycle-manager

{"level":"error","ts":1629184370.1619947,"logger":"controllers.operator","msg":"Could not update Operator status","request":"/odf-operator.openshift-storage","error":"Operation cannot be fulfilled on operators.operators.coreos.com \"odf-operator.openshift-storage\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214"}
time="2021-08-17T07:12:50Z" level=info msg="install strategy successful" csv=odf-operator.v4.9.0-91.ci id=Lxty7 namespace=openshift-storage phase=Installing strategy=deployment
I0817 07:12:50.175951       1 event.go:282] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-storage", Name:"odf-operator.v4.9.0-91.ci", UID:"0ad2cc8d-b0d7-49ac-a94b-f9035e4d8cd4", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"41238", FieldPath:""}): type: 'Normal' reason: 'InstallWaiting' installing: waiting for deployment odf-operator-controller-manager to become ready: deployment "odf-operator-controller-manager" not available: Deployment does not have minimum availability.
time="2021-08-17T07:12:50Z" level=info msg="error updating ClusterServiceVersion status: Operation cannot be fulfilled on clusterserviceversions.operators.coreos.com \"odf-operator.v4.9.0-91.ci\": the object has been modified; please apply your changes to the latest version and try again" csv=odf-operator.v4.9.0-91.ci id=2PvQi namespace=openshift-storage phase=Installing
E0817 07:12:50.189929       1 queueinformer_operator.go:290] sync {"update" "openshift-storage/odf-operator.v4.9.0-91.ci"} failed: error updating ClusterServiceVersion status: Operation cannot be fulfilled on clusterserviceversions.operators.coreos.com "odf-operator.v4.9.0-91.ci": the object has been modified; please apply your changes to the latest version and try again
{"level":"error","ts":1629184370.2188544,"logger":"controllers.operator","msg":"Could not update Operator status","request":"/odf-operator.openshift-storage","error":"Operation cannot be fulfilled on operators.operators.coreos.com \"odf-operator.openshift-storage\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214"}


time="2021-08-17T07:13:09Z" level=info msg="install strategy successful" csv=odf-operator.v4.9.0-91.ci id=lJfYb namespace=openshift-storage phase=Installing strategy=deployment
I0817 07:13:09.435836       1 event.go:282] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-storage", Name:"odf-operator.v4.9.0-91.ci", UID:"0ad2cc8d-b0d7-49ac-a94b-f9035e4d8cd4", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"41241", FieldPath:""}): type: 'Warning' reason: 'InstallCheckFailed' install failed: deployment odf-console not ready before timeout: deployment "odf-console" exceeded its progress deadline
time="2021-08-17T07:13:09Z" level=warning msg="needs reinstall: deployment odf-console not ready before timeout: deployment \"odf-console\" exceeded its progress deadline" csv=odf-operator.v4.9.0-91.ci id=trPZV namespace=openshift-storage phase=Failed strategy=deployment
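
The `time=... level=... msg=...` lines in this log are key=value pairs, so the phase and message can be extracted without reading the whole line by eye. A rough sketch, assuming shell-style quoting of values: `parse_logfmt` is a hypothetical helper built on the standard-library `shlex` module, and it does not apply to the JSON or klog-style (`I0817 ...`, `E0817 ...`) lines in the same log.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a key=value log line into a dict.

    Assumption: values are either bare words or double-quoted strings with
    backslash-escaped quotes, which shlex handles in its default POSIX mode.
    Tokens without '=' are ignored.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Last olm-operator log line quoted above.
log_line = r'time="2021-08-17T07:13:09Z" level=warning msg="needs reinstall: deployment odf-console not ready before timeout: deployment \"odf-console\" exceeded its progress deadline" csv=odf-operator.v4.9.0-91.ci id=trPZV namespace=openshift-storage phase=Failed strategy=deployment'

entry = parse_logfmt(log_line)
print(entry["level"], entry["phase"], entry["msg"])
```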


Job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5278/console

Comment 7 Vijay Avuthu 2021-08-19 15:40:45 UTC
Update:
============

> odf-operator is installed successfully on build odf-operator.v4.9.0-101.ci
$ oc get csv
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.9.0-101.ci   OpenShift Container Storage   4.9.0-101.ci              Installing
odf-operator.v4.9.0-101.ci   OpenShift Data Foundation     4.9.0-101.ci              Succeeded

> odf-console pod is running

$ oc get pods
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-mm6qx                                            3/3     Running     0          3h24m
csi-cephfsplugin-mt69j                                            3/3     Running     0          3h24m
csi-cephfsplugin-mzznr                                            3/3     Running     0          3h24m
csi-cephfsplugin-provisioner-54fbb98c8f-454b5                     6/6     Running     0          3h23m
csi-cephfsplugin-provisioner-54fbb98c8f-zscpz                     6/6     Running     0          3h23m
csi-rbdplugin-gv8dh                                               3/3     Running     0          3h24m
csi-rbdplugin-p86bm                                               3/3     Running     0          3h24m
csi-rbdplugin-provisioner-84ccc64b48-n8xct                        6/6     Running     0          3h24m
csi-rbdplugin-provisioner-84ccc64b48-wjbmx                        6/6     Running     0          3h24m
csi-rbdplugin-w4k2k                                               3/3     Running     0          3h24m
noobaa-core-0                                                     1/1     Running     0          3h19m
noobaa-db-pg-0                                                    1/1     Running     0          3h19m
noobaa-endpoint-7f74df9dbc-smhq4                                  1/1     Running     0          3h17m
noobaa-operator-6498bbd74d-glq7c                                  1/1     Running     0          3h25m
ocs-metrics-exporter-6779d6fb6d-cg9n4                             1/1     Running     0          3h25m
ocs-operator-9677cf7cc-f8wvp                                      0/1     Running     0          3h25m
odf-console-58bc96b6b7-qvc2s                                      2/2     Running     0          3h25m
odf-operator-controller-manager-7f5496b9c9-lm9p4                  2/2     Running     0          3h25m


> jenkins job: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/5355/

Marking as Verified, since odf-operator installed successfully.

Comment 13 errata-xmlrpc 2021-12-13 17:44:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086