Bug 2170859
| Summary: | [ODFMS] osd-deployer.v2.0.10 pods stuck in installing state intermittently during rosa upgrade from 4.10.47 to rosa 4.11.25 on consumer cluster | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | suchita <sgatfane> |
| Component: | odf-managed-service | Assignee: | Ohad <omitrani> |
| Status: | CLOSED NOTABUG | QA Contact: | Neha Berry <nberry> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.10 | CC: | ocs-bugs, odf-bz-bot, resoni |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-05-04 05:54:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Verified in v2.0.12 qualification:
$ oc get pods
NAME READY STATUS RESTARTS AGE
addon-ocs-consumer-qe-catalog-rhxb2 1/1 Running 0 69m
alertmanager-managed-ocs-alertmanager-0 2/2 Running 0 45m
csi-addons-controller-manager-759b488df-5n6fq 2/2 Running 0 48m
csi-cephfsplugin-7m59s 2/2 Running 4 3h18m
csi-cephfsplugin-fg2jg 2/2 Running 2 3h18m
csi-cephfsplugin-nvqfx 2/2 Running 2 3h18m
csi-cephfsplugin-provisioner-5d6b768994-8962l 5/5 Running 0 59m
csi-cephfsplugin-provisioner-5d6b768994-jbn8l 5/5 Running 0 45m
csi-rbdplugin-96h84 3/3 Running 3 3h18m
csi-rbdplugin-provisioner-65477c4f5-54vdg 6/6 Running 0 55m
csi-rbdplugin-provisioner-65477c4f5-zg9r8 6/6 Running 0 45m
csi-rbdplugin-qdjl7 3/3 Running 6 3h18m
csi-rbdplugin-tfnfb 3/3 Running 3 3h18m
ocs-metrics-exporter-5dd96c885b-l8z8t 1/1 Running 0 48m
ocs-operator-6888799d6b-2jj65 1/1 Running 0 45m
ocs-osd-aws-data-gather-5bd59fb6c8-ph82z 1/1 Running 0 48m
ocs-osd-controller-manager-5d9694754c-swzx6 3/3 Running 1 (54s ago) 45m
odf-console-57b8476cd4-6pcqz 1/1 Running 0 59m
odf-operator-controller-manager-6f44676f4f-bqpzw 2/2 Running 0 69m
prometheus-managed-ocs-prometheus-0 3/3 Running 0 59m
prometheus-operator-8547cc9f89-c6dm9 1/1 Running 0 48m
redhat-operators-kkbcz 1/1 Running 0 45m
rook-ceph-operator-548b87d44b-98ph5 1/1 Running 0 55m
rook-ceph-tools-7c8c77bd96-gtwp9 1/1 Running 0 55m
$ oc get managedocs
NAME AGE
managedocs 3h20m
$ oc get managedocs -o yaml
apiVersion: v1
items:
- apiVersion: ocs.openshift.io/v1alpha1
  kind: ManagedOCS
  metadata:
    creationTimestamp: "2023-04-10T15:28:39Z"
    finalizers:
    - managedocs.ocs.openshift.io
    generation: 1
    name: managedocs
    namespace: openshift-storage
    resourceVersion: "422419"
    uid: 324ec872-b91f-4770-99c9-aa16987e2e30
  spec: {}
  status:
    components:
      alertmanager:
        state: Ready
      prometheus:
        state: Ready
      storageCluster:
        state: Ready
    reconcileStrategy: strict
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
$ oc get storagecluster
NAME AGE PHASE EXTERNAL CREATED AT VERSION
ocs-storagecluster 3h20m Ready true 2023-04-10T15:28:41Z
$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
mcg-operator.v4.10.11 NooBaa Operator 4.10.11 mcg-operator.v4.10.10 Succeeded
observability-operator.v0.0.20 Observability Operator 0.0.20 observability-operator.v0.0.19 Succeeded
ocs-operator.v4.10.9 OpenShift Container Storage 4.10.9 ocs-operator.v4.10.8 Succeeded
ocs-osd-deployer.v2.0.12 OCS OSD Deployer 2.0.12 ocs-osd-deployer.v2.0.11 Succeeded
odf-csi-addons-operator.v4.10.9 CSI Addons 4.10.9 odf-csi-addons-operator.v4.10.8 Succeeded
odf-operator.v4.10.9 OpenShift Data Foundation 4.10.9 odf-operator.v4.10.8 Succeeded
ose-prometheus-operator.4.10.0 Prometheus Operator 4.10.0 ose-prometheus-operator.4.8.0 Succeeded
route-monitor-operator.v0.1.493-a866e7c Route Monitor Operator 0.1.493-a866e7c route-monitor-operator.v0.1.489-7d9fe90 Succeeded
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.25 True False 58m Cluster version is 4.11.25
Fixed in version 2.0.12
Description of problem:
During the ROSA upgrade of the consumer cluster from 4.10.47 to 4.11.25, ocs-osd-deployer.v2.0.10 gets stuck in the Installing state. The issue is intermittent and is not observed on all upgraded clusters.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy a consumer cluster with ROSA 4.10.47 and the ocs-consumer addon
2. Upgrade ROSA from 4.10.47 to 4.11.25
3.

Actual results:
The ocs-osd-deployer CSV is intermittently stuck in the Installing state.

Expected results:
All CSVs should reach the Succeeded state.

Additional info:
$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
mcg-operator.v4.10.9 NooBaa Operator 4.10.9 mcg-operator.v4.10.8 Succeeded
observability-operator.v0.0.20 Observability Operator 0.0.20 observability-operator.v0.0.19 Succeeded
ocs-operator.v4.10.5 OpenShift Container Storage 4.10.5 ocs-operator.v4.10.4 Succeeded
ocs-osd-deployer.v2.0.10 OCS OSD Deployer 2.0.10 ocs-osd-deployer.v2.0.9 Installing
odf-csi-addons-operator.v4.10.5 CSI Addons 4.10.5 odf-csi-addons-operator.v4.10.4 Succeeded
odf-operator.v4.10.5 OpenShift Data Foundation 4.10.5 odf-operator.v4.10.4 Succeeded
ose-prometheus-operator.4.10.0 Prometheus Operator 4.10.0 ose-prometheus-operator.4.8.0 Succeeded
route-monitor-operator.v0.1.461-dbddf1f Route Monitor Operator 0.1.461-dbddf1f route-monitor-operator.v0.1.456-02ea942 Succeeded

The managedocs YAML shows all three components as Ready:
status:
  components:
    alertmanager:
      state: Ready
    prometheus:
      state: Ready
    storageCluster:
      state: Ready

The status in the deployed CSV is:
`installing: waiting for deployment ocs-osd-controller-manager to become ready: deployment "ocs-osd-controller-manager" not available: Deployment does not have minimum availability.`

CSV describe error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning InstallCheckFailed 84s (x80 over 3h48m) operator-lifecycle-manager install timeout

Workaround: respin the ocs-osd-controller-manager
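
To spot this condition quickly after an upgrade, the `oc get csv` table can be filtered for any CSV whose phase is not Succeeded. The following is a minimal sketch, not part of the original report: `stuck_csvs` is a hypothetical helper name, and it assumes the PHASE value is the last whitespace-separated column, as in the output above. The commented restart command is one possible way to "respin" the deployment; the report does not say which exact command was used.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): reads an `oc get csv` table
# on stdin and prints "<csv-name> <phase>" for every CSV that is not Succeeded.
# Assumes the first line is the header and PHASE is the last column.
stuck_csvs() {
  awk 'NR > 1 && NF > 0 && $NF != "Succeeded" { print $1, $NF }'
}

# Usage against a live cluster (namespace taken from the managedocs output):
#   oc get csv -n openshift-storage | stuck_csvs
#
# One possible way to respin the controller manager (an assumption; deleting
# the pod so that the deployment recreates it would be another):
#   oc rollout restart deployment/ocs-osd-controller-manager -n openshift-storage
```

An empty result means every listed CSV has reached Succeeded; any line printed names a CSV that still needs attention.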