Bug 2089296
| Summary: | [MS v2] Storage cluster in error phase and 'ocs-provider-qe' addon installation failed with ODF 4.10.2 | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Jilju Joy <jijoy> |
| Component: | ocs-operator | Assignee: | Kaustav Majumder <kmajumde> |
| Status: | CLOSED ERRATA | QA Contact: | Jilju Joy <jijoy> |
| Severity: | high | Priority: | unspecified |
| Version: | 4.10 | Target Release: | ODF 4.11.0 |
| Target Milestone: | --- | Keywords: | Automation, Regression |
| Hardware: | Unspecified | OS: | Unspecified |
| Doc Type: | No Doc Update | Fixed In Version: | |
| CC: | ebenahar, kmajumde, madam, muagarwa, nberry, ocs-bugs, odf-bz-bot, omitrani, owasserm, sostapov | | |
| Cloned As: | 2096302 (view as bug list) | Bug Blocks: | 2096302 |
| Last Closed: | 2022-08-24 13:53:39 UTC | Type: | Bug |
Description (Jilju Joy, 2022-05-23 11:26:43 UTC)
Adding the Regression keyword because the installation was working with the previous version, Deployer 2.0.1 with ODF 4.10.0 GA.

must-gather logs: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/jijoy-m23-pr/jijoy-m23-pr_20220523T080402/logs/testcases_1653306906/

Since the PR attached is already merged for 4.11, should the status on the BZ be ON_QA?

Looks like this issue was fixed in the deployer and nothing was required in the product. According to the chat (https://bugzilla.redhat.com/show_bug.cgi?id=2089296#c3), Jilju mentions that the issue is not even reproducible in 4.10.3, which means we don't require a bug in 4.10, and the BZ targeted for 4.10 can be closed.

The attached PR is not relevant for this fix and should be removed. The attached PR is for the perf BZ #2068398. For 4.10 we had a different PR/bug, BZ #2078715. IMO, we should do this:

1. Remove the BZ link from the PR.
2. Move the current BZ to managed service and mark it ON_QA.
3. Close the 4.10 BZ #2096302.

Ohad - FYI - let me know if this makes sense.

It does, with a very small correction. The bug was not fixed in the deployer; it was fixed in the product as part of the fix for the perf bug. Because the perf bug had a completely different fix for 4.10 and 4.11, the entire thing got confusing.

OK, so there is no need to move this bug to MS; it can be verified along with the perf bug, and a 4.10 clone is not needed.

Verified in version:
ODF 4.11.0-104
OCP 4.10.18

```
$ oc -n openshift-storage get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.11.0                      NooBaa Operator               4.11.0            mcg-operator.v4.10.4                      Succeeded
ocs-operator.v4.11.0                      OpenShift Container Storage   4.11.0            ocs-operator.v4.10.4                      Succeeded
ocs-osd-deployer.v2.0.2                   OCS OSD Deployer              2.0.2             ocs-osd-deployer.v2.0.1                   Succeeded
odf-csi-addons-operator.v4.11.0           CSI Addons                    4.11.0            odf-csi-addons-operator.v4.10.4           Succeeded
odf-operator.v4.11.0                      OpenShift Data Foundation     4.11.0            odf-operator.v4.10.2                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.422-151be96   Route Monitor Operator        0.1.422-151be96   route-monitor-operator.v0.1.420-b65f47e   Succeeded

$ rosa list addon -c fbalak-prov27 | grep ocs-provider-qe
ocs-provider-qe   Red Hat OpenShift Data Foundation Managed Service Provider (QE)   ready

$ ocm list clusters | grep fbalak-prov27
1t3h55itvjj6p8cm5hvmg9v7mjo1lceg   fbalak-prov27   https://api.fbalak-prov27.be5a.s1.devshift.org:6443   4.10.18   rosa   aws   us-east-1   ready

$ oc get deployment ocs-osd-controller-manager
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
ocs-osd-controller-manager   1/1     1            1           26h

$ oc get pods -o wide | grep ocs-osd-controller-manager
ocs-osd-controller-manager-6cbb8889fc-k9bm6   3/3   Running   1 (21h ago)   21h   10.129.2.36   ip-10-0-171-213.ec2.internal   <none>   <none>

$ oc get managedocs managedocs -o yaml
apiVersion: ocs.openshift.io/v1alpha1
kind: ManagedOCS
metadata:
  creationTimestamp: "2022-06-27T08:18:43Z"
  finalizers:
  - managedocs.ocs.openshift.io
  generation: 1
  name: managedocs
  namespace: openshift-storage
  resourceVersion: "340704"
  uid: 34529f17-0e61-43a9-bceb-fbae15fdbf93
spec: {}
status:
  components:
    alertmanager:
      state: Ready
    prometheus:
      state: Ready
    storageCluster:
      state: Ready
  reconcileStrategy: strict
```
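The component states above live under `.status.components` of the ManagedOCS CR, so the same check can be done non-interactively instead of reading the full YAML. A minimal sketch, with the field paths taken from the CR output above and the resource/namespace names from this run:

```bash
# Print each ManagedOCS component state; the .status.components.*.state
# paths follow the CR output shown above.
oc -n openshift-storage get managedocs managedocs \
  -o jsonpath='alertmanager={.status.components.alertmanager.state}{"\n"}prometheus={.status.components.prometheus.state}{"\n"}storageCluster={.status.components.storageCluster.state}{"\n"}'
```

On this cluster, all three components report Ready.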
```
$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   26h   Ready              2022-06-27T08:19:01Z

$ oc get cephblockpool
NAME                                                                  PHASE
cephblockpool-storageconsumer-7c25e752-8ce3-4470-bc36-391d2404417e    Ready

$ oc get sc
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   26h
gp2-csi         ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   26h
gp3-csi         ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   26h
```

This fix is only required in 4.11, since a different fix for 4.10.z is addressed in https://bugzilla.redhat.com/show_bug.cgi?id=2078715. Hence removing the 4.10.z? flag.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.11.0 security, enhancement, & bugfix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6156

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
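As an addendum to the verification steps above: the individual readiness checks can be combined into a single scripted pass. This is only a minimal sketch, not part of the product or the QE automation; the resource names (openshift-storage, ocs-storagecluster) are the ones from this run, and the loop bounds and timeout are arbitrary assumptions:

```bash
#!/usr/bin/env bash
# Sketch of a scripted version of the manual checks above.
set -euo pipefail
ns=openshift-storage

# Every CSV in the namespace should report the Succeeded phase.
oc -n "$ns" get csv \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
  awk '$2 != "Succeeded" {print "CSV not ready: " $1; bad = 1} END {exit bad}'

# Poll until the storage cluster reports the Ready phase (arbitrary ~10 min cap).
phase=""
for _ in $(seq 1 60); do
  phase=$(oc -n "$ns" get storagecluster ocs-storagecluster \
    -o jsonpath='{.status.phase}')
  [ "$phase" = "Ready" ] && break
  sleep 10
done
echo "storagecluster phase: $phase"
[ "$phase" = "Ready" ]
```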