Description of problem:

After uninstallation of the provider QE add-on, re-initiating the add-on installation gets stuck in the Installing state and fails to install any resources in the openshift-storage namespace.

Version-Release number of selected component (if applicable):

OCP 4.0.24

$ oc get csv -n openshift-storage -o json ocs-operator.v4.10.0 | jq '.metadata.labels["full_version"]'
"4.10.0-206"

$ oc get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.0                      NooBaa Operator               4.10.0                                                      Succeeded
ocs-operator.v4.10.0                      OpenShift Container Storage   4.10.0                                                      Succeeded
ocs-osd-deployer.v2.0.0                   OCS OSD Deployer              2.0.0                                                       Succeeded
odf-operator.v4.10.0                      OpenShift Data Foundation     4.10.0                                                      Succeeded
ose-prometheus-operator.4.8.0             Prometheus Operator           4.8.0                                                       Succeeded
route-monitor-operator.v0.1.406-54ff884   Route Monitor Operator        0.1.406-54ff884   route-monitor-operator.v0.1.404-e29b74b   Succeeded

$ oc describe csv ocs-osd-deployer.v2.0.0 | grep -i image
  Mediatype:  image/svg+xml
  Image:      gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
  Image:      quay.io/osd-addons/ocs-osd-deployer:2.0.0-1
  Image:      quay.io/osd-addons/ocs-osd-deployer:2.0.0-1

How reproducible:
2/2

Steps to Reproduce:
1. Create a provider-consumer setup
2. Uninstall the consumer add-on
3. Uninstall the provider add-on
4. Ensure the Install button is available and no state is shown on the add-on detail page of the provider-qe add-on
5. Reinstall the add-on

Actual results:
The openshift-storage namespace exists, but no pods are running in it.

Expected results:
ODF should install successfully.

Additional info:
Discussion in the engineering room before raising this BZ: https://chat.google.com/room/AAAASHA9vWs/gtIYwCL0fn0

QE observations during the BZ reproducer:
1. Uninstalled the add-on from the OCM UI and it succeeded.
2. The UI now gives the option to install again. Tried to install, but nothing starts.

Observation - the namespace deletion from observation 1 was still stuck:
----------------------------------------------------------
status:
  conditions:
  - lastTransitionTime: "2022-03-24T12:02:00Z"
    message: All resources successfully discovered
    reason: ResourcesDiscovered
    status: "False"
    type: NamespaceDeletionDiscoveryFailure
  - lastTransitionTime: "2022-03-24T12:02:00Z"
    message: All legacy kube types successfully parsed
    reason: ParsedGroupVersions
    status: "False"
    type: NamespaceDeletionGroupVersionParsingFailure
  - lastTransitionTime: "2022-03-24T12:02:00Z"
    message: All content successfully deleted, may be waiting on finalization
    reason: ContentDeleted
    status: "False"
    type: NamespaceDeletionContentFailure
  - lastTransitionTime: "2022-03-24T12:02:00Z"
    message: 'Some resources are remaining: configmaps. has 1 resource instances'
    reason: SomeResourcesRemain
    status: "True"
    type: NamespaceContentRemaining
  - lastTransitionTime: "2022-03-24T12:02:00Z"
    message: 'Some content in the namespace has finalizers remaining: ceph.rook.io/disaster-protection in 1 resource instances'
    reason: SomeFinalizersRemain
    status: "True"
    type: NamespaceFinalizersRemaining
  phase: Terminating
----------------------------------------------------------
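For reference, a minimal sketch of how this stuck state can be inspected directly (generic oc commands, not part of the original triage; only the namespace and ConfigMap names already shown in this report are assumed):

# Show why the namespace is stuck in Terminating
oc get namespace openshift-storage -o jsonpath='{.status.conditions}'

# List every namespaced object still present in the namespace
oc api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 oc get --show-kind --ignore-not-found -n openshift-storage

# Show the finalizer holding the remaining ConfigMap
oc get configmap rook-ceph-mon-endpoints -n openshift-storage -o jsonpath='{.metadata.finalizers}'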
Output after nearly 30 minutes of the reinstallation attempt:

$ oc get secret -n openshift-storage
No resources were found in openshift-storage namespace.

$ oc get cm -n openshift-storage
NAME                      DATA   AGE
rook-ceph-mon-endpoints   4      5h57m
----------------------------------------------------
$ oc get cm -o yaml
apiVersion: v1
items:
- apiVersion: v1
  data:
    csi-cluster-config-json: '[{"clusterID":"openshift-storage","monitors":["10.0.164.129:6789","10.0.133.3:6789","10.0.210.56:6789"]}]'
    data: a=10.0.164.129:6789,b=10.0.133.3:6789,c=10.0.210.56:6789
    mapping: '{"node":{"a":{"Name":"ip-10-0-164-129.us-east-2.compute.internal","Hostname":"ip-10-0-164-129.us-east-2.compute.internal","Address":"10.0.164.129"},"b":{"Name":"ip-10-0-133-3.us-east-2.compute.internal","Hostname":"ip-10-0-133-3.us-east-2.compute.internal","Address":"10.0.133.3"},"c":{"Name":"ip-10-0-210-56.us-east-2.compute.internal","Hostname":"ip-10-0-210-56.us-east-2.compute.internal","Address":"10.0.210.56"}}}'
    maxMonId: "2"
  kind: ConfigMap
  metadata:
    creationTimestamp: "2022-03-24T07:03:53Z"
    deletionGracePeriodSeconds: 0
    deletionTimestamp: "2022-03-24T11:46:57Z"
    finalizers:
    - ceph.rook.io/disaster-protection
    name: rook-ceph-mon-endpoints
    namespace: openshift-storage
    ownerReferences:
    - apiVersion: ceph.rook.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: CephCluster
      name: ocs-storagecluster-cephcluster
      uid: 344a4a26-be67-47b7-8060-de8de9102d18
    resourceVersion: "363649"
    uid: 28f9a725-39b3-467b-a1aa-24f1a72ba0e7
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
------------------------------------------------------
The CephCluster owns this ConfigMap; the CephCluster was deleted, but the ConfigMap was not:

ownerReferences:
- apiVersion: ceph.rook.io/v1
  blockOwnerDeletion: true
  controller: true
  kind: CephCluster
  name: ocs-storagecluster-cephcluster
  uid: 344a4a26-be67-47b7-8060-de8de9102d18
------------------------------------------------------
Initial analysis from Engineering:

[Dhruv Bindra] There are two issues here:
1. Namespace deletion is blocked by a finalizer on a ConfigMap that is owned by the CephCluster (the CephCluster was deleted but the ConfigMap was not).
2. The namespace was not deleted even though the status of the add-on was "uninstalled".
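The analysis above points at the ceph.rook.io/disaster-protection finalizer as the blocker. As a rough, unverified sketch of a last-resort manual cleanup (Rook is normally expected to remove this finalizer itself during teardown, and it exists to protect mon data, so this assumes the CephCluster is already gone and the data is no longer needed), the finalizer could be cleared so the namespace can finish terminating:

# Last-resort cleanup only; normally the Rook operator removes this finalizer during uninstall
oc patch configmap rook-ceph-mon-endpoints -n openshift-storage \
  --type=merge -p '{"metadata":{"finalizers":null}}'

# The namespace should then complete termination, allowing a clean reinstall
oc get ns openshift-storage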
This should have a very low priority, as it will have no impact on customers. The provider add-on will never be installed on an existing cluster; provider clusters are delivered as a package of a newly provisioned cluster plus the add-on. Based on the above, setting the priority and severity to Low.
As per the current release features, the provider is installed as a service, not as an add-on. Hence, individually installing and reinstalling the provider add-on is not a reproducible scenario with current releases, and this bug is no longer valid or applicable. Closing this BZ as Not a Bug.