Description of problem:

When the provider is uninstalled before the consumer, the consumer storage cluster state changes to Error and ocs-osd-deployer shows the Installing status; however, the consumer add-on still shows `ready` status:

$ rosa list add-on -c sgatfane-c1-am | grep ocs-consumer-qe
ocs-consumer-qe    Red Hat OpenShift Data Foundation Managed Service Consumer (QE)    ready

Version-Release number of selected component (if applicable):

OPENSHIFT_VERSION: 4.10.9

========CSV ======
NAME                              DISPLAY                       VERSION   REPLACES                  PHASE
mcg-operator.v4.10.0              NooBaa Operator               4.10.0                              Succeeded
ocs-operator.v4.10.0              OpenShift Container Storage   4.10.0                              Succeeded
ocs-osd-deployer.v2.0.1           OCS OSD Deployer              2.0.1     ocs-osd-deployer.v2.0.0   Succeeded
odf-csi-addons-operator.v4.10.0   CSI Addons                    4.10.0                              Succeeded
odf-operator.v4.10.0              OpenShift Data Foundation     4.10.0                              Succeeded
ose-prometheus-operator.4.8.0     Prometheus Operator           4.8.0                               Succeeded

How reproducible:
1/1

Steps to Reproduce:
1. Create an appliance-model provider cluster using the `rosa create service ...` command.
2. Create a consumer cluster with the consumer add-on installed on it.
3. Ensure that the provider and consumer are connected and the add-on is in the ready state.
4. Uninstall the provider service using `rosa delete service --id= `.

Actual results:
The consumer add-on shows `ready` status.

Expected results:
The consumer add-on should show an appropriate error/failed status.

Additional info:

Mon Apr 25 15:38:34 UTC 2022
--------------
========CSV ======
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.0                      NooBaa Operator               4.10.0                                                      Succeeded
ocs-operator.v4.10.0                      OpenShift Container Storage   4.10.0                                                      Succeeded
ocs-osd-deployer.v2.0.1                   OCS OSD Deployer              2.0.1             ocs-osd-deployer.v2.0.0                   Installing
odf-csi-addons-operator.v4.10.0           CSI Addons                    4.10.0                                                      Succeeded
odf-operator.v4.10.0                      OpenShift Data Foundation     4.10.0                                                      Succeeded
ose-prometheus-operator.4.8.0             Prometheus Operator           4.8.0                                                       Succeeded
route-monitor-operator.v0.1.408-c2256a2   Route Monitor Operator        0.1.408-c2256a2   route-monitor-operator.v0.1.406-54ff884   Succeeded
--------------
=======PODS ======
NAME                                               READY   STATUS    RESTARTS     AGE   IP             NODE                           NOMINATED NODE   READINESS GATES
alertmanager-managed-ocs-alertmanager-0            2/2     Running   0            9h    10.128.2.48    ip-10-0-128-13.ec2.internal    <none>           <none>
alertmanager-managed-ocs-alertmanager-1            2/2     Running   0            9h    10.128.2.50    ip-10-0-128-13.ec2.internal    <none>           <none>
alertmanager-managed-ocs-alertmanager-2            2/2     Running   0            9h    10.128.2.51    ip-10-0-128-13.ec2.internal    <none>           <none>
csi-addons-controller-manager-6849d8f79d-t6z4j     2/2     Running   0            9h    10.129.2.8     ip-10-0-158-82.ec2.internal    <none>           <none>
csi-cephfsplugin-dk6n9                             3/3     Running   0            9h    10.0.128.13    ip-10-0-128-13.ec2.internal    <none>           <none>
csi-cephfsplugin-fbmlc                             3/3     Running   3            9h    10.0.166.153   ip-10-0-166-153.ec2.internal   <none>           <none>
csi-cephfsplugin-provisioner-7ccffbd5d5-hdz4t      6/6     Running   0            9h    10.129.2.14    ip-10-0-158-82.ec2.internal    <none>           <none>
csi-cephfsplugin-provisioner-7ccffbd5d5-zdxr5      6/6     Running   0            9h    10.128.2.57    ip-10-0-128-13.ec2.internal    <none>           <none>
csi-cephfsplugin-sh9hx                             3/3     Running   0            9h    10.0.158.82    ip-10-0-158-82.ec2.internal    <none>           <none>
csi-rbdplugin-75h2m                                4/4     Running   0            9h    10.0.128.13    ip-10-0-128-13.ec2.internal    <none>           <none>
csi-rbdplugin-9lbnf                                4/4     Running   4            9h    10.0.166.153   ip-10-0-166-153.ec2.internal   <none>           <none>
csi-rbdplugin-h9tsx                                4/4     Running   0            9h    10.0.158.82    ip-10-0-158-82.ec2.internal    <none>           <none>
csi-rbdplugin-provisioner-6455fd4867-2k5lm         7/7     Running   0            9h    10.129.2.22    ip-10-0-158-82.ec2.internal    <none>           <none>
csi-rbdplugin-provisioner-6455fd4867-wbdwq         7/7     Running   0            9h    10.128.2.58    ip-10-0-128-13.ec2.internal    <none>           <none>
ocs-metrics-exporter-b654d74b5-gskmc               1/1     Running   0            9h    10.128.2.54    ip-10-0-128-13.ec2.internal    <none>           <none>
ocs-operator-7dfcf95b4d-s8lpn                      1/1     Running   0            9h    10.128.2.55    ip-10-0-128-13.ec2.internal    <none>           <none>
ocs-osd-controller-manager-7bd447f6d7-j8lkb        2/3     Running   0            9h    10.128.2.43    ip-10-0-128-13.ec2.internal    <none>           <none>
odf-console-6d676ff745-mm5rv                       1/1     Running   0            9h    10.128.2.44    ip-10-0-128-13.ec2.internal    <none>           <none>
odf-operator-controller-manager-54c94476f4-2jds7   2/2     Running   0            9h    10.128.2.38    ip-10-0-128-13.ec2.internal    <none>           <none>
prometheus-managed-ocs-prometheus-0                2/2     Running   1 (9h ago)   9h    10.128.2.45    ip-10-0-128-13.ec2.internal    <none>           <none>
prometheus-operator-6b8cbc545f-jkzq6               1/1     Running   0            9h    10.128.2.37    ip-10-0-128-13.ec2.internal    <none>           <none>
rook-ceph-operator-7cd868ddfc-7j7qn                1/1     Running   0            9h    10.128.2.53    ip-10-0-128-13.ec2.internal    <none>           <none>
rook-ceph-tools-56b46d6f99-ppgkj                   1/1     Running   0            9h    10.0.128.13    ip-10-0-128-13.ec2.internal    <none>           <none>
--------------
======= machine ==========
NAMESPACE               NAME                                           PHASE     TYPE         REGION      ZONE         AGE   NODE                           PROVIDERID                              STATE
openshift-machine-api   sgatfane-c1-am-lw9bp-infra-us-east-1a-h85c7    Running   r5.xlarge    us-east-1   us-east-1a   9h    ip-10-0-136-17.ec2.internal    aws:///us-east-1a/i-00de9ba0043b14865   running
openshift-machine-api   sgatfane-c1-am-lw9bp-infra-us-east-1b-pdsp5    Running   r5.xlarge    us-east-1   us-east-1b   9h    ip-10-0-149-242.ec2.internal   aws:///us-east-1b/i-0bb38329f21344aa6   running
openshift-machine-api   sgatfane-c1-am-lw9bp-infra-us-east-1c-gt992    Running   r5.xlarge    us-east-1   us-east-1c   9h    ip-10-0-167-255.ec2.internal   aws:///us-east-1c/i-0d8f21d242181e971   running
openshift-machine-api   sgatfane-c1-am-lw9bp-master-0                  Running   m5.2xlarge   us-east-1   us-east-1a   9h    ip-10-0-142-95.ec2.internal    aws:///us-east-1a/i-027db5a2f8a0ce506   running
openshift-machine-api   sgatfane-c1-am-lw9bp-master-1                  Running   m5.2xlarge   us-east-1   us-east-1b   9h    ip-10-0-155-32.ec2.internal    aws:///us-east-1b/i-05c0083b8549c46d9   running
openshift-machine-api   sgatfane-c1-am-lw9bp-master-2                  Running   m5.2xlarge   us-east-1   us-east-1c   9h    ip-10-0-164-196.ec2.internal   aws:///us-east-1c/i-038065dd3f8ba89d2   running
openshift-machine-api   sgatfane-c1-am-lw9bp-worker-us-east-1a-6js9n   Running   m5.2xlarge   us-east-1   us-east-1a   9h    ip-10-0-128-13.ec2.internal    aws:///us-east-1a/i-0147c6e5e10dac14b   running
openshift-machine-api   sgatfane-c1-am-lw9bp-worker-us-east-1b-rhgrp   Running   m5.2xlarge   us-east-1   us-east-1b   9h    ip-10-0-158-82.ec2.internal    aws:///us-east-1b/i-049cfbdfe3446cda3   running
openshift-machine-api   sgatfane-c1-am-lw9bp-worker-us-east-1c-4tmnm   Running   m5.2xlarge   us-east-1   us-east-1c   9h    ip-10-0-166-153.ec2.internal   aws:///us-east-1c/i-0be9dd1ff69bf6f0c   running
--------------
======= PVC ==========
--------------
======= storagecluster ==========
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   9h    Error   true       2022-04-25T06:29:59Z
--------------
======= cephcluster ==========
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE       MESSAGE                                     HEALTH       EXTERNAL
ocs-storagecluster-cephcluster                                9h    Connected   Failed to configure external ceph cluster   HEALTH_ERR   true
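For quick reference, the reproduction steps above can be sketched as a shell script. This is a minimal sketch, not the exact commands used: the consumer cluster name and the provider service id are placeholders, the elided arguments (`...`) from the report are left elided, and the `run` wrapper only echoes each command (a dry run) so the flow can be read without a real cluster; swap the echo for real execution to reproduce.

```shell
#!/usr/bin/env bash
# Hedged sketch of the reproduction flow; names and ids are placeholders,
# not values taken from this report.
set -euo pipefail

# Dry-run wrapper: prints each command instead of executing it.
# Replace the echo with "$@" to actually run the commands.
run() { echo "+ $*"; }

CONSUMER_CLUSTER="my-consumer"   # placeholder cluster name
SERVICE_ID="REPLACE_WITH_ID"     # placeholder provider service id

# 1. Create the appliance-model provider cluster.
run rosa create service ...

# 2. Create a consumer cluster with the consumer add-on installed on it.
run rosa create cluster --cluster-name "$CONSUMER_CLUSTER" --sts

# 3. Confirm provider and consumer are connected and the add-on is ready.
run rosa list add-on -c "$CONSUMER_CLUSTER"

# 4. Uninstall the provider service FIRST (this triggers the bug).
run rosa delete service --id="$SERVICE_ID"

# Observed afterwards: StorageCluster phase goes to Error and the
# ocs-osd-deployer CSV to Installing, yet the add-on still reports "ready".
run oc get storagecluster -n openshift-storage
run rosa list add-on -c "$CONSUMER_CLUSTER"
```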
Any alerts from the consumer? Were there any pods using storage from the provider?
(In reply to Sahina Bose from comment #1)
> Any alerts from the consumer? Were there any pods using storage from the provider?

No SendGrid alert was received. PagerDuty was configured for this cluster, and no alert was noticed there for this either.
Re-test this issue with the latest build
I tried to reproduce this scenario for verification. The exact scenario is not reproducible: as per comment #6, even a force delete of the openshift-storage project and a force delete of the cluster using the OCM API command do not help to delete the provider cluster. The provider cluster is stuck in the uninstalling state.

$ rosa list cluster
ID                                 NAME             STATE          TOPOLOGY
23nk3phngdv3i1m8alik7niqttfde635   sgatfane-mp13    uninstalling   Classic (STS)
23nk4lkjovhv8qm23v4fpommvofk3src   sgatfane-cmm13   ready          Classic (STS)

After a force delete of the openshift-storage namespace, access to the cluster is lost ('Unable to connect to the server: Service Unavailable'), but the provider cluster remains stuck in the uninstalling state, and the consumer add-on status remains in a ready state.

Consumer state while the provider is stuck in the uninstalling state:

$ oc get csv
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.12                     NooBaa Operator               4.10.12           mcg-operator.v4.10.11                     Succeeded
observability-operator.v0.0.20            Observability Operator        0.0.20            observability-operator.v0.0.19            Succeeded
ocs-operator.v4.10.9                      OpenShift Container Storage   4.10.9            ocs-operator.v4.10.8                      Succeeded
ocs-osd-deployer.v2.0.13                  OCS OSD Deployer              2.0.13            ocs-osd-deployer.v2.0.12                  Installing
odf-csi-addons-operator.v4.10.9           CSI Addons                    4.10.9            odf-csi-addons-operator.v4.10.8           Succeeded
odf-operator.v4.10.9                      OpenShift Data Foundation     4.10.9            odf-operator.v4.10.8                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.500-6152b76   Route Monitor Operator        0.1.500-6152b76   route-monitor-operator.v0.1.498-e33e391   Succeeded

$ oc get cephcluster
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE       MESSAGE                                     HEALTH       EXTERNAL
ocs-storagecluster-cephcluster                                12h   Connected   Failed to configure external ceph cluster   HEALTH_ERR   true

$ oc get storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   14h   Error   true       2023-05-15T03:21:54Z

As per multiple comments in this BZ, these states are expected when the provider is uninstalled before the consumer. Hence marking this BZ as verified.