+++ This bug was initially created as a clone of Bug #1968423 +++

Description of problem:

Once the operator is installed in the cluster, I try to delete it using `oc delete ns/assisted-installer` and the command never ends.

Version-Release number of selected component (if applicable):

How reproducible:

I tried only once; this is the only env I have and I can't redeploy the operator without destroying the whole cluster.

Steps to Reproduce:
1. Install the operator.
2. `oc delete ns/assisted-installer`

Actual results:

The command never ends.

$ oc describe ns/assisted-installer:
```
Name:         assisted-installer
Labels:       kubernetes.io/metadata.name=assisted-installer
              name=assisted-installer
Annotations:  openshift.io/sa.scc.mcs: s0:c25,c20
              openshift.io/sa.scc.supplemental-groups: 1000640000/10000
              openshift.io/sa.scc.uid-range: 1000640000/10000
Status:       Terminating

No resource quota.
No LimitRange resource.
```

Expected results:

The namespace should be deleted.

Additional info:

--- Additional comment from ybettan on 20210607T11:45:23

# oc get ns/assisted-installer -o yaml
```
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"name":"assisted-installer"},"name":"assisted-installer"}}
    openshift.io/sa.scc.mcs: s0:c25,c20
    openshift.io/sa.scc.supplemental-groups: 1000640000/10000
    openshift.io/sa.scc.uid-range: 1000640000/10000
  creationTimestamp: "2021-06-07T06:04:52Z"
  deletionTimestamp: "2021-06-07T09:49:29Z"
  labels:
    kubernetes.io/metadata.name: assisted-installer
    name: assisted-installer
  name: assisted-installer
  resourceVersion: "319665"
  uid: 40edd6ab-8361-47fd-9740-c7aaff74682c
spec:
  finalizers:
  - kubernetes
status:
  conditions:
  - lastTransitionTime: "2021-06-07T09:49:41Z"
    message: All resources successfully discovered
    reason: ResourcesDiscovered
    status: "False"
    type: NamespaceDeletionDiscoveryFailure
  - lastTransitionTime: "2021-06-07T09:49:41Z"
    message: All legacy kube types successfully parsed
    reason: ParsedGroupVersions
    status: "False"
    type: NamespaceDeletionGroupVersionParsingFailure
  - lastTransitionTime: "2021-06-07T09:50:07Z"
    message: All content successfully deleted, may be waiting on finalization
    reason: ContentDeleted
    status: "False"
    type: NamespaceDeletionContentFailure
  - lastTransitionTime: "2021-06-07T09:49:41Z"
    message: 'Some resources are remaining: agentclusterinstalls.extensions.hive.openshift.io
      has 1 resource instances, agents.agent-install.openshift.io has 1 resource instances,
      clusterdeployments.hive.openshift.io has 1 resource instances'
    reason: SomeResourcesRemain
    status: "True"
    type: NamespaceContentRemaining
  - lastTransitionTime: "2021-06-07T09:49:41Z"
    message: 'Some content in the namespace has finalizers remaining: agent.agent-install.openshift.io/ai-deprovision
      in 1 resource instances, agentclusterinstall.agent-install.openshift.io/ai-deprovision
      in 1 resource instances, clusterdeployments.agent-install.openshift.io/ai-deprovision
      in 1 resource instances'
    reason: SomeFinalizersRemain
    status: "True"
    type: NamespaceFinalizersRemaining
  phase: Terminating
```

--- Additional comment from fpercoco on 20210609T06:09:42

Looks like this is happening because the assisted-service pod may have been deleted before the rest of the resources. This leaves the rest of the resources stuck, since their finalizers can never complete.

> I tried only once, this is the only env I have and I can't redeploy the operator without destroying the whole cluster.

You should be able to remove the finalizers from the various resources.

For 4.8
==
We may not be able to provide a better user experience for this case in 4.8.0. Instead, we could document how to remove the finalizers and what a "healthy" cleanup workflow looks like.

For 4.9
==
We may want to think about something that would provide a better user experience.
The above being said, I think this is not specific to the operator/platform but rather to the integration with Hive and the other CRs.

--- Additional comment from mfilanov on 20210609T10:20:46

How does Hive handle it? They probably have the same issue. @dgoodwin?

--- Additional comment from dgoodwin on 20210609T11:19:02

Hive doesn't do anything to handle this; I believe this is working as expected. Finalizers block deletion until the controllers that placed them remove them. If the controllers are dead, the finalizers can't be removed, and I don't think there's any viable option there. The best you could do would be to document a clean teardown process.
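As a concrete illustration of the workaround mentioned above (manually removing the finalizers so the namespace can finish terminating), here is a minimal sketch. The resource types are taken from the `NamespaceContentRemaining` condition in this bug; the namespace name and the empty-finalizers merge patch are assumptions. The commands are only printed (dry run) — drop the `echo` to actually patch, and note that force-clearing finalizers skips whatever deprovision logic they guarded.

```shell
#!/bin/sh
# Sketch: clear the finalizers keeping assisted-installer resources from deleting.
# Dry run: the oc commands are printed, not executed.
NS=assisted-installer                        # namespace stuck in Terminating (assumption)
PATCH='{"metadata":{"finalizers":[]}}'       # merge patch that drops all finalizers

# The three resource types reported under NamespaceContentRemaining:
for crd in agentclusterinstalls.extensions.hive.openshift.io \
           agents.agent-install.openshift.io \
           clusterdeployments.hive.openshift.io; do
    echo "oc get $crd -n $NS -o name | xargs -r -I{} oc patch {} -n $NS --type merge -p '$PATCH'"
done
```

Once the finalizers are gone, the namespace controller should complete the deletion on its own.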
Needs to be verified on a downstream ACM version. ACM has not cut us a new release image. This bug should not block the release.
Bug needs to be verified on downstream ACM; it should not block the release.
Moving bug off the OpenShift product to RHACM. This bug will need to be verified on a downstream RHACM build with the Assisted Service image.
@frolland Do you know why this clone is on the ACM product and not AI's component under OCP?
Looks like @bjacot moved it onto RHACM. Assuming that was done because it needs to be verified on RHACM, but this update hasn't yet been included in an advisory. I think this should still be against the AI component under OCP. What do you think, @bjacot?
Since Crystal mentioned this was fixed in ACM 2.3.1 (i.e. the GA), can the bugzilla be closed?
G2Bsync comment 925985645 from CrystalChun, Thu, 23 Sep 2021 16:49:56 UTC: Was already included in ACM 2.3 GA. Picked up in https://github.com/open-cluster-management/backlog/issues/13873
It doesn't look like there's a way to reproduce this downstream. RHACM has AI bundled in it. Deleting the RHACM namespace requires more steps than just `oc delete ns/rhacm` (see the RHACM documentation here: https://github.com/open-cluster-management/rhacm-docs/blob/64223d8f987ed0a1d9e3d886e3da93cec1dd0fb9/install/uninstall.adoc). Deleting RHACM through the appropriate steps removes AI. According to @bjacot, who has tried it, this method works.
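For anyone verifying this, a hedged sketch of the general way to see what is still blocking a Terminating namespace (the namespace name is an example; the `api-resources | xargs` enumeration is a generic Kubernetes technique, not an RHACM-specific step). As above, the commands are only printed — remove the `echo` to run them against a logged-in cluster:

```shell
#!/bin/sh
# Sketch: inspect why a namespace is stuck in Terminating. Dry run: commands are printed.
NS=assisted-installer   # example namespace (assumption)

# Show the deletion conditions (SomeResourcesRemain / SomeFinalizersRemain etc.):
echo "oc get ns $NS -o jsonpath='{.status.conditions}'"

# Enumerate every namespaced resource type and list any leftovers in the namespace:
echo "oc api-resources --verbs=list --namespaced -o name | xargs -r -n 1 oc get -n $NS --ignore-not-found --show-kind"
```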
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days