Bug 1881874 - openshift-cnv namespace is getting stuck if the user tries to delete it while CNV is running
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Installation
Version: 2.4.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.6.0
Assignee: Simone Tiraboschi
QA Contact: Satyajit Bulage
URL:
Whiteboard:
Duplicates: 1901642, 1904147
Depends On:
Blocks:
Reported: 2020-09-23 09:35 UTC by Simone Tiraboschi
Modified: 2024-03-25 16:34 UTC (History)

Fixed In Version: hco-bundle-registry-container-v2.6.0-272
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-10 11:18:00 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt hyperconverged-cluster-operator pull 949 | 0 | None | closed | Add a mutating webhook to protect hco namespace | 2021-02-08 07:59:52 UTC
Red Hat Product Errata | RHSA-2021:0799 | 0 | None | None | None | 2021-03-10 11:19:14 UTC

Description Simone Tiraboschi 2020-09-23 09:35:25 UTC
Description of problem:
openshift-cnv namespace is getting stuck if the user tries to delete it while CNV is running.

In OpenShift Virtualization docs we properly document the uninstall process starting from HCO CR, but we see more than one user trying to directly delete openshift-cnv namespace.

This is going to end with the openshift-cnv namespace getting stuck because:
- the operators are deleted as a side effect of the namespace deletion
- the HCO CR (and others) contain a finalizer that can only be removed by the HCO operator, which is not there anymore.
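
The stuck state described above can be inspected from the command line. The following is a minimal sketch, not part of the original report: the helper names `namespace_phase` and `show_hco_finalizers` are hypothetical, and it assumes `oc` is logged in to the cluster and the HyperConverged CRD is still installed.

```shell
# Hypothetical helpers (names are illustrative, not from the bug report).
# Assumes `oc` is logged in and the HyperConverged CRD still exists.

# Print the phase of a namespace; a stuck deletion stays in "Terminating".
namespace_phase() {
  local ns="${1:-openshift-cnv}"
  oc get namespace "$ns" -o jsonpath='{.status.phase}'
}

# Print each HyperConverged CR in the namespace with its finalizers.
show_hco_finalizers() {
  local ns="${1:-openshift-cnv}"
  oc get hyperconverged -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'
}
```

Calling `namespace_phase` and then `show_hco_finalizers` with no arguments targets openshift-cnv. A CR that still lists finalizers while no HCO operator pod is running matches the situation in this report.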

Version-Release number of selected component (if applicable):
2.4.1

How reproducible:
100%

Steps to Reproduce:
1. deploy OpenShift Virtualization
2. while it's still running, try to directly delete openshift-cnv namespace

Actual results:
openshift-cnv deletion is going to be stuck until the user manually removes the finalizers set on remaining resources
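
For reference, manually removing a finalizer is typically done with a merge patch. This is a sketch under assumptions, not a recommended procedure from this report: `clear_finalizers` is a hypothetical helper name, and clearing finalizers skips whatever cleanup the operator would otherwise have performed.

```shell
# Hypothetical helper (name is illustrative). Clearing finalizers bypasses
# the operator's own cleanup, so treat this as a last resort.
clear_finalizers() {
  local kind="$1" name="$2" ns="${3:-openshift-cnv}"
  oc patch "$kind" "$name" -n "$ns" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}
```

For example, `clear_finalizers hyperconverged kubevirt-hyperconverged` would target the HCO CR in openshift-cnv. Other remaining resources with finalizers would need the same treatment before the namespace can finish terminating.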

Expected results:
The user is prevented from removing openshift-cnv namespace while it still contains HCO CR.

Additional info:
This bug is not about implementing a cascading deletion of OpenShift Virtualization triggered by deleting the openshift-cnv namespace; it is only about preventing that deletion.
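
The expected behavior can be approximated on the client side with a guard that refuses to delete the namespace while a HyperConverged CR is still present. This is only an illustrative sketch: `guarded_delete_namespace` is a hypothetical helper, it does not protect against other users or tools, and it is not the server-side webhook referenced in the Links section.

```shell
# Hypothetical client-side guard (name is illustrative); NOT the webhook fix.
# Refuses to delete the namespace while any HyperConverged CR remains in it.
guarded_delete_namespace() {
  local ns="${1:-openshift-cnv}"
  local remaining
  # An error here (for example, the CRD is not installed) yields empty output.
  remaining="$(oc get hyperconverged -n "$ns" -o name 2>/dev/null)"
  if [ -n "$remaining" ]; then
    echo "refusing to delete $ns: $remaining still exists" >&2
    return 1
  fi
  oc delete namespace "$ns"
}
```

With this guard, the user has to delete the HCO CR first and wait for the operator to remove it before the namespace deletion is attempted.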

Comment 1 Inbar Rose 2020-09-23 12:35:07 UTC
proposed solution is simply to prevent the deletion of the namespace at this stage

Comment 2 Simone Tiraboschi 2020-09-25 09:34:09 UTC
I think we should target this to 2.5.z, not sure we can really do it for 2.5.0.

Setting 2.6.0 just because 2.5.z is still not visible in the dropdown list.

Comment 5 Inbar Rose 2020-12-09 13:40:03 UTC
*** Bug 1904147 has been marked as a duplicate of this bug. ***

Comment 9 Simone Tiraboschi 2021-01-12 14:29:10 UTC
*** Bug 1901642 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2021-03-10 11:18:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799

Comment 15 ibesso 2021-03-11 09:00:43 UTC
I encountered this issue for the first time just now, although I have installed and uninstalled CNV many times.
The course of action is as follows:
1. run the test_install_cnv test (still in development) with OSBS (registry-proxy.engineering.redhat.com/rh-osbs/iib:50303) - it starts from scratch by creating the openshift-cnv namespace, operatorgroup, imagecontentsourcepolicy, etc.

2. sometimes the test did not pass, so when it failed I would use the delete commands that would help me start from scratch.

3. even when it passed, I would use the same commands to destroy everything required so the install test will run again and possibly pass without conflicts (i.e. to prevent it from failing when trying to create an entity that already exists).

4. run the following commands as one command:
oc delete imagecontentsourcepolicy --all ; \
oc delete hyperconverged kubevirt-hyperconverged -n openshift-cnv ; \
oc delete ip --all -n openshift-cnv ; \
oc delete catsrc rh-osbs-operators redhat-operators-stage -n openshift-marketplace; \
oc delete subs kubevirt-hyperconverged -n openshift-cnv ; \
oc delete og --all -n openshift-cnv ; \
oc delete pdb rook-ceph-osd-rack-rack0 rook-ceph-osd-rack-rack1 rook-ceph-osd-rack-rack2 -n openshift-storage ; \
oc delete csv kubevirt-hyperconverged-operator.v2.6.0 -n openshift-cnv ; \
oc delete pod `oc get pod -n openshift-cnv |grep hco |awk '{print $1}'` -n openshift-cnv ; \
oc delete crd hyperconvergeds.hco.kubevirt.io ; \
oc delete ns openshift-cnv


[cnv-qe-jenkins@iuo01-issac-fp49k-executor ~]$ oc version
Client Version: 4.7.0-202102032256.p0-c66c03f
Server Version: 4.7.0
Kubernetes Version: v1.20.0+bd9e442

Comment 16 ibesso 2021-03-11 10:02:52 UTC
Please disregard my last comment.
It turned out that the problem was a quite different scenario: the CRD was deleted while the CR remained.

I will try to reproduce it and file a new bug, should it reproduce.

Comment 17 Peter Lauterbach 2021-03-19 18:36:33 UTC
Simone, can we back port this fix to a CNV 2.5.z stream?

Comment 18 Simone Tiraboschi 2021-03-22 14:26:49 UTC
(In reply to Peter Lauterbach from comment #17)
> Simone, can we back port this fix to a CNV 2.5.z stream?

Unfortunately, it's not a minor change, and we refactored a lot of code between 2.5 and 2.6, so backporting it would require considerable effort.

