Bug 1712429 - delete project kubevirt-hyperconverged stucked in Terminating state due to kubevirt apiserver
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 2.3.0
Assignee: Roman Mohr
QA Contact: zhe peng
URL:
Whiteboard:
Duplicates: 1685911
Depends On:
Blocks: 1781293
Reported: 2019-05-21 13:39 UTC by Irina Gulina
Modified: 2020-05-04 19:10 UTC (History)

Fixed In Version: virt-operator-container-v2.3.0-30 hco-bundle-registry-container-v2.2.0-363
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-04 19:10:36 UTC
Target Upstream Version:


Attachments
virt-api marked as ServiceNotFound (15.30 KB, text/plain)
2020-01-27 14:41 UTC, Irina Gulina


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2020:2011 None None None 2020-05-04 19:10:47 UTC

Comment 4 Fabian Deutsch 2019-05-22 20:41:31 UTC
We are aware of this problem, I'm also in favor of getting this fixed, but it's likely not an easy fix.
Ideally kubevirt - or HCO - would detect that we delete the namespace and do a proper cleanup.

Comment 7 Fabian Deutsch 2019-05-24 08:50:37 UTC
This is probably tricky to solve, but still a convenient thing.

Comment 8 Ryan Hallisey 2019-05-28 14:21:58 UTC
I think we would need the virt-operator to do 'oc delete apiservices v1alpha3.subresources.kubevirt.io' when the virt cr is removed.

Comment 9 Fabian Deutsch 2019-05-28 14:25:45 UTC
Marc, the current workaround is to run:
oc delete apiservices v1alpha3.subresources.kubevirt.io


This sounds like the virt-operator does not remove the apiservice registration upon removal, is this the case?
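
Before applying the workaround, it can help to see which aggregated API registrations are actually unavailable, since an APIService whose backing service is gone can keep a namespace in Terminating. A minimal sketch, assuming the default column layout of `oc get apiservices --no-headers` (NAME, SERVICE, AVAILABLE, AGE); the function name `unavailable_apiservices` is hypothetical and not part of any tool:

```shell
# Hypothetical helper: read `oc get apiservices --no-headers` output on stdin
# and print the NAME of every entry whose AVAILABLE column is not "True".
# Assumes the default columns: NAME SERVICE AVAILABLE AGE.
unavailable_apiservices() {
    awk 'NF >= 3 && $3 != "True" { print $1 }'
}

# Usage against a live cluster (not run here):
#   oc get apiservices --no-headers | unavailable_apiservices
```

Any name printed this way is a candidate for the `oc delete apiservices` workaround above, but only after confirming that the component behind it has really been uninstalled.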

Comment 10 Fabian Deutsch 2019-05-28 14:26:37 UTC
Lev, can you please describe and point to the logic that removes KubeVirt?

Comment 11 Lev Veyde 2019-05-28 14:57:32 UTC
(In reply to Fabian Deutsch from comment #10)
> Lev, can you please describe and point to the logic that removes KubeVirt?

We first delete the relevant KubeVirt CRs, which should cause KubeVirt to stop the relevant services - this is done by the HCO operator itself once the HCO CR is deleted.

Then we delete the KubeVirt (and other) operators from the kubevirt-hyperconverged ns (the ones that were manually started).

Basically the current QA flow is the one from comment #2.

Comment 12 Fabian Deutsch 2019-05-29 12:53:15 UTC
Alright, it sounds like we should take care to remove all components if our CR is getting removed.

Comment 19 Irina Gulina 2019-06-03 07:10:41 UTC
@Pan, I would add a reference to that Bug and also duplicate it in the Known issues section, again with a reference to the bug.

Comment 20 Irina Gulina 2019-06-03 22:24:43 UTC
@Lev, you said one shouldn't remove the CSV manually, but in the console there are 'Edit' and 'Delete Cluster Service Version' buttons in Catalog -> Installed Operators -> 3 dots menu next to KubeVirt HyperConverged Cluster Operator. If one hits that Delete button, the CSV will be removed. Should that button exist at all, then?

Comment 21 Lev Veyde 2019-06-05 08:28:25 UTC
(In reply to Irina Gulina from comment #19)
> @Pan, I would add a reference to that Bug and also duplicate it in the Known
> issues section, again with a reference to the bug.

In theory you can delete any resource, and the CSV is no exception; I'm not sure we should start adding "foolproof" protections to the UI.

Comment 23 Irina Gulina 2019-06-10 11:55:34 UTC
Pan, thanks. It looks fine.

Comment 24 Fabian Deutsch 2019-06-20 12:07:27 UTC

*** This bug has been marked as a duplicate of bug 1685911 ***

Comment 28 sgott 2019-07-23 12:13:15 UTC
*** Bug 1685911 has been marked as a duplicate of this bug. ***

Comment 35 Roman Mohr 2020-01-21 13:13:56 UTC
https://github.com/kubevirt/kubevirt/pull/3006 is posted. I am not sure if it will solve the issue for HCO as a whole, because other CNV components (like CDI) could still block the delete.


Michael, I guess you can immediately tell whether you delete the apiserver registration when CDI gets uninstalled.

Comment 36 Michael Henriksen 2020-01-21 17:08:34 UTC
Interesting question, Roman.  CDI mostly depends on owner references to clean up when CDI is removed.  Hence, the CDI CR owns the ApiService.  So it should get deleted when the CDI CR is deleted.  But it's not.  At some point after the ApiService is created it appears to get updated and the owner reference is removed.  Along with other spec data.  Not sure where it's happening but I'm pretty confident it's not caused by a bug in CDI.  The end result is that the apiservice does not get deleted along with the rest of CDI.  Strangely, deleting with kubectl does not delete it either.  Not sure what is going on.  Must be the k8s bug Stu mentioned earlier?

Nevertheless, in my testing, this has not kept me from deleting the CDI install namespace.

One more note, CDI also puts owner references on mutating/validating webhooks and those get cleaned up just as expected.
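
One way to check whether the owner reference is still present on a given APIService is to print its `metadata.ownerReferences`. A minimal sketch; the function name `apiservice_owners` is hypothetical, and the APIService name is left as a parameter because this thread does not give the name of CDI's registration:

```shell
# Hypothetical helper: print "kind/name" for each owner reference on the
# APIService named in $1. Prints nothing if the owner references are missing.
apiservice_owners() {
    oc get apiservice "$1" \
        -o jsonpath='{range .metadata.ownerReferences[*]}{.kind}{"/"}{.name}{"\n"}{end}'
}

# Usage (not run here):
#   apiservice_owners <apiservice-name>
```

Running it right after install and again some time later would show whether the owner reference disappears, as described above.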

Comment 37 Irina Gulina 2020-01-27 14:41:25 UTC
Created attachment 1655693 [details]
virt-api marked as ServiceNotFound

Update: in 2.2, after deleting the HCO CR, the virt-api service is marked as ServiceNotFound but is not deleted. Its YAML is still available after some time passes.

Comment 38 sgott 2020-02-19 17:16:52 UTC
Roman,

What's the state of this?

Comment 39 Roman Mohr 2020-02-19 17:18:57 UTC
This should be addressed in 2.3 with https://github.com/kubevirt/kubevirt/pull/3006.

Comment 40 zhe peng 2020-02-27 04:29:34 UTC
Verified with build:
$ oc version
Client Version: 4.4.0-0.nightly-2020-02-17-022408
Server Version: 4.4.0-0.nightly-2020-02-22-102956
Kubernetes Version: v1.17.1

Steps:
1. Deploy HCO
$ oc get hco
NAME                      AGE
kubevirt-hyperconverged   42m
2. Delete HCO
$ oc delete hco kubevirt-hyperconverged -n openshift-cnv
hyperconverged.hco.kubevirt.io "kubevirt-hyperconverged" deleted

$ oc get apiservices v1alpha3.subresources.kubevirt.io -n openshift-cnv
Error from server (NotFound): apiservices.apiregistration.k8s.io "v1alpha3.subresources.kubevirt.io" not found

$ oc delete sub hco-operatorhub -n openshift-cnv
subscription.operators.coreos.com "hco-operatorhub" deleted

$ oc delete catsrc rh-verified-operators -n openshift-marketplace
catalogsource.operators.coreos.com "rh-verified-operators" deleted

$ oc delete csv kubevirt-hyperconverged-operator.v2.3.0 -n openshift-cnv
clusterserviceversion.operators.coreos.com "kubevirt-hyperconverged-operator.v2.3.0" deleted

$ oc delete og openshift-cnv-group -n openshift-cnv
operatorgroup.operators.coreos.com "openshift-cnv-group" deleted

$ oc delete project openshift-cnv
project.project.openshift.io "openshift-cnv" deleted

$ oc describe project openshift-cnv
Error from server (NotFound): namespaces "openshift-cnv" not found

The project can be deleted.
Moving to VERIFIED.
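
The final check in the steps above can be scripted as a poll that waits for the namespace to disappear. A minimal sketch, assuming `oc get namespace` exits non-zero once the namespace is gone (it also exits non-zero on other errors, such as lost connectivity, so this is only a rough check); the function name and the default values are hypothetical:

```shell
# Hypothetical helper: wait until namespace $1 no longer exists.
# $2 = max attempts (default 30), $3 = seconds between attempts (default 10).
# Returns 0 as soon as `oc get namespace` fails, 1 if the namespace is still
# present after all attempts.
wait_for_namespace_gone() {
    ns=$1
    attempts=${2:-30}
    interval=${3:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ! oc get namespace "$ns" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Usage (not run here):
#   oc delete project openshift-cnv
#   wait_for_namespace_gone openshift-cnv || echo "namespace still terminating"
```

A timeout here would correspond to the original symptom of this bug, where the project stayed in Terminating.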

Comment 43 errata-xmlrpc 2020-05-04 19:10:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:2011

