Created attachment 1763858
Description of problem:
If a VM still exists in the cluster, deleting the HCO gets stuck: clicking the Delete button does nothing, and the delete dialog stays open.
This is a regression from OCP 4.6. On OCP 4.6, a proper error pops up, like the one below:
An error occurred
admission webhook "validate-hco.kubevirt.io" denied the request: admission webhook "kubevirt-validator.kubevirt.io" denied the request: Rejecting the uninstall request, since there are still Virtual Machines present. Either delete all KubeVirt related workloads or change the uninstall strategy before uninstalling KubeVirt.
Version-Release number of selected component (if applicable):
OCP 4.7 and OCP 4.8
Steps to Reproduce:
1. Have a VM in the cluster
2. Go to Operators -> Installed Operators
3. Select 'OpenShift Virtualization' in the openshift-cnv namespace
4. Select 'OpenShift Virtualization Deployment'
5. Delete 'kubevirt-hyperconverged'
Actual results:
The delete gets stuck.
Expected results:
A proper error message is shown.
This bug was originally filed under kubevirt-console-plugin, but the missing error message happens in the OLM-console-plugin. Maybe it is the actual operator, HCO, that does not respond with the correct error message?
If OLM can't help here, please refer this bug to the correct component.
This is a UI bug, not HCO or OLM.
The expected error message is shown when performing the CR deletion from the CLI (oc delete hco kubevirt-hyperconverged -n openshift-cnv), and the thrown error message, originating from the validating webhook, is:
Error from server (admission webhook "kubevirt-validator.kubevirt.io" denied the request: Rejecting the uninstall request, since there are still Virtual Machines present. Either delete all KubeVirt related workloads or change the uninstall strategy before uninstalling KubeVirt.): admission webhook "validate-hco.kubevirt.io" denied the request: admission webhook "kubevirt-validator.kubevirt.io" denied the request: Rejecting the uninstall request, since there are still Virtual Machines present. Either delete all KubeVirt related workloads or change the uninstall strategy before uninstalling KubeVirt.
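For readers comparing the CLI output with what a UI would need to display, here is a minimal sketch of how the nested denial text could be split into the chain of webhooks and the innermost reason. It assumes only the `admission webhook "<name>" denied the request: ` prefix seen in the messages quoted in this report; `split_webhook_denial` is a hypothetical helper, not part of `oc` or the console.

```python
import re
from typing import List, Tuple

# Prefix as it appears in the error messages quoted in this report.
_DENIAL_PREFIX = re.compile(r'admission webhook "([^"]+)" denied the request: ')


def split_webhook_denial(message: str) -> Tuple[List[str], str]:
    """Return (webhook names in order, innermost reason).

    Hypothetical helper: strips every leading denial prefix. A message
    without such a prefix comes back unchanged with an empty list.
    """
    webhooks: List[str] = []
    rest = message
    while True:
        match = _DENIAL_PREFIX.match(rest)
        if match is None:
            break
        webhooks.append(match.group(1))
        rest = rest[match.end():]
    return webhooks, rest


if __name__ == "__main__":
    msg = (
        'admission webhook "validate-hco.kubevirt.io" denied the request: '
        'admission webhook "kubevirt-validator.kubevirt.io" denied the request: '
        "Rejecting the uninstall request, since there are still Virtual Machines present."
    )
    hooks, reason = split_webhook_denial(msg)
    print(hooks)
    print(reason)
```

The sketch only handles the part of the message after `Error from server (...): `; the parenthesised copy of the reason that the CLI prints is left out on purpose.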
The same error message should be displayed when performing the same action in the UI. Instead, the red "Delete" button is stuck in the "pressed" state.
Note: the error message is displayed in the browser's developer console (F12) when clicking the Delete button.
In OCP 4.6, that message appeared on the UI, as expected. See the screenshot Guohua attached.
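To make the symptom concrete, the following is a toy model of the expected dialog behaviour: a rejected delete request should clear the "pressed" state and put the error text into the dialog rather than leaving it only in the browser console. This is an illustrative sketch, not the console's code; `DeleteDialog`, its fields, and `confirm` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DeleteDialog:
    """Toy model of a delete confirmation dialog (hypothetical, not the real UI)."""

    in_progress: bool = False            # True while the Delete button is "pressed"
    error_message: Optional[str] = None  # text the dialog would render, if any
    closed: bool = False                 # True once the dialog has been dismissed

    def confirm(self, delete_request: Callable[[], None]) -> None:
        self.in_progress = True
        self.error_message = None
        try:
            delete_request()
        except Exception as err:
            # Surface the rejection in the dialog instead of only logging it.
            self.error_message = str(err)
        else:
            # Close the dialog only when the request went through.
            self.closed = True
        finally:
            # Always leave the "pressed" state, on success and on failure.
            self.in_progress = False


if __name__ == "__main__":
    def denied() -> None:
        raise RuntimeError("Rejecting the uninstall request")

    dialog = DeleteDialog()
    dialog.confirm(denied)
    print(dialog)
```

The behaviour reported here corresponds to a dialog where the failure path never resets `in_progress` and never sets `error_message`.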
@Oren thanks, do you know what the correct component for this bug is?
The kubevirt-plugin UI does not cover this part of the code, so this UI may come from the HCO UI? Or is it a generic UI that OLM generates from the HCO definitions?
Hi, moving to Management Console per comment#2 and comment#3.
Not sure if this is the correct component; if you can't help, please move it to the correct component.
I think the issue is indeed with the general management console, since it occurs both in the "Installed Operators" page --> the CR tab, and in Administration --> CustomResourceDefinitions --> find the relevant CRD --> Instances.
I didn't check, but I assume the issue does not affect only CNV, but every resource that is protected by a validating webhook that denies the deletion request for that resource.
Rebecca, could you please check whether the issue is somehow related to your change in https://github.com/openshift/console/pull/6887
I doubt it at first look, but let's check it first. If not, please assign it back to me.
That's a weird one. I'll look into it!
Hi @gouyang - I've been trying to recreate this. Are there other steps required that weren't outlined in the ticket? The delete button is working just fine for me. I'm trying to figure out if there's a step I'm missing! Thanks.
Hi @ralpert, the reproducer for this bug is exactly what @gouyang outlined:
1. Install CNV 2.6.0 from production on OCP 4.7
2. Create a Virtual Machine
3. Delete the HyperConverged Custom Resource named "kubevirt-hyperconverged" in namespace "openshift-cnv" via the UI.
Result: the red Delete button is stuck. The error message is shown in the browser's developer mode.
Please see the attached screen recording:
Oren, thanks for the video.
@ralpert, maybe there are some differences between your cluster and CNV QE's cluster that prevent you from reproducing the issue.
I reproduced the issue on a QuickLab cluster, which is a different infrastructure than CNV QE (PSI).
Thanks @gouyang and @ocohen!
I'm going to talk to my team and see if anyone has any ideas what may be different between the clusters.
Hi @gouyang and @ocohen - I spoke to folks on my team and it sounds like we may need a cluster with the problem to debug this issue. Would you be able to provide us with one so we can knock this out? Thanks so much!
I'll provide you the details for accessing the cluster shortly, over a private channel.
Steps to verify:
1. Log in to the console, go to OperatorHub and install the "OpenShift Virtualization" operator
2. Once installed, go to the operator details, open OpenShift Virtualization Deployment and create a HyperConverged.
3. After the HyperConverged is created, go to Workloads > Virtualization and create a VM
4. While the VM is running, go to Installed Operators and try to delete the HyperConverged.
Now the error is displayed; a screenshot is attached.
Created attachment 1779178
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.