Bug 1973227

Summary: segfault in virt-controller during pdb deletion
Product: Container Native Virtualization (CNV) Reporter: David Vossel <dvossel>
Component: Virtualization Assignee: David Vossel <dvossel>
Status: CLOSED ERRATA QA Contact: Israel Pinto <ipinto>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 2.6.5 CC: cnv-qe-bugs, fdeutsch, sgott, zpeng
Target Milestone: ---   
Target Release: 2.6.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-operator-container-v2.6.6-5 hco-bundle-registry-container-v2.6.6-34 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-10 17:33:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Vossel 2021-06-17 13:15:11 UTC
Description of problem:

virt-controller crashes (segfault) during PodDisruptionBudget (pdb) deletion if the vmi no longer exists.
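
For illustration only, here is a minimal Go sketch of the general failure class: a pointer to an already-deleted object is dereferenced while handling cleanup. It is not the KubeVirt code or the actual fix; the `VMI` type and the `pdbDeletionMessage` function are hypothetical names invented for this example.

```go
package main

import "fmt"

// VMI is a hypothetical stand-in for a virtual machine instance object.
// It is NOT KubeVirt's real type.
type VMI struct {
	Name string
}

// pdbDeletionMessage builds the message reported when a pdb is deleted.
// The owning VMI may already be gone (for example, after its finalizers
// were stripped during a force delete), so vmi can be nil here.
func pdbDeletionMessage(pdbName string, vmi *VMI) string {
	// Guard the dereference: reading vmi.Name when vmi == nil would panic
	// with a nil pointer dereference.
	if vmi == nil {
		return fmt.Sprintf("deleted pdb %s for a vmi that no longer exists", pdbName)
	}
	return fmt.Sprintf("deleted pdb %s for vmi %s", pdbName, vmi.Name)
}

func main() {
	// The VMI was already removed, so only the pdb is left to clean up.
	fmt.Println(pdbDeletionMessage("pdb-a", nil))
	fmt.Println(pdbDeletionMessage("pdb-b", &VMI{Name: "rhel84"}))
}
```

Without the nil check, the first call in `main` would crash the process, which is the kind of behavior this bug describes.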


How reproducible:

random

Steps to Reproduce:
1. Force delete a VMI that has EvictionStrategy: LiveMigrate set, and remove the VMI's finalizers during deletion.

This can cause virt-controller to crash.

Comment 1 David Vossel 2021-06-17 13:15:37 UTC
pr posted upstream. https://github.com/kubevirt/kubevirt/pull/5849

Comment 2 David Vossel 2021-06-23 12:28:21 UTC
pr merged into main. I'll need to create manual backports

Comment 3 David Vossel 2021-06-23 15:53:36 UTC
backports created, still waiting on them to be merged

pr for 0.41 - https://github.com/kubevirt/kubevirt/pull/5917
pr for 0.36 - https://github.com/kubevirt/kubevirt/pull/5919
pr for 0.34 - https://github.com/kubevirt/kubevirt/pull/5921

Comment 7 sgott 2021-07-06 20:37:09 UTC
Steps to verify:

Force delete a VMI that has EvictionStrategy: LiveMigrate set, and remove the VMI's finalizers during deletion.
Observe that virt-controller does not crash.

Comment 8 zhe peng 2021-07-21 08:13:49 UTC
verify with build:
virt-operator-container-v2.6.6-5
hco-bundle-registry-container-v2.6.6-35

step:
1. create a vm with evictionStrategy: LiveMigrate
2. start vm and check vmi status
3. $ oc describe vmi rhel84
...
Metadata:
  Creation Timestamp:  2021-07-21T07:37:20Z
  Finalizers:
    foregroundDeleteVirtualMachine
  Generation:  9
...
  Eviction Strategy:  LiveMigrate
  Hostname:           rhel84
  Networks:
    Name:  default
...
4. force delete vmi
$ oc delete vmi rhel84 --force
5. at the same time, open another console to edit the vmi and remove the vmi's finalizer.
$ oc edit vmi rhel84
6. check virt-controller, no crash occurs
$ oc get pods -n openshift-cnv
...
virt-controller-bbbf5d87d-r6r94                       1/1     Running   170        2d15h
virt-controller-bbbf5d87d-sr55t                       1/1     Running   162        2d15h

move to verified.

Comment 13 errata-xmlrpc 2021-08-10 17:33:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.6 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3119