Bug 1900631
Summary: | [CNV 2.4.3] oc vm delete doesn't complete sometimes | | |
---|---|---|---|
Product: | Container Native Virtualization (CNV) | Reporter: | Benjamin Schmaus <bschmaus> |
Component: | Virtualization | Assignee: | aschuett <aschuett> |
Status: | CLOSED ERRATA | QA Contact: | Israel Pinto <ipinto> |
Severity: | high | Docs Contact: | |
Priority: | urgent | | |
Version: | 2.4.3 | CC: | bschmaus, cnv-qe-bugs, fdeutsch, kbidarka, sgott, spuranam, zpeng |
Target Milestone: | --- | | |
Target Release: | 4.8.1 | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | hco-bundle-registry-container-v4.8.1-14, virt-operator-container-v4.8.1-2 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-08-24 12:48:59 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Benjamin Schmaus
2020-11-23 12:57:14 UTC
Is it possible to get some more information? In particular, must-gather output would give us some more context here. Without any other context, this sounds like it might be a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1896387

That BZ is likely due to an issue in CRI-O which is being addressed in CNV 4.7. Thus I'm deferring this pending a fix in OCP.

To complete the loop here, https://bugzilla.redhat.com/show_bug.cgi?id=1896387#c8 mentioned this BZ https://bugzilla.redhat.com/show_bug.cgi?id=1883991 which is what I was referring to in Comment #6.

I have been able to reproduce this in 4.7.3 with a Windows guest VM. Start a Windows 2019 VM and then try to stop the VM from the OCP console: it seems to hang. If I repeat the same steps but shut down from inside the Windows 2019 VM, and then stop it in the OCP console before the guest has finished shutting down, it stops properly.

Given my statement above, do we believe this is still related to CRI-O as indicated in comment 6?

Ben, there exists a BZ where the reporter created a Windows VM and then deleted it immediately: https://bugzilla.redhat.com/show_bug.cgi?id=1933043 In some cases, this causes graceful shutdown to fail, at which point the VMI waits for terminationGracePeriodSeconds to elapse before it is deleted. This is especially noticeable on Windows because the grace period is quite long (to ensure we don't break Windows updates). Does this appear similar to what you're experiencing?

What were the TerminationGracePeriodSeconds for those that are able to terminate immediately vs those that hang?

It seems that for ephemeral VMs they used 60 seconds, and otherwise 3600 seconds for VMs that might get created but stay up a while.

Verified with build HCO:[v4.8.1-18].
Steps:
1. Create 50 VMs with the same DV source.
2. Start all VMs and wait until all VMs are in running status.
3. Destroy all VMs and check VM and VMI status.
Result: all VMs and VMIs were deleted.
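
The grace-period question raised in the comments above can be checked mechanically. The following is a minimal Python sketch, not part of the original report, that reads `spec.terminationGracePeriodSeconds` from VMI manifests (already parsed into dictionaries) and lists the VMIs whose deletion may appear to hang if graceful shutdown fails. The VM names, the helper function names, and the 300-second threshold are hypothetical and chosen only for illustration; the 60 and 3600 second values are the ones mentioned in the comments.

```python
from typing import Any, Dict, List, Optional

# Assumption: an arbitrary illustrative cutoff, not a KubeVirt or CNV default.
LONG_GRACE_THRESHOLD_SECONDS = 300


def grace_period_seconds(vmi: Dict[str, Any]) -> Optional[int]:
    """Return spec.terminationGracePeriodSeconds from a VMI manifest, or None if unset."""
    return vmi.get("spec", {}).get("terminationGracePeriodSeconds")


def slow_to_delete(
    vmis: List[Dict[str, Any]],
    threshold: int = LONG_GRACE_THRESHOLD_SECONDS,
) -> List[str]:
    """Return names of VMIs whose grace period is longer than the threshold.

    If graceful shutdown fails, such a VMI can stay around for up to its
    grace period after a delete request, which can look like a hang.
    """
    names = []
    for vmi in vmis:
        grace = grace_period_seconds(vmi)
        if grace is not None and grace > threshold:
            names.append(vmi["metadata"]["name"])
    return names


if __name__ == "__main__":
    # Hypothetical manifests using the two grace periods quoted in the comments.
    sample = [
        {"metadata": {"name": "ephemeral-vm"},
         "spec": {"terminationGracePeriodSeconds": 60}},
        {"metadata": {"name": "win2019-vm"},
         "spec": {"terminationGracePeriodSeconds": 3600}},
    ]
    print(slow_to_delete(sample))
```

On a live cluster the input could be the `items` list from `oc get vmi -o json`, assuming that output has been loaded with `json.load`.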
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.8.1 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3259