Bug 2008900
| Summary: | Eviction of not live migratable VMs due to virt-launcher upgrade can happen outside the upgrade window | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Simone Tiraboschi <stirabos> |
| Component: | Installation | Assignee: | Simone Tiraboschi <stirabos> |
| Status: | CLOSED ERRATA | QA Contact: | Debarati Basu-Nag <dbasunag> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 4.9.0 | CC: | cnv-qe-bugs, oramraz, pelauter, stirabos |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | hco-bundle-registry-container-v4.9.0-223 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-11-02 16:01:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Validated: by default, hco.workloadUpdateStrategy.workloadUpdateMethods now contains only "LiveMigrate":
=========================
progressTimeout: 150
workloadUpdateStrategy:
  batchEvictionInterval: 1m0s
  batchEvictionSize: 10
  workloadUpdateMethods:
  - LiveMigrate
workloads: {}
==========================
Build used:
Deployed: OCP-4.9.0-rc.4
Deployed: CNV-v4.9.0-223
Waiting on a cluster to perform upgrade test to complete validation of this bug.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104
Description of problem:
The default configuration on the HCO CR is:

  workloadUpdateStrategy:
    workloadUpdateMethods:
    - LiveMigrate
    - Evict
    batchEvictionSize: 10
    batchEvictionInterval: "1m"

This means that *during* CNV upgrades, VMs are live migrated or, if that is not possible, evicted, to ensure that every VM ends up running an up-to-date version of virt-launcher.

The issue is that virt-operator reports (through its conditions) that the upgrade is complete as soon as the upgrade of its control plane completes, while the upgrade of virt-launcher is completely asynchronous. As a result, the user can see VMs being evicted because of a CNV upgrade after the upgrade has already been reported as completed.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Check the HCO CR for the default value of workloadUpdateMethods

Actual results:
workloadUpdateMethods:
- LiveMigrate
- Evict

Expected results:
workloadUpdateMethods:
- LiveMigrate

Additional info:
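
The check in the reproduction step can be sketched in a few lines of Python. This is only an illustration, not part of HCO or KubeVirt: the helper name `disruptive_update_methods` is hypothetical, and the input is assumed to be the `spec` section of the HCO CR already parsed into a dictionary (for example from the CR's JSON output). It treats every method other than LiveMigrate as disruptive, which matches the expectation stated in this report.

```python
from typing import Any, Dict, List

# Assumption for this sketch: LiveMigrate is the only update method that
# does not interrupt a running VM.
NON_DISRUPTIVE_METHODS = {"LiveMigrate"}


def disruptive_update_methods(hco_spec: Dict[str, Any]) -> List[str]:
    """Return the configured workload update methods that can interrupt VMs.

    hco_spec is the parsed `spec` of the HCO CR (hypothetical input shape).
    A missing or empty workloadUpdateStrategy yields an empty list.
    """
    strategy = hco_spec.get("workloadUpdateStrategy") or {}
    methods = strategy.get("workloadUpdateMethods") or []
    return [m for m in methods if m not in NON_DISRUPTIVE_METHODS]


# Default reported in this bug (actual results).
old_default = {
    "workloadUpdateStrategy": {
        "workloadUpdateMethods": ["LiveMigrate", "Evict"],
        "batchEvictionSize": 10,
        "batchEvictionInterval": "1m",
    }
}

# Default after the fix (expected results).
fixed_default = {
    "workloadUpdateStrategy": {
        "workloadUpdateMethods": ["LiveMigrate"],
        "batchEvictionSize": 10,
        "batchEvictionInterval": "1m0s",
    }
}

if __name__ == "__main__":
    print(disruptive_update_methods(old_default))
    print(disruptive_update_methods(fixed_default))
```

An empty result means that, under the assumption above, no VM would be evicted by a workload update after the upgrade is reported as complete.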