Description of problem:
The default configuration on the HCO CR is:

  workloadUpdateStrategy:
    workloadUpdateMethods:
    - LiveMigrate
    - Evict
    batchEvictionSize: 10
    batchEvictionInterval: "1m"

This means that *during* CNV upgrades, VMs are live migrated or, if that is not possible, evicted, so that every VM ends up running an up-to-date version of virt-launcher.
The issue is that virt-operator reports (through its conditions) that the upgrade is complete as soon as the upgrade of its control plane completes, while the upgrade of virt-launcher is completely asynchronous.
So the user may see VMs being evicted because of a CNV upgrade after that upgrade has already been reported as completed.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Check the HCO CR for the default values of workloadUpdateMethods

Actual results:
  workloadUpdateMethods:
  - LiveMigrate
  - Evict

Expected results:
  workloadUpdateMethods:
  - LiveMigrate

Additional info:
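To give a feel for why evictions can still be seen after the upgrade is reported as complete, here is a rough sketch of how long batched eviction could keep running with the defaults above (batch size 10, interval 1m). The function name `eviction_window_seconds` and the timing model are assumptions made for illustration only: it assumes one batch starts immediately and one more batch follows per interval, which is a simplification and not a description of how virt-controller actually schedules the work.

```python
import math


def eviction_window_seconds(vm_count: int,
                            batch_size: int = 10,
                            interval_seconds: int = 60) -> int:
    """Rough estimate of how long batched eviction keeps running.

    Simplified model (an assumption, not KubeVirt's real scheduler):
    the first batch of at most `batch_size` VMs is handled at t=0,
    then one more batch every `interval_seconds` until no outdated
    VM is left.
    """
    if vm_count <= 0:
        return 0
    batches = math.ceil(vm_count / batch_size)
    # The first batch starts at t=0, so only the gaps between batches count.
    return (batches - 1) * interval_seconds


if __name__ == "__main__":
    # 95 outdated VMs -> 10 batches -> 9 intervals of 60 seconds
    print(eviction_window_seconds(95))
```

Under this model, a cluster with 95 outdated VMs would still be evicting workloads about nine minutes after the first batch, regardless of when the control plane upgrade was reported as done.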
Validated: by default, hco.workloadUpdateStrategy.workloadUpdateMethods is now "LiveMigrate".

=========================
    progressTimeout: 150
  workloadUpdateStrategy:
    batchEvictionInterval: 1m0s
    batchEvictionSize: 10
    workloadUpdateMethods:
    - LiveMigrate
  workloads: {}
==========================

Build used:
Deployed: OCP-4.9.0-rc.4
Deployed: CNV-v4.9.0-223
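The check done by hand above could be scripted along these lines. This is a minimal sketch, assuming the `spec` section of the HCO CR has already been parsed into a Python dict (for example from its JSON output); the helper name `evicts_on_update` and the two sample dicts are hypothetical and only mirror the values quoted in this bug.

```python
def evicts_on_update(hco_spec: dict) -> bool:
    """Return True if the given HCO spec allows evicting VMs on update.

    `hco_spec` is assumed to be the `spec` section of the HCO CR as a dict.
    A missing or empty workloadUpdateStrategy is treated as "no eviction".
    """
    strategy = hco_spec.get("workloadUpdateStrategy") or {}
    methods = strategy.get("workloadUpdateMethods") or []
    return "Evict" in methods


# Sample data mirroring the old default reported in this bug.
old_default = {
    "workloadUpdateStrategy": {
        "workloadUpdateMethods": ["LiveMigrate", "Evict"],
        "batchEvictionSize": 10,
        "batchEvictionInterval": "1m",
    }
}

# Sample data mirroring the validated new default.
new_default = {
    "workloadUpdateStrategy": {
        "workloadUpdateMethods": ["LiveMigrate"],
        "batchEvictionSize": 10,
        "batchEvictionInterval": "1m0s",
    }
}

if __name__ == "__main__":
    print(evicts_on_update(old_default))  # True
    print(evicts_on_update(new_default))  # False
```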
Waiting on a cluster to run the upgrade test and complete validation of this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4104