Description of problem:
A newly updated 4.10 cluster is reporting infrastructureTopology as SingleReplica, so the LiveMigration feature gate was disabled. However, LiveMigration is left in the config as a workloadUpdateStrategy, and virt-controller is stuck in an infinite loop of trying to migrate/update the VMs in the cluster, failing the test for the LiveMigration FG, and requeueing the update. Messages are generated at *many* per second, too fast to read the logs.

Version-Release number of selected component (if applicable):
4.10

How reproducible:
Unknown. Could be triggered by a bug in OCP 4.10.3 (not yet reported) where SingleReplica is set despite having 2 workers.

Steps to Reproduce:
Unsure, this was a complicated upgrade situation, but could try:
1. Create a 2-node cluster in 4.9
2. Install OCP Virtualization 4.9
3. Install a VM (Fedora, or whichever)
4. Upgrade the cluster to 4.10
5. Upgrade CNV to 4.10

Actual results:
CNV disables the LiveMigration FG but enables the LiveMigration workloadUpdateStrategy

Expected results:
CNV disables the LiveMigration FG AND disables the LiveMigration workloadUpdateStrategy

Additional info:
Must-gather output:
ClusterID: e2e1b54a-3aa5-4086-9d27-17f085c01290
ClusterVersion: Stable at "4.10.3"
ClusterOperators: All healthy and stable
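
To make the expected result concrete, here is a minimal sketch of the consistency rule being asked for: if the LiveMigration feature gate is absent, the live-migration update method should be dropped from the workload update strategy as well. This is not the actual HCO/KubeVirt code. The function name `reconcile_update_strategy`, the flat config layout, and the `workloadUpdateMethods` / `LiveMigrate` names are assumptions made for illustration.

```python
import copy

LIVE_MIGRATION_GATE = "LiveMigration"
# Assumed name of the live-migration update method inside the strategy.
LIVE_MIGRATE_METHOD = "LiveMigrate"


def reconcile_update_strategy(config):
    """Return a copy of config whose update strategy matches the feature gates.

    Hypothetical helper: when the LiveMigration gate is not enabled, the
    LiveMigrate method is removed so the controller has nothing to requeue.
    """
    result = copy.deepcopy(config)
    gates = result.get("featureGates", [])
    strategy = result.get("workloadUpdateStrategy", {})
    methods = strategy.get("workloadUpdateMethods")

    if LIVE_MIGRATION_GATE not in gates and methods is not None:
        kept = [m for m in methods if m != LIVE_MIGRATE_METHOD]
        if kept:
            strategy["workloadUpdateMethods"] = kept
        else:
            # Nothing left to do: drop the key entirely.
            strategy.pop("workloadUpdateMethods")
    return result


# The inconsistent state from this report: gate off, method still configured.
broken = {
    "featureGates": ["DataVolumes", "SRIOV"],
    "workloadUpdateStrategy": {
        "batchEvictionInterval": "1m0s",
        "batchEvictionSize": 10,
        "workloadUpdateMethods": ["LiveMigrate"],
    },
}
fixed = reconcile_update_strategy(broken)
print(fixed["workloadUpdateStrategy"])
```

With the gate disabled, the resulting strategy keeps only the batch eviction settings; with the gate enabled, the input is returned unchanged.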
Verified against 4.10.0-36:
===========================
Both the LiveMigration FG and the workloadUpdateStrategy are disabled:
"featureGates": [
  "DataVolumes",
  "SRIOV",
  "CPUManager",
  "CPUNodeDiscovery",
  "Snapshot",
  "HotplugVolumes",
  "ExpandDisks",
  "GPU",
  "HostDevices",
  "DownwardMetrics",
  "NUMA",
  "WithHostModelCPU",
  "HypervStrictCheck"
]
===========================
"workloadUpdateStrategy": {
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10
}
Updating HCO to enable/disable the workload update strategy did not have any effect.
We reverted this fix, as per https://bugzilla.redhat.com/show_bug.cgi?id=2073880, because it was creating other issues.