Description of problem:

Raising this bug to track the changes around node cordon, which currently triggers migration.

1) We currently trigger evacuation on cordon too. See https://bugzilla.redhat.com/show_bug.cgi?id=1740137

2) The kubevirt-config cm has the taint key below.

$ oc get cm kubevirt-config -n openshift-cnv -o yaml | grep migration
  migrations: '{"nodeDrainTaintKey" : "node.kubernetes.io/unschedulable"}'

With the changes around the Eviction Webhook, a node drain should now trigger a migration.

Version-Release number of selected component (if applicable):
CNV-2.6

How reproducible:

Steps to Reproduce:
1. "oc adm cordon <node-name>" (yes, cordon, not drain)
2. migrations: '{"nodeDrainTaintKey" : "node.kubernetes.io/unschedulable"}' in kubevirt-config cm
3.

Actual results:
migrations: '{"nodeDrainTaintKey" : "node.kubernetes.io/unschedulable"}'
Due to the above taint key, cordoning a node currently triggers VMI migration.

Expected results:
We may want to update the kubevirt-config cm and drop the taint key.
Cordoning a node should not trigger VMI migration.

Additional info:
Probably we just want to revert the changes introduced by bug https://bugzilla.redhat.com/show_bug.cgi?id=1740137, since we now have the new "Eviction Webhook".
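The condition described in this report can be checked mechanically. The following is a minimal Python sketch, not part of any shipped tooling: the function name and the input shape (the ConfigMap's `data` mapping as a dict) are hypothetical, and it assumes the `migrations` value is JSON, as in the output quoted in the description.

```python
import json

# Taint key that Kubernetes places on a node marked unschedulable (cordoned).
CORDON_TAINT_KEY = "node.kubernetes.io/unschedulable"


def cordon_triggers_evacuation(cm_data: dict) -> bool:
    """Return True if the configured drain taint key equals the cordon taint key.

    cm_data is the `data` mapping of the kubevirt-config ConfigMap
    (hypothetical input shape, used here only for illustration).
    """
    raw = cm_data.get("migrations")
    if not raw:
        # No migrations entry, so the cordon taint key is not configured here.
        return False
    settings = json.loads(raw)
    return settings.get("nodeDrainTaintKey") == CORDON_TAINT_KEY


if __name__ == "__main__":
    reported = {
        "migrations": '{"nodeDrainTaintKey" : "node.kubernetes.io/unschedulable"}'
    }
    print(cordon_triggers_evacuation(reported))
    print(cordon_triggers_evacuation({}))
```

With the value reported here the check returns True; once the `migrations` entry is dropped from the ConfigMap it returns False.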
I can reproduce this.

Verified with build hco-bundle-registry:v2.5.0-405.

Steps:
1. Create a VM, running on a node.

2. Cordon the node:
$ oc adm cordon <node-name>

3. Check node status:
zpeng-ocp46-gs6cv-worker-0-wnx4w   Ready,SchedulingDisabled   worker   2d15h   v1.19.0+d59ce34

4. Check the VM pod:
$ oc get pods
NAME                            READY   STATUS    RESTARTS   AGE
virt-launcher-vm-fedora-8dz6x   1/1     Running   0          38m

No migration started.

5. Check the config map:
$ oc get cm kubevirt-config -n openshift-cnv -o yaml | grep migration
(no output)

Moving to verified.
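For scripting the node status check, a small Python sketch follows. The helper name is hypothetical, and it assumes the row comes from `oc get nodes` with the default column order (NAME, STATUS, ROLES, AGE, VERSION), as in the line quoted in this comment.

```python
def is_cordoned(node_row: str) -> bool:
    """Return True if the STATUS column of an `oc get nodes` row lists SchedulingDisabled.

    Assumes the default column order, where STATUS is the second column.
    """
    fields = node_row.split()
    if len(fields) < 2:
        return False
    # STATUS is a comma-separated list such as "Ready,SchedulingDisabled".
    return "SchedulingDisabled" in fields[1].split(",")


if __name__ == "__main__":
    row = "zpeng-ocp46-gs6cv-worker-0-wnx4w Ready,SchedulingDisabled worker 2d15h v1.19.0+d59ce34"
    print(is_cordoned(row))
```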
Reopened on hco-bundle-registry-container-v2.5.0-427: Node cordon triggers VMI migration

wind-template-node-cordon-and-drain-1605018436-7155707   61s   Running   10.131.1.65   cnv-qe-04.cnvqe.lab.eng.rdu2.redhat.com
wind-template-node-cordon-and-drain-1605018436-7155707   64s   Running   10.131.1.65   cnv-qe-05.cnvqe.lab.eng.rdu2.redhat.com

$ oc get pod -n virt-migration-and-maintenance-test-node-maintenance -owide
NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE                                      NOMINATED NODE   READINESS GATES
virt-launcher-wind-template-node-cordon-and-drain-160501842dsq6   0/1     Completed   0          6m9s    10.128.3.251   cnv-qe-05.cnvqe.lab.eng.rdu2.redhat.com   <none>           <none>
virt-launcher-wind-template-node-cordon-and-drain-16050184ccwk2   1/1     Running     0          5m51s   10.131.1.67    cnv-qe-04.cnvqe.lab.eng.rdu2.redhat.com   <none>           <none>
virt-launcher-wind-template-node-cordon-and-drain-16050184d958x   0/1     Completed   0          5m59s   10.129.2.96    cnv-qe-06.cnvqe.lab.eng.rdu2.redhat.com   <none>           <none>

$ oc get virtualmachineinstancemigration -n virt-migration-and-maintenance-test-node-maintenance
NAME                        AGE
kubevirt-evacuation-5ppcr   19s
kubevirt-evacuation-w4lpq   37s

$ oc describe virtualmachineinstancemigration -n virt-migration-and-maintenance-test-node-maintenance kubevirt-evacuation-5ppcr
Name:         kubevirt-evacuation-5ppcr
Namespace:    virt-migration-and-maintenance-test-node-maintenance
Labels:       <none>
Annotations:  kubevirt.io/latest-observed-api-version: v1alpha3
              kubevirt.io/storage-observed-api-version: v1alpha3
API Version:  kubevirt.io/v1alpha3
Kind:         VirtualMachineInstanceMigration
Metadata:
  Creation Timestamp:  2020-11-10T14:28:30Z
  Generate Name:       kubevirt-evacuation-
  Generation:          1
  Managed Fields:
    API Version:  kubevirt.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubevirt.io/latest-observed-api-version:
          f:kubevirt.io/storage-observed-api-version:
        f:generateName:
      f:spec:
        .:
        f:vmiName:
      f:status:
        .:
        f:phase:
    Manager:         virt-controller
    Operation:       Update
    Time:            2020-11-10T14:28:39Z
  Resource Version:  37886083
  Self Link:         /apis/kubevirt.io/v1alpha3/namespaces/virt-migration-and-maintenance-test-node-maintenance/virtualmachineinstancemigrations/kubevirt-evacuation-5ppcr
  UID:               08df6e27-f3a0-4225-8c38-59d1cf37db2f
Spec:
  Vmi Name:  wind-template-node-cordon-and-drain-1605018436-7155707
Status:
  Phase:  Succeeded
Events:
  Type    Reason               Age   From                       Message
  ----    ------               ----  ----                       -------
  Normal  SuccessfulCreate     36s   virtualmachine-controller  Created migration target pod virt-launcher-wind-template-node-cordon-and-drain-16050184ccwk2
  Normal  SuccessfulHandOver   31s   virtualmachine-controller  Migration target pod is ready for preparation by virt-handler.
  Normal  SuccessfulMigration  27s   virtualmachine-controller  Source node reported migration succeeded

$ oc describe virtualmachineinstancemigration -n virt-migration-and-maintenance-test-node-maintenance kubevirt-evacuation-w4lpq
Name:         kubevirt-evacuation-w4lpq
Namespace:    virt-migration-and-maintenance-test-node-maintenance
Labels:       <none>
Annotations:  kubevirt.io/latest-observed-api-version: v1alpha3
              kubevirt.io/storage-observed-api-version: v1alpha3
API Version:  kubevirt.io/v1alpha3
Kind:         VirtualMachineInstanceMigration
Metadata:
  Creation Timestamp:  2020-11-10T14:28:12Z
  Generate Name:       kubevirt-evacuation-
  Generation:          1
  Managed Fields:
    API Version:  kubevirt.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubevirt.io/latest-observed-api-version:
          f:kubevirt.io/storage-observed-api-version:
        f:generateName:
      f:spec:
        .:
        f:vmiName:
      f:status:
        .:
        f:phase:
    Manager:         virt-controller
    Operation:       Update
    Time:            2020-11-10T14:28:21Z
  Resource Version:  37885290
  Self Link:         /apis/kubevirt.io/v1alpha3/namespaces/virt-migration-and-maintenance-test-node-maintenance/virtualmachineinstancemigrations/kubevirt-evacuation-w4lpq
  UID:               dfda2310-b0c5-431b-a70b-22eae49fb6de
Spec:
  Vmi Name:  wind-template-node-cordon-and-drain-1605018436-7155707
Status:
  Phase:  Succeeded
Events:
  Type    Reason               Age   From                       Message
  ----    ------               ----  ----                       -------
  Normal  SuccessfulCreate     57s   virtualmachine-controller  Created migration target pod virt-launcher-wind-template-node-cordon-and-drain-160501842dsq6
  Normal  SuccessfulHandOver   52s   virtualmachine-controller  Migration target pod is ready for preparation by virt-handler.
  Normal  SuccessfulMigration  48s   virtualmachine-controller  Source node reported migration succeeded

$ oc get cm -n openshift-cnv kubevirt-config -oyaml
apiVersion: v1
data:
  default-network-interface: masquerade
  feature-gates: DataVolumes,SRIOV,LiveMigration,CPUManager,CPUNodeDiscovery,Sidecar,Snapshot
  machine-type: pc-q35-rhel8.2.0
  selinuxLauncherType: virt_launcher.process
  smbios: |-
    Family: Red Hat
    Product: Container-native virtualization
    Manufacturer: Red Hat
    Sku: 2.5.0
    Version: 2.5.0
kind: ConfigMap
metadata:
  creationTimestamp: "2020-11-05T22:23:53Z"
  labels:
    app: kubevirt-hyperconverged
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:default-network-interface: {}
        f:feature-gates: {}
        f:selinuxLauncherType: {}
        f:smbios: {}
      f:metadata:
        f:labels:
          .: {}
          f:app: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"8b934e34-7628-492b-bf9e-e4d85debb3ad"}:
            .: {}
            f:apiVersion: {}
            f:blockOwnerDeletion: {}
            f:controller: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
    manager: hyperconverged-cluster-operator
    operation: Update
    time: "2020-11-05T22:23:53Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        f:machine-type: {}
    manager: OpenAPI-Generator
    operation: Update
    time: "2020-11-05T22:53:09Z"
  name: kubevirt-config
  namespace: openshift-cnv
  ownerReferences:
  - apiVersion: hco.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: HyperConverged
    name: kubevirt-hyperconverged
    uid: 8b934e34-7628-492b-bf9e-e4d85debb3ad
  resourceVersion: "25322787"
  selfLink: /api/v1/namespaces/openshift-cnv/configmaps/kubevirt-config
  uid: 011f6be1-3b08-46e9-b337-377617d3e460
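To spot evacuation-driven migrations in a listing like the one in this comment, one option is to filter by the `kubevirt-evacuation-` generate-name prefix seen in the describe output. This is a rough Python sketch: the helper name is hypothetical, and it assumes the input is the plain-text table printed by `oc get virtualmachineinstancemigration`, with the object name in the first column.

```python
from typing import List

# Generate-name prefix shown in the describe output for these migration objects.
EVACUATION_PREFIX = "kubevirt-evacuation-"


def evacuation_migrations(listing: str) -> List[str]:
    """Return the names from the listing that start with the evacuation prefix.

    The header row and blank lines are skipped because they do not match the prefix.
    """
    names = []
    for line in listing.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(EVACUATION_PREFIX):
            names.append(fields[0])
    return names


if __name__ == "__main__":
    sample = (
        "NAME                        AGE\n"
        "kubevirt-evacuation-5ppcr   19s\n"
        "kubevirt-evacuation-w4lpq   37s\n"
    )
    print(evacuation_migrations(sample))
```

An empty result after a cordon would match the expected behavior from the description. Leftover migration objects from earlier runs would still show up in such a listing, so the result is only meaningful on a clean namespace.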
Updated the fixed-in version to include a specific deployment of HCO.
Apologies, Comment #6 was incomplete. Ruth, can you please verify that you were using hyperconverged-cluster-operator-container-v2.5.0-52 or newer when you observed this?
Moving to verified; will fix the automation test (due to bug 1888790, there are leftover migration jobs)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 2.5.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:5127