Description of problem:

1. Prepare a cluster with worker nodes with different TSC frequencies
2. Create a Windows VM with re-enlightenment enabled
3. Start the VM
4. Note it was scheduled on node N and labeled with that node's TSC frequency: tsc-frequency-2419200000
5. Stop the VM
6. So far all OK.
7. Make all nodes with that TSC frequency (2419200000) unschedulable
8. Start the VM
9. Pod fails to schedule

Because the VMI has:

  topologyHints:
    tscFrequency: 2419200000

which translates to a NodeSelector on the virt-launcher pod:

  nodeSelector:
    hyperv.node.kubevirt.io/frequencies: "true"
    hyperv.node.kubevirt.io/ipi: "true"
    hyperv.node.kubevirt.io/reenlightenment: "true"
    hyperv.node.kubevirt.io/reset: "true"
    hyperv.node.kubevirt.io/runtime: "true"
    hyperv.node.kubevirt.io/synic: "true"
    hyperv.node.kubevirt.io/synictimer: "true"
    hyperv.node.kubevirt.io/tlbflush: "true"
    hyperv.node.kubevirt.io/vpindex: "true"
    kubevirt.io/schedulable: "true"
    scheduling.node.kubevirt.io/tsc-frequency-2419200000: "true"   <--------

The only node available has a different TSC frequency:

% oc get nodes
NAME                STATUS                     ROLES                         AGE   VERSION
black.toca.local    Ready,SchedulingDisabled   worker                        9d    v1.25.7+eab9cc9
blue.toca.local     Ready,SchedulingDisabled   control-plane,master,worker   10d   v1.25.7+eab9cc9
green.toca.local    Ready,SchedulingDisabled   control-plane,master,worker   10d   v1.25.7+eab9cc9
indigo.toca.local   Ready,SchedulingDisabled   worker                        10d   v1.25.7+eab9cc9
red.toca.local      Ready,SchedulingDisabled   control-plane,master,worker   10d   v1.25.7+eab9cc9
violet.toca.local   Ready,SchedulingDisabled   worker                        10d   v1.25.7+eab9cc9
white.toca.local    Ready                      worker                        10d   v1.25.7+eab9cc9
yellow.toca.local   Ready,SchedulingDisabled   worker                        10d   v1.25.7+eab9cc9

% oc get nodes white.toca.local -o yaml | grep tsc-frequency
    cpu-timer.node.kubevirt.io/tsc-frequency: "2592000000"
    scheduling.node.kubevirt.io/tsc-frequency-2592000000: "true"

And no go:

  message: '0/8 nodes are available: 1 node(s) didn''t match Pod''s node
    affinity/selector, 7 node(s) were unschedulable. preemption: 0/8 nodes are
    available: 8 Preemption is not helpful for scheduling.'

When starting from cold, this constraint should not be necessary. The selector apparently comes from the TopologyHinter, and it effectively pins fresh VM starts to hosts with exactly the same TSC frequency as the hosts the VMs ran on previously. (A sketch of the hint-to-selector translation follows the Expected results below.)

Version-Release number of selected component (if applicable):
OCP 4.12.9
CNV 4.12.2

How reproducible:
Always

Steps to Reproduce:
As above

Actual results:
- VM fails to schedule on all nodes

Expected results:
- Fresh VM start can schedule on all nodes
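For illustration, here is a minimal Go sketch of how the VMI's topologyHints.tscFrequency ends up as the node-selector entry seen on the virt-launcher pod above. The label prefix is taken verbatim from this report; the function and constant names are mine, not KubeVirt's actual API:

  package main

  import "fmt"

  // tscFrequencyLabelPrefix matches the label observed on the virt-launcher
  // pod in this report; the helper names below are illustrative only.
  const tscFrequencyLabelPrefix = "scheduling.node.kubevirt.io/tsc-frequency-"

  // nodeSelectorForTSC renders a VMI's topologyHints.tscFrequency as the
  // node selector entry that restricts scheduling to nodes advertising
  // exactly that frequency.
  func nodeSelectorForTSC(frequency int64) map[string]string {
          return map[string]string{
                  fmt.Sprintf("%s%d", tscFrequencyLabelPrefix, frequency): "true",
          }
  }

  func main() {
          // Reproduces the selector from the report: only nodes labeled with
          // exactly 2419200000 Hz remain valid scheduling targets.
          fmt.Println(nodeSelectorForTSC(2419200000))
          // map[scheduling.node.kubevirt.io/tsc-frequency-2419200000:true]
  }

Because the selector requires an exact frequency match, cordoning every node with that frequency makes the pod unschedulable even though other Ready nodes exist.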
Actually, it's not the frequency of the previous host; it appears to be the lowest TSC frequency in the cluster: https://github.com/kubevirt/kubevirt/blob/main/pkg/virt-controller/watch/topology/hinter.go#L40
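For reference, a minimal sketch of that selection behavior as described above (hint the lowest advertised frequency across nodes, regardless of where the VM last ran). The node label key cpu-timer.node.kubevirt.io/tsc-frequency is taken from the node YAML in the description; the function and variable names are mine, not KubeVirt's:

  package main

  import (
          "fmt"
          "strconv"
  )

  // tscFrequencyLabel is the per-node frequency label seen in the node YAML.
  const tscFrequencyLabel = "cpu-timer.node.kubevirt.io/tsc-frequency"

  // lowestTSCFrequency mimics the behavior referenced at hinter.go#L40:
  // scan all node labels and pick the lowest advertised TSC frequency.
  func lowestTSCFrequency(nodeLabels []map[string]string) (int64, bool) {
          var lowest int64
          found := false
          for _, labels := range nodeLabels {
                  v, ok := labels[tscFrequencyLabel]
                  if !ok {
                          continue
                  }
                  freq, err := strconv.ParseInt(v, 10, 64)
                  if err != nil {
                          continue
                  }
                  if !found || freq < lowest {
                          lowest, found = freq, true
                  }
          }
          return lowest, found
  }

  func main() {
          nodes := []map[string]string{
                  {tscFrequencyLabel: "2592000000"}, // white.toca.local
                  {tscFrequencyLabel: "2419200000"}, // the cordoned nodes
          }
          if freq, ok := lowestTSCFrequency(nodes); ok {
                  fmt.Println(freq) // 2419200000 -- matches only cordoned nodes
          }
  }

If the hinter indeed picks the cluster-wide minimum, that explains why the fresh start still demanded 2419200000 even though the VM could have been re-enlightened on white.toca.local.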
May be of interest: https://listman.redhat.com/archives/libvir-list/2020-November/msg00519.html