Bug 2092269
| Summary: | We can't migrate to a newer target node and then return to the source node when using host-model CPU | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Barak <bmordeha> |
| Component: | Virtualization | Assignee: | Barak <bmordeha> |
| Status: | CLOSED ERRATA | QA Contact: | Kedar Bidarkar <kbidarka> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.9.0 | CC: | acardace, akrgupta, cnv-qe-bugs, ibezukh, sgott |
| Target Milestone: | --- | ||
| Target Release: | 4.9.6 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | hco-bundle-registry-container-v4.9.6-26 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-09-22 08:17:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
update: backporting the fix to 0.44, 0.49, and 0.53. Deferring to the next point release, as the backport is complicated and will take some time.

Backport PRs:
https://github.com/kubevirt/kubevirt/pull/7957
https://github.com/kubevirt/kubevirt/pull/7956
https://github.com/kubevirt/kubevirt/pull/7955
https://github.com/kubevirt/kubevirt/pull/7954
https://github.com/kubevirt/kubevirt/pull/7953
https://github.com/kubevirt/kubevirt/pull/7952
https://github.com/kubevirt/kubevirt/pull/7950
https://github.com/kubevirt/kubevirt/pull/7949

Merged (currently):
https://github.com/kubevirt/kubevirt/pull/7870
https://github.com/kubevirt/kubevirt/pull/7857

Verified with
[akrgupta@fedora auth]$ for n in $(oc get node -o name | grep worker); do echo ""; echo $n; oc describe $n | grep "cpu-model.node.kubevirt.io"; done
node/virt-akr-49-pcdnh-worker-0-ft7rw
cpu-model.node.kubevirt.io/Haswell-noTSX=true
cpu-model.node.kubevirt.io/Haswell-noTSX-IBRS=true
cpu-model.node.kubevirt.io/IvyBridge=true
cpu-model.node.kubevirt.io/IvyBridge-IBRS=true
cpu-model.node.kubevirt.io/Nehalem=true
cpu-model.node.kubevirt.io/Nehalem-IBRS=true
cpu-model.node.kubevirt.io/Opteron_G1=true
cpu-model.node.kubevirt.io/Opteron_G2=true
cpu-model.node.kubevirt.io/Penryn=true
cpu-model.node.kubevirt.io/SandyBridge=true
cpu-model.node.kubevirt.io/SandyBridge-IBRS=true
cpu-model.node.kubevirt.io/Westmere=true
cpu-model.node.kubevirt.io/Westmere-IBRS=true
node/virt-akr-49-pcdnh-worker-0-ph99k
cpu-model.node.kubevirt.io/Broadwell=true
cpu-model.node.kubevirt.io/Broadwell-IBRS=true
cpu-model.node.kubevirt.io/Broadwell-noTSX=true
cpu-model.node.kubevirt.io/Broadwell-noTSX-IBRS=true
cpu-model.node.kubevirt.io/Haswell=true
cpu-model.node.kubevirt.io/Haswell-IBRS=true
cpu-model.node.kubevirt.io/Haswell-noTSX=true
cpu-model.node.kubevirt.io/Haswell-noTSX-IBRS=true
cpu-model.node.kubevirt.io/IvyBridge=true
cpu-model.node.kubevirt.io/IvyBridge-IBRS=true
cpu-model.node.kubevirt.io/Nehalem=true
cpu-model.node.kubevirt.io/Nehalem-IBRS=true
cpu-model.node.kubevirt.io/Opteron_G1=true
cpu-model.node.kubevirt.io/Opteron_G2=true
cpu-model.node.kubevirt.io/Penryn=true
cpu-model.node.kubevirt.io/SandyBridge=true
cpu-model.node.kubevirt.io/SandyBridge-IBRS=true
cpu-model.node.kubevirt.io/Skylake-Client=true
cpu-model.node.kubevirt.io/Skylake-Client-IBRS=true
cpu-model.node.kubevirt.io/Skylake-Client-noTSX-IBRS=true
cpu-model.node.kubevirt.io/Skylake-Server=true
cpu-model.node.kubevirt.io/Skylake-Server-IBRS=true
cpu-model.node.kubevirt.io/Skylake-Server-noTSX-IBRS=true
cpu-model.node.kubevirt.io/Westmere=true
cpu-model.node.kubevirt.io/Westmere-IBRS=true
node/virt-akr-49-pcdnh-worker-0-s7rqn
cpu-model.node.kubevirt.io/Broadwell=true
cpu-model.node.kubevirt.io/Broadwell-IBRS=true
cpu-model.node.kubevirt.io/Broadwell-noTSX=true
cpu-model.node.kubevirt.io/Broadwell-noTSX-IBRS=true
cpu-model.node.kubevirt.io/Haswell=true
cpu-model.node.kubevirt.io/Haswell-IBRS=true
cpu-model.node.kubevirt.io/Haswell-noTSX=true
cpu-model.node.kubevirt.io/Haswell-noTSX-IBRS=true
cpu-model.node.kubevirt.io/IvyBridge=true
cpu-model.node.kubevirt.io/IvyBridge-IBRS=true
cpu-model.node.kubevirt.io/Nehalem=true
cpu-model.node.kubevirt.io/Nehalem-IBRS=true
cpu-model.node.kubevirt.io/Opteron_G1=true
cpu-model.node.kubevirt.io/Opteron_G2=true
cpu-model.node.kubevirt.io/Penryn=true
cpu-model.node.kubevirt.io/SandyBridge=true
cpu-model.node.kubevirt.io/SandyBridge-IBRS=true
cpu-model.node.kubevirt.io/Westmere=true
cpu-model.node.kubevirt.io/Westmere-IBRS=true
[akrgupta@fedora auth]$ oc get nodes
NAME STATUS ROLES AGE VERSION
virt-akr-49-pcdnh-master-0 Ready master 3h24m v1.22.8+9e95cb9
virt-akr-49-pcdnh-master-1 Ready master 3h24m v1.22.8+9e95cb9
virt-akr-49-pcdnh-master-2 Ready master 3h23m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-ft7rw Ready worker 3h7m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-ph99k Ready,SchedulingDisabled worker 3h5m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-s7rqn Ready,SchedulingDisabled worker 3h7m v1.22.8+9e95cb9
[akrgupta@fedora auth]$ oc get vm
NAME AGE STATUS READY
vm-fedora-hostmodel 23m Stopped False
[akrgupta@fedora auth]$ cat vm_yaml | grep spec -A 5
spec:
domain:
cpu:
cores: 1
model: host-model
devices:
[akrgupta@fedora auth]$ virtctl start vm-fedora-hostmodel
VM vm-fedora-hostmodel was scheduled to start
[akrgupta@fedora auth]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-fedora-hostmodel 73s Running 10.131.0.44 virt-akr-49-pcdnh-worker-0-ft7rw True
[akrgupta@fedora auth]$ oc adm uncordon virt-akr-49-pcdnh-worker-0-ph99k
node/virt-akr-49-pcdnh-worker-0-ph99k uncordoned
[akrgupta@fedora auth]$ oc get nodes
NAME STATUS ROLES AGE VERSION
virt-akr-49-pcdnh-master-0 Ready master 3h26m v1.22.8+9e95cb9
virt-akr-49-pcdnh-master-1 Ready master 3h26m v1.22.8+9e95cb9
virt-akr-49-pcdnh-master-2 Ready master 3h25m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-ft7rw Ready worker 3h9m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-ph99k Ready worker 3h7m v1.22.8+9e95cb9
virt-akr-49-pcdnh-worker-0-s7rqn Ready,SchedulingDisabled worker 3h9m v1.22.8+9e95cb9
[akrgupta@fedora auth]$ virtctl migrate vm-fedora-hostmodel
VM vm-fedora-hostmodel was scheduled to migrate
[akrgupta@fedora auth]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-fedora-hostmodel 3m37s Running 10.129.2.65 virt-akr-49-pcdnh-worker-0-ph99k True
[akrgupta@fedora auth]$ virtctl migrate vm-fedora-hostmodel
VM vm-fedora-hostmodel was scheduled to migrate
[akrgupta@fedora auth]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-fedora-hostmodel 5m23s Running 10.131.0.46 virt-akr-49-pcdnh-worker-0-ft7rw True
We can migrate to a newer target node and then return to the source node when using host-model CPU.
(In reply to Akriti Gupta from comment #5)

Verified with v4.9.6-51; the verification steps and output quoted from comment #5 are unchanged.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.9.6 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6681
Description of problem:
- We can't migrate to a newer target node and then return to the source node when using host-model CPU.
- A VMI with host-model CPU can't migrate even when it should.

Version-Release number of selected component (if applicable):

How reproducible:
For instance:
(1) If a VMI starts with host-model CPU on node01, which doesn't have the AES feature, and then migrates to node02, which has AES, we won't be able to migrate back to node01.
(2) A VMI with host-model CPU can't migrate if the target node doesn't have the same host model, even if the target node supports the host model of the source node and has all the required features.

Steps to Reproduce (1):
1. Deploy KubeVirt in a heterogeneous cluster that has a node with a unique feature (after deploying KubeVirt, you can use the following command to see which features exist on a node: `kubectl get node <node_name> -oyaml | grep host-model-required-features`).
2. Start a VM with host-model CPU on a node without any unique feature.
3. Migrate to a node with a unique feature.
4. Try to migrate back to the initial node.

Actual results:
The migration in step 4 will fail because of a node selector in virt-launcher that shouldn't be there.

Steps to Reproduce (2):
1. Deploy KubeVirt in a heterogeneous cluster with at least two nodes with different host-model cpuModels (after deploying KubeVirt, you can use the following command to see which host model a node has: `kubectl get node <node_name> -oyaml | grep host-model-cpu.node`).
2. Start a VM with host-model CPU on a node.
3. Try to migrate it to a node that supports the source node's host-model cpuModel but has a different host model.

Actual results:
The migration will fail because of a node selector in virt-launcher.

Expected results:
The migration should succeed.

Additional info:
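The scenarios above come down to a label-matching question: a target node is a valid migration destination for a host-model VMI if it supports the source node's host model and exposes every CPU feature the guest already depends on, regardless of the target's own host model. The following is an illustrative sketch of that correct check, not KubeVirt's actual implementation; the label prefixes are inferred from the `grep` patterns in this report and from the node labels shown in the verification transcript, and the feature names are hypothetical examples.

```python
# Hypothetical label prefixes, inferred from this bug report's grep patterns
# ("host-model-required-features", "host-model-cpu.node") and from the
# "cpu-model.node.kubevirt.io/<model>=true" labels in the transcript above.
SUPPORTED_MODEL_LABEL = "cpu-model.node.kubevirt.io/"
FEATURE_LABEL = "host-model-required-features.node.kubevirt.io/"

def can_schedule(target_labels, source_model, required_features):
    """Correct check: the target must *support* the source's host model
    (not share it) and expose every feature the guest already uses.
    The bug was a virt-launcher node selector stricter than this."""
    if SUPPORTED_MODEL_LABEL + source_model not in target_labels:
        return False
    return all(FEATURE_LABEL + f in target_labels for f in required_features)

# Example: a target that supports Haswell-noTSX and the 'aes' feature.
target = {
    SUPPORTED_MODEL_LABEL + "Haswell-noTSX": "true",
    FEATURE_LABEL + "aes": "true",
}
print(can_schedule(target, "Haswell-noTSX", ["aes"]))        # True
print(can_schedule(target, "Haswell-noTSX", ["avx512f"]))    # False: feature missing
```

Under this check, scenario (1) succeeds: the guest started on node01 without AES, so AES is not in its required features, and migrating back only needs node01 to support the original host model.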