Created attachment 1739366 [details]
host with 8.2 os

Description of problem:
After upgrading a host to CentOS 8.3, I'm unable to migrate my guest from an 8.2 host to the newly updated 8.3 host. If I do a full power off / power on cycle of the guest, it can be migrated wherever I want.

Version-Release number of selected component (if applicable):
vdsm.x86_64 4.40.35.1-1
ovirt-engine.noarch 4.4.3.12-1
Cluster CPU type: Secure Intel Cascadelake Server Family

How reproducible:
Upgrade nodes of the cluster to CentOS 8.3

Steps to Reproduce:
1. Get a cluster (Cascadelake Server Family) with rhel 8.2 hosts
2. Upgrade one node to 8.3, reboot and activate it
3. Try to migrate a guest to it

Actual results:
Migration failed with error:
guest CPU doesn't match specification: missing features: tsx-ctrl (migration:294)

Expected results:
Migration runs fine

Additional info:
Created attachment 1739367 [details] node with 8.3 os
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
That's important because VMs that were started before the cluster settings were updated to the ones introduced in ovirt 4.4.3 cannot migrate to centos 8.3 hosts. We discussed in a separate context that we probably can't change much in the CPU settings of the VM in the destination host, but need to check if we can drop that +tsx-ctrl on VMs that migrate to a centos/rhel 8.3 or what's the alternative way to enable such a migration.
> but need to check if we can drop that +tsx-ctrl on VMs that migrate to a centos/rhel 8.3

I don't have a machine with TSX around, but I tried to remove another CPU feature in the libvirt hook on the destination (both the source and destination are 8.3) and libvirt is not happy about it:

  libvirtd[1579]: unsupported configuration: Target CPU feature count 2 does not match source 3

So this is clearly not going to work.

> or what's the alternative way to enable such a migration

We will have to find some.
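To illustrate why dropping a feature in a destination hook is rejected, here is a minimal Python sketch of the kind of comparison that produces the message above. It is not libvirt's code: `CpuFeature` and `check_cpu_abi_stable` are hypothetical names, and the real check is assumed to compare more than the feature list (model, vendor, topology and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuFeature:
    """One <feature> element of a guest CPU definition (hypothetical type)."""
    name: str
    policy: str  # e.g. "require" or "disable"


def check_cpu_abi_stable(source, target):
    """Return None if the feature lists match, else an error string.

    Simplified illustration only: the count is compared first, then each
    feature's name and policy in order.
    """
    if len(target) != len(source):
        return (f"Target CPU feature count {len(target)} "
                f"does not match source {len(source)}")
    for src, dst in zip(source, target):
        if src.name != dst.name:
            return (f"Target CPU feature '{dst.name}' "
                    f"does not match source '{src.name}'")
        if src.policy != dst.policy:
            return (f"Target CPU feature policy '{dst.policy}' "
                    f"does not match source '{src.policy}'")
    return None
```

With three features on the source and one of them removed on the destination, the sketch fails at the count comparison, before any individual feature is looked at.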
Apparently the only, but crucial, problem is tsx-ctrl feature presence. The feature doesn't make sense when TSX is disabled but it may still modify guests in some way.

As discussed with Jiří Denemark, libvirt cannot help us with the feature removal, whether the feature modifies guests under given circumstances or not. If removal of the feature while the guest is running would be harmless to the guest then QEMU could be modified not to fail on tsx-ctrl feature request when TSX is disabled (which is the case, since `rtm' and `hle' features are disabled). Arik, do we want to file a QEMU bug for requesting such a change and discussing whether it is possible?

The only other options are to instruct users either to restart the VMs or to enable TSX on the destination hosts (with all the implications regarding security and performance). Note that we have the same problem with file migrations; in theory there is an additional danger with them that a VM is suspended and once there is no 8.2 host present then the VM can no longer be resumed and must be powered off.
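For anyone who wants to find the affected VMs before upgrading, the following is a rough Python sketch that inspects the output of `virsh -r dumpxml <vm>` and reports whether the running CPU definition requires tsx-ctrl while `hle' and `rtm' are disabled. The function names are made up for illustration, and the rule itself is only an assumption based on the configuration discussed in this bug.

```python
import xml.etree.ElementTree as ET


def cpu_feature_policies(domain_xml):
    """Map feature name -> policy from a <domain> (or bare <cpu>) XML string."""
    root = ET.fromstring(domain_xml)
    cpu = root if root.tag == "cpu" else root.find("cpu")
    if cpu is None:
        return {}
    return {f.get("name"): f.get("policy") for f in cpu.findall("feature")}


def requires_tsx_ctrl_without_tsx(domain_xml):
    """True if tsx-ctrl is required although hle and rtm are both disabled.

    Assumption: this is the combination that fails to migrate in this
    report; such a VM is a candidate for a power off / power on cycle.
    """
    policies = cpu_feature_policies(domain_xml)
    tsx_disabled = all(policies.get(n) == "disable" for n in ("hle", "rtm"))
    return policies.get("tsx-ctrl") == "require" and tsx_disabled
```

The XML can be collected per VM on each 8.2 host and passed to `requires_tsx_ctrl_without_tsx`; VMs for which it returns True would need the restart workaround mentioned above.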
(In reply to Milan Zamazal from comment #5)
> As discussed with Jiří Denemark, libvirt cannot help us with the feature
> removal, whether the feature modifies guests under given circumstances or
> not. If removal of the feature while the guest is running would be harmless
> to the guest then QEMU could be modified not to fail on tsx-ctrl feature
> request when TSX is disabled (which is the case, since `rtm' and `hle'
> features are disabled). Arik, do we want to file a QEMU bug for requesting
> such a change and discussing whether it is possible?

Yes please, I think it will significantly simplify the upgrade process.

> The only other options are to instruct users either to restart the VMs or to
> enable TSX on the destination hosts (with all the implications regarding
> security and performance). Note that we have the same problem with file
> migrations; in theory there is an additional danger with them that a VM is
> suspended and once there is no 8.2 host present then the VM can no longer be
> resumed and must be powered off.

I don't think that enabling TSX on the destination hosts would be recommended - but worth documenting that issue and that restarting the VMs can solve this.
I'll create a documentation bug.
A platform bug filed: https://bugzilla.redhat.com/1912448
Test Versions:
ovirt-engine-4.4.2.6-0.2.el8ev.noarch
ovirt-engine-4.4.5.11-0.1.el8ev.noarch
rhel 8.2 host:
  - kernel-4.18.0-193.19.1.el8_2.x86_64
  - vdsm-4.40.26.3-1.el8ev.x86_64
rhel 8.3 host:
  - kernel-4.18.0-240.22.1.el8_3.x86_64
  - vdsm-4.40.50.10-1.el8ev.x86_64

Test Steps:
1. Set up 4.4.2 engine
2. Create 4.4 Data Center, add a cluster with Secure Intel Cascadelake Server Family cpu type
3. Add a rhel 8.2 host with kernel-4.18.0-193.19.1.el8_2.x86_64

   # virsh domcapabilities
   <cpu>
     <mode name='host-passthrough' supported='yes'/>
     <mode name='host-model' supported='yes'>
       <model fallback='forbid'>Cascadelake-Server</model>
       <vendor>Intel</vendor>
       <feature policy='require' name='ss'/>
       <feature policy='require' name='vmx'/>
       <feature policy='require' name='hypervisor'/>
       <feature policy='require' name='tsc_adjust'/>
       <feature policy='require' name='umip'/>
       <feature policy='require' name='pku'/>
       <feature policy='require' name='md-clear'/>
       <feature policy='require' name='stibp'/>
       <feature policy='require' name='arch-capabilities'/>
       <feature policy='require' name='xsaves'/>
       <feature policy='require' name='invtsc'/>
       <feature policy='require' name='ibpb'/>
       <feature policy='require' name='amd-ssbd'/>
       <feature policy='require' name='rdctl-no'/>
       <feature policy='require' name='ibrs-all'/>
       <feature policy='require' name='skip-l1dfl-vmentry'/>
       <feature policy='require' name='mds-no'/>
       <feature policy='require' name='pschange-mc-no'/>
       <feature policy='require' name='tsx-ctrl'/>
     </mode>

4. Create and run a VM

   # virsh -r dumpxml vm_82
   <cpu mode='custom' match='exact' check='full'>
     <model fallback='forbid'>Cascadelake-Server</model>
     <topology sockets='16' dies='1' cores='1' threads='1'/>
     <feature policy='require' name='md-clear'/>
     <feature policy='require' name='mds-no'/>
     <feature policy='disable' name='hle'/>
     <feature policy='disable' name='rtm'/>
     <feature policy='require' name='tsx-ctrl'/>
     <feature policy='require' name='arch-capabilities'/>
     <feature policy='require' name='hypervisor'/>
     <feature policy='disable' name='mpx'/>
     <feature policy='require' name='pku'/>

5. Upgrade engine to 4.4.5
6. Add a rhel 8.3 host with kernel-4.18.0-240.22.1.el8_3.x86_64

   # virsh domcapabilities
   <cpu>
     <mode name='host-passthrough' supported='yes'>
       <enum name='hostPassthroughMigratable'>
         <value>on</value>
         <value>off</value>
       </enum>
     </mode>
     <mode name='host-model' supported='yes'>
       <model fallback='forbid'>Cascadelake-Server</model>
       <vendor>Intel</vendor>
       <feature policy='require' name='ss'/>
       <feature policy='require' name='vmx'/>
       <feature policy='require' name='hypervisor'/>
       <feature policy='require' name='tsc_adjust'/>
       <feature policy='require' name='umip'/>
       <feature policy='require' name='pku'/>
       <feature policy='require' name='md-clear'/>
       <feature policy='require' name='stibp'/>
       <feature policy='require' name='arch-capabilities'/>
       <feature policy='require' name='xsaves'/>
       <feature policy='require' name='invtsc'/>
       <feature policy='require' name='ibpb'/>
       <feature policy='require' name='amd-stibp'/>
       <feature policy='require' name='amd-ssbd'/>
       <feature policy='require' name='rdctl-no'/>
       <feature policy='require' name='ibrs-all'/>
       <feature policy='require' name='skip-l1dfl-vmentry'/>
       <feature policy='require' name='mds-no'/>
       <feature policy='require' name='pschange-mc-no'/>
       <feature policy='require' name='tsx-ctrl'/>
       <feature policy='disable' name='hle'/>
       <feature policy='disable' name='rtm'/>
     </mode>

7. Migrate the VM from rhel 8.2 host to rhel 8.3 host

   2021-03-30 06:57:25,491+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-17) [5d162d86] EVENT_ID: VM_MIGRATION_DONE(63), Migration completed (VM: vm_82, Source: host_82, Destination: host_83, Duration: 3 seconds, Total: 3 seconds, Actual downtime: (N/A))

VM with Cascadelake-Server,-hle,-rtm,+tsx-ctrl cpu configuration can be migrated from rhel 8.2 host to rhel 8.3 host with kernel-4.18.0-240.22.1.el8_3.x86_64.
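
Comparing the two `virsh domcapabilities` listings by eye is error-prone, so here is a small Python sketch that diffs the host-model feature lists of two hosts. In the listings from this test, the 8.3 host additionally reports amd-stibp as required and hle and rtm as disabled. The helper names are hypothetical and the sketch assumes the complete XML output of `virsh domcapabilities` was saved for each host.

```python
import xml.etree.ElementTree as ET


def host_model_features(domcaps_xml):
    """Map feature name -> policy for the 'host-model' CPU mode."""
    root = ET.fromstring(domcaps_xml)
    for mode in root.iter("mode"):
        if mode.get("name") == "host-model":
            return {f.get("name"): f.get("policy")
                    for f in mode.findall("feature")}
    return {}


def diff_features(old, new):
    """Return (added, removed, changed) between two feature mappings."""
    added = {n: p for n, p in new.items() if n not in old}
    removed = {n: p for n, p in old.items() if n not in new}
    changed = {n: (old[n], new[n])
               for n in old.keys() & new.keys() if old[n] != new[n]}
    return added, removed, changed
```

Calling `diff_features(host_model_features(xml_82), host_model_features(xml_83))` on the two saved outputs gives the added, removed and changed features as three dictionaries.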