Description
-----------

Fast-forward upgrade from OSP-13 (RHEL-7.9) to OSP-16.2 (RHEL-8.3) fails[1] during live migration with:

    [...]
    libvirt.libvirtError: operation failed: guest CPU doesn't match specification: missing features: hle,rtm

The failure occurs because RHEL-8.3 (the destination host) disables Intel TSX by default, and disabling TSX removes the 'hle' and 'rtm' CPU features.

This was discovered during OSP fast-forward upgrade testing[+], where a guest was being live-migrated from RHEL-7.9 (TSX=on) to RHEL-8.3 (breaking change: TSX=off) and the migration failed with the error above.

[+] https://bugzilla.redhat.com/show_bug.cgi?id=1921070#c14
    "Live migration during OSP16.2 hybrid state from RHEL7.9 to RHEL8.3 not working"

Why?
----

The RHEL-8.3 kernel disables Intel TSX by default because it is considered a potential security risk:

    https://bugzilla.redhat.com/show_bug.cgi?id=1828642
    "kernel: Disable Intel TSX by default on newer CPUs"

Still, it is not acceptable for the RHEL-8.3 kernel to break user-space in a minor RHEL release. (See also: https://bugzilla.redhat.com/show_bug.cgi?id=1921070#c16)

Workaround for OSP upgrades
---------------------------

This is unpalatable, but unfortunately there's no other option currently:

(1) have a TripleO config attribute that will enable TSX on the destination RHEL-8.3 host; set the following in /etc/default/grub:

        GRUB_CMDLINE_LINUX_DEFAULT="[...] tsx=on"

    ... and reboot the 8.3 host;

(2) live-migrate the guests from RHEL-7.9 to the RHEL-8.3 host;

(3) turn TSX off again on the RHEL-8.3 host kernel command line, and shut down the guests;

(4) reboot the 8.3 host again, and start the guests.
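Before and after each reboot in the workaround above, it can help to confirm whether the host actually exposes the two features named in the libvirt error. The following is a minimal sketch, assuming the "flags" line in /proc/cpuinfo reflects the kernel's current TSX setting; the function name has_tsx_flags is hypothetical and not part of any existing tool.

```shell
#!/bin/sh
# has_tsx_flags [FILE]: succeed only if the first "flags" line in FILE
# lists both 'hle' and 'rtm'. FILE defaults to /proc/cpuinfo.
# (Hypothetical helper, for illustration only.)
has_tsx_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first CPU's flags line; fail if there is none.
    flags=$(grep -m1 '^flags' "$cpuinfo") || return 1
    # Pad with spaces so each flag is matched as a whole word.
    case " $flags " in *" hle "*) ;; *) return 1 ;; esac
    case " $flags " in *" rtm "*) ;; *) return 1 ;; esac
    return 0
}
```

On a host booted with tsx=on, calling has_tsx_flags with no argument would be expected to succeed; on a default RHEL-8.3 boot it would be expected to fail. This only inspects the host, not the CPU model that libvirt presents to a guest.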
Hi,

I've tested the tsx=on flag during an update from 16.1 to 16.2, following https://access.redhat.com/node/6036141/, and this fails; see [1]. The compute node is rebooted during the update because of tripleo-ansible/.../kernelargs.yaml [2].

The workaround is to run the following on every compute node before the update:

    echo "#TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS" |sudo tee -a /etc/default/grub

I've updated the KB accordingly in [3], but it still needs to be reviewed and published.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1975240
[2] https://opendev.org/openstack/tripleo-ansible/src/branch/stable/train/tripleo_ansible/roles/tripleo-kernel/tasks/kernelargs.yml#L89-L103
[3] https://access.redhat.com/node/6036141/draft
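Because the command above appends unconditionally, running it twice leaves two marker lines in the file. A minimal sketch of an idempotent variant follows, assuming one copy of the marker is enough; that assumption should be checked against the task in [2]. The function name ensure_marker is hypothetical, and the function must run as root when used on the real /etc/default/grub.

```shell
#!/bin/sh
# ensure_marker [FILE]: append the marker line to FILE only if no
# identical line is already present. FILE defaults to /etc/default/grub.
# (Hypothetical helper; run as root for the real grub defaults file.)
ensure_marker() {
    grub_file="${1:-/etc/default/grub}"
    marker='#TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS'
    # -x: match the whole line, -F: fixed string, -q: no output.
    if ! grep -qxF "$marker" "$grub_file"; then
        printf '%s\n' "$marker" >> "$grub_file"
    fi
}
```

Calling ensure_marker repeatedly leaves exactly one marker line and does not touch the rest of the file.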
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2021:3483