Bug 2122949 - RealTime KVM deployment fails for RHOSP17.0
Summary: RealTime KVM deployment fails for RHOSP17.0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: tripleo-ansible
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: z1
Target Release: 17.0
Assignee: Steve Baker
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Duplicates: 2139922
Depends On:
Blocks: 2122968
Reported: 2022-08-31 11:52 UTC by Ofer Blaut
Modified: 2023-09-19 04:25 UTC (History)

Fixed In Version: tripleo-ansible-3.3.1-0.20221123230736.fa5422f.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-25 12:28:51 UTC
Target Upstream Version:
Embargoed:


Attachments
Playbook to install kernel-rt and set kernel args in a single reboot (2.31 KB, text/plain), 2022-09-02 01:26 UTC, Steve Baker


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 864793 | 0 | None | master: MERGED | tripleo-ansible: Ensure $tuned_params is restored after grub2-mkconfig (I6edef63e60623ea3b51386593d76b18092dd7f30) | 2022-12-07 13:47:45 UTC
Red Hat Issue Tracker | NFV-2624 | 0 | None | None | None | 2022-08-31 12:29:22 UTC
Red Hat Issue Tracker | OSP-18484 | 0 | None | None | None | 2022-08-31 11:56:46 UTC
Red Hat Product Errata | RHBA-2023:0271 | 0 | None | None | None | 2023-01-25 12:29:08 UTC

Description Ofer Blaut 2022-08-31 11:52:46 UTC
Description of problem:


I am sharing an issue we have observed while working on RealTime deployment for RHOSP17.0.
So far, all of our deployments (NFV QE) have used UEFI boot mode with the new `Overcloud UEFI hardened` image. We have also manually tried the `Overcloud full` image for the RealTime scenario.
As always, we follow the documentation as closely as possible [0].
The following observations were made:
1. Using the UEFI hardened image, the operating system fails to boot. Overcloud node provisioning fails on the Ansible task `Reboot after kernel args update` [1] with a timeout. After accessing the console of the baremetal node and redirecting console output to `tty0`, we can see the following during boot [2]. Here is the list of dracut-related RPMs present in the image [3]. It is also important to note that after this modification I was not able to use the previous (non-realtime) kernel shipped inside the image. SELinux relabeling was done during every modification step and after `virt-sysprep`.
2. Using the `Overcloud full` image with modifications, we were able to get past the failure, but the operating system was not launched with the RealTime kernel (we ensured that it is configured as the default by executing `grubby --default-kernel`). Provisioning of the nodes failed on a different error, but this might not be related to the image modification.
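
The mismatch in observation 2 (RealTime kernel set as default, but a different kernel running) can be checked on a node by comparing the default boot entry with the running release. The following is a minimal sketch, not part of the original report. The helper name `kernel_matches_default` is made up for illustration, and it assumes that `grubby --default-kernel` prints a path of the form `/boot/vmlinuz-<release>`.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when the default kernel path corresponds to
# the running kernel release, 1 otherwise.
#   $1: default kernel path, e.g. the output of `grubby --default-kernel`
#   $2: running kernel release, e.g. the output of `uname -r`
kernel_matches_default() {
    default_path=$1
    running_release=$2
    # Strip the directory part and compare against vmlinuz-<release>.
    [ "${default_path##*/}" = "vmlinuz-${running_release}" ]
}

# On the node (grubby typically needs root):
#   if kernel_matches_default "$(grubby --default-kernel)" "$(uname -r)"; then
#       echo "running the configured default kernel"
#   else
#       echo "running kernel differs from the configured default"
#   fi
```

A non-zero result only shows that the two differ. It does not say whether the node still needs a reboot or whether the boot loader picked another entry.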

Due to capacity and time constraints, this is the best we can do and report right now. We cannot continue focusing on this because other work related to RHOSP17.0 has higher priority.
Potentially we should ask RHEL QE about the rt kernel; perhaps we are missing something.

Comment 7 Steve Baker 2022-09-02 01:26:46 UTC
Created attachment 1909131 [details]
Playbook to install kernel-rt and set kernel args in a single reboot

This playbook is an alternative to cli-overcloud-node-kernelargs.yaml that also installs kernel-rt and kernel-rt-kvm. It adds new vars that must be supplied by the caller:
    reg_user
    reg_password
    reg_pool_id
    reg_repo_id

It relies on the kernel arguments also changing to trigger a reboot; otherwise, a reboot must be triggered separately to run the new kernel.
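
Since the playbook cannot run without the four registration vars, a caller may want to check them before starting. The following is a minimal sketch under that assumption. The helper name `missing_reg_vars` and the playbook file name in the commented invocation are placeholders, and passing the values with `-e` is only one possible way to supply them; the comment does not say how the caller is expected to do it.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of every required variable
# that is unset or empty, and return non-zero if any is missing.
missing_reg_vars() {
    missing=""
    for name in reg_user reg_password reg_pool_id reg_repo_id; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

# Possible use (kernel-rt-kernelargs.yaml is a placeholder file name, not the
# name of the attachment):
#   missing_reg_vars && ansible-playbook kernel-rt-kernelargs.yaml \
#       -e reg_user="$reg_user" -e reg_password="$reg_password" \
#       -e reg_pool_id="$reg_pool_id" -e reg_repo_id="$reg_repo_id"
```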

Comment 25 Steve Baker 2022-11-03 21:47:28 UTC
I have proposed bug #2139922 to solve this issue for 17.1

Comment 26 Steve Baker 2022-11-16 21:53:57 UTC
*** Bug 2139922 has been marked as a duplicate of this bug. ***

Comment 27 Steve Baker 2022-11-16 21:59:08 UTC
I've proposed a fix for this which doesn't justify an RFE, and we may be able to get it into 17.0 z1.

Comment 28 Steve Baker 2022-11-24 01:32:03 UTC
The fix will ensure $tuned_params remains on the options line after the kernelargs playbook has been run. In addition, the documentation for customizing an image to install kernel-rt needs to include the following steps:

# Set up repos for installing packages

# Install packages for realtime
virt-customize -a overcloud-hardened-uefi-full.qcow2 --install kernel-rt,kernel-rt-kvm,tuned-profiles-nfv-host --selinux-relabel

# Update packages inside the image
virt-customize -a overcloud-hardened-uefi-full.qcow2 --update --memsize 4721 --smp 2 --selinux-relabel

# Set the correct root device
virt-customize -v -x -a overcloud-hardened-uefi-full.qcow2 --run-command "find /boot/loader/entries/ -name '*.conf' -exec sed -i 's/root=UUID=[^ ]*/root=LABEL=img-rootfs/' {} \;" --selinux-relabel

# Set rt kernel as default
virt-customize -a overcloud-hardened-uefi-full.qcow2 --run-command "grubby --set-default /boot/vmlinuz*rt* && cp /boot/grub2/grubenv /boot/efi/EFI/redhat/grubenv" --selinux-relabel
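
The root device rewrite can be tried on a scratch file before running it against the image. The following is a minimal sketch, not part of the proposed documentation steps. It applies the same sed expression to a made-up boot loader entry and checks that $tuned_params is still on the options line. The helper names and the sample entry contents are assumptions for illustration, and `sed -i` is used in its GNU form.

```shell
#!/bin/sh
# Rewrite root=UUID=... to root=LABEL=img-rootfs in one entry file, using
# the same sed expression as the "Set the correct root device" step.
set_root_label() {
    sed -i 's/root=UUID=[^ ]*/root=LABEL=img-rootfs/' "$1"
}

# Return 0 if the options line of the entry still references $tuned_params.
has_tuned_params() {
    grep -q '^options .*\$tuned_params' "$1"
}

# Dry run on a temporary file with made-up contents.
entry=$(mktemp)
printf '%s\n' 'title Sample entry' \
    'options root=UUID=0000-SAMPLE ro $tuned_params' > "$entry"
set_root_label "$entry"
has_tuned_params "$entry" && echo "tuned_params still present"
grep '^options' "$entry"
rm -f "$entry"
```

The same two helpers could be pointed at the files under /boot/loader/entries/ on a deployed node to confirm the result of the fix.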

Comment 44 errata-xmlrpc 2023-01-25 12:28:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.0.1 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:0271

Comment 45 Red Hat Bugzilla 2023-09-19 04:25:25 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

