Bug 2073101

Summary: [OSP 17.0] THT KernelArgs not getting set on UEFI systems
Product: Red Hat OpenStack
Reporter: James Parker <jparker>
Component: tripleo-ansible
Assignee: OSP Team <rhos-maint>
Status: CLOSED DUPLICATE
QA Contact: Joe H. Rahme <jhakimra>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 17.0 (Wallaby)
CC: bdobreli, bshephar, pbabbar, sbaker
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-04-13 15:07:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description James Parker 2022-04-07 15:48:48 UTC
Description of problem:

This appears to be similar to the issue described in [1,2]. In my OSP 17 deployment the new Ansible tasks [3] are executed, but the THT parameters are still not reflected in /proc/cmdline.

####################
# Overcloud deploy #
####################
(undercloud) [stack@undercloud-0 ~]$ cat overcloud_deploy.sh 
#!/bin/bash

THT_PATH='/home/stack/titan10_17.0_mixed_feature_deployment_files'

if [[ ! -f "$THT_PATH/roles_data.yaml" ]]; then
  openstack overcloud roles generate -o $THT_PATH/roles_data.yaml Controller ComputeSriov
fi

openstack -vvv overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--libvirt-type kvm \
--stack overcloud \
--deployed-server \
-e /home/stack/templates/overcloud-vip-deployed.yaml \
-e /home/stack/templates/overcloud-networks-deployed.yaml \
-e /home/stack/templates/overcloud-baremetal-deployed.yaml \
--ntp-server clock.redhat.com \
--networks-file $THT_PATH/network/network_data_v2.yaml \
--disable-protected-resource-types \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-dpdk.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-sriov.yaml \
-e /home/stack/containers-prepare-parameter.yaml \
-r $THT_PATH/roles_data.yaml \
-e $THT_PATH/network-environment-overrides.yaml \
-e $THT_PATH/gpu.yaml \
-e $THT_PATH/nodes_data.yaml \
-e $THT_PATH/per_node_hieradata.yaml \
--log-file overcloud_install.log &> overcloud_install.log

###########################
# Relevant THT parameters #
###########################
(undercloud) [stack@undercloud-0 ~]$ grep Kernel titan10_17.0_mixed_feature_deployment_files/network-environment-overrides.yaml 
    KernelArgs: "default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13"

################################
# Present in /etc/default/grub #
################################
[heat-admin@computesriov-1 ~]$ sudo grep KERNEL_ARG /etc/default/grub
GRUB_TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS=" default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13 "
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX:+$GRUB_CMDLINE_LINUX }${GRUB_TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS}"

##############################
# Not found in /proc/cmdline #
##############################
[heat-admin@computesriov-1 ~]$ cat /proc/cmdline 
BOOT_IMAGE=(lvmid/2CjQYD-AyLy-vFfp-fn9F-CK5p-Xnz6-s4xWac/QMfrKG-g4Mv-YiYk-2swb-0OYp-KzGR-zbTPCd)/boot/vmlinuz-5.14.0-63.el9.x86_64 root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb

#########################################
# THT params present in redhat and BOOT #
#########################################
[heat-admin@computesriov-1 ~]$ sudo grep iommu /boot/efi/EFI/redhat/grub.cfg
  set kernelopts="root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check  crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M  default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13  console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb"

[heat-admin@computesriov-1 ~]$ sudo grep iommu /boot/efi/EFI/BOOT/grub.cfg 
  set kernelopts="root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check  crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M  default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13  console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb"


######################################
# Not present in /boot/grub2/grubenv #
######################################
[heat-admin@computesriov-1 ~]$ sudo grep iommu /boot/grub2/grubenv
[heat-admin@computesriov-1 ~]$ 
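
The manual comparison above can be scripted. This is a minimal sketch and not part of tripleo-ansible: missing_kernel_args is a hypothetical helper name, and it assumes both inputs are space-separated kernel arguments. It prints each expected argument that does not appear as a whole token in the given command line.

```shell
#!/bin/bash
# Hypothetical helper: print every expected kernel argument that is absent
# from a kernel command line.
# Usage: missing_kernel_args "<expected args>" "<cmdline>"
missing_kernel_args() {
  local cmdline="$2" arg
  local -a expected
  # Split the expected arguments on whitespace without glob expansion.
  read -ra expected <<< "$1"
  for arg in "${expected[@]}"; do
    # Pad with spaces so only whole tokens match
    # (iommu=pt must not match inside intel_iommu=pt).
    case " $cmdline " in
      *" $arg "*) ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}

# Example with a shortened version of the values shown in this report.
expected='iommu=pt intel_iommu=on hugepages=32'
running='root=LABEL=img-rootfs ro console=ttyS0 audit=1 nousb'
missing_kernel_args "$expected" "$running"
```

On a compute node the second argument would be "$(cat /proc/cmdline)". Empty output means every expected argument is present in the running kernel command line.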

As mentioned above, the deployment log shows the new tasks from [3] being executed:
(undercloud) [stack@undercloud-0 ~]$ grep -r "CHANGED | Generate EFI grub config | computesriov-" overcloud_install.log 
2022-04-06 21:36:37.876408 | 525400f6-db07-dd90-e7f4-000000000a1d |    CHANGED | Generate EFI grub config | computesriov-1 | item={'changed': False, 'stat': {'exists': True, 'path': '/boot/efi/EFI/BOOT', 'mode': '0755', 'isdir': True, 'ischr': False, 'isblk': False, 'isreg': False, 'isfifo': False, 'islnk': False, 'issock': False, 'uid': 0, 'gid': 0, 'size': 2048, 'inode': 114, 'dev': 2049, 'nlink': 2, 'atime': 1647388800.0, 'mtime': 1647443838.0, 'ctime': 1647443838.77, 'wusr': True, 'rusr': True, 'xusr': True, 'wgrp': False, 'rgrp': True, 'xgrp': True, 'woth': False, 'roth': True, 'xoth': True, 'isuid': False, 'isgid': False, 'blocks': 4, 'block_size': 2048, 'device_type': 0, 'readable': True, 'writeable': True, 'executable': True, 'pw_name': 'root', 'gr_name': 'root', 'mimetype': 'inode/directory', 'charset': 'binary', 'version': None, 'attributes': [], 'attr_flags': ''}, 'invocation': {'module_args': {'path': '/boot/efi/EFI/BOOT', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': '/boot/efi/EFI/BOOT', 'ansible_loop_var': 'item'} | result={
2022-04-06 21:36:38.653215 | 525400f6-db07-dd90-e7f4-000000000a1d |    CHANGED | Generate EFI grub config | computesriov-0 | item={'changed': False, 'stat': {'exists': True, 'path': '/boot/efi/EFI/BOOT', 'mode': '0755', 'isdir': True, 'ischr': False, 'isblk': False, 'isreg': False, 'isfifo': False, 'islnk': False, 'issock': False, 'uid': 0, 'gid': 0, 'size': 2048, 'inode': 114, 'dev': 2049, 'nlink': 2, 'atime': 1647388800.0, 'mtime': 1647443838.0, 'ctime': 1647443838.77, 'wusr': True, 'rusr': True, 'xusr': True, 'wgrp': False, 'rgrp': True, 'xgrp': True, 'woth': False, 'roth': True, 'xoth': True, 'isuid': False, 'isgid': False, 'blocks': 4, 'block_size': 2048, 'device_type': 0, 'readable': True, 'writeable': True, 'executable': True, 'pw_name': 'root', 'gr_name': 'root', 'mimetype': 'inode/directory', 'charset': 'binary', 'version': None, 'attributes': [], 'attr_flags': ''}, 'invocation': {'module_args': {'path': '/boot/efi/EFI/BOOT', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': '/boot/efi/EFI/BOOT', 'ansible_loop_var': 'item'} | result={
2022-04-06 21:36:42.551938 | 525400f6-db07-dd90-e7f4-000000000a1d |    CHANGED | Generate EFI grub config | computesriov-1 | item={'changed': False, 'stat': {'exists': True, 'path': '/boot/efi/EFI/redhat', 'mode': '0755', 'isdir': True, 'ischr': False, 'isblk': False, 'isreg': False, 'isfifo': False, 'islnk': False, 'issock': False, 'uid': 0, 'gid': 0, 'size': 2048, 'inode': 115, 'dev': 2049, 'nlink': 2, 'atime': 1647388800.0, 'mtime': 1647443848.0, 'ctime': 1647443848.99, 'wusr': True, 'rusr': True, 'xusr': True, 'wgrp': False, 'rgrp': True, 'xgrp': True, 'woth': False, 'roth': True, 'xoth': True, 'isuid': False, 'isgid': False, 'blocks': 4, 'block_size': 2048, 'device_type': 0, 'readable': True, 'writeable': True, 'executable': True, 'pw_name': 'root', 'gr_name': 'root', 'mimetype': 'inode/directory', 'charset': 'binary', 'version': None, 'attributes': [], 'attr_flags': ''}, 'invocation': {'module_args': {'path': '/boot/efi/EFI/redhat', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': '/boot/efi/EFI/redhat', 'ansible_loop_var': 'item'} | result={
2022-04-06 21:36:43.294083 | 525400f6-db07-dd90-e7f4-000000000a1d |    CHANGED | Generate EFI grub config | computesriov-0 | item={'changed': False, 'stat': {'exists': True, 'path': '/boot/efi/EFI/redhat', 'mode': '0755', 'isdir': True, 'ischr': False, 'isblk': False, 'isreg': False, 'isfifo': False, 'islnk': False, 'issock': False, 'uid': 0, 'gid': 0, 'size': 2048, 'inode': 115, 'dev': 2049, 'nlink': 2, 'atime': 1647388800.0, 'mtime': 1647443848.0, 'ctime': 1647443848.99, 'wusr': True, 'rusr': True, 'xusr': True, 'wgrp': False, 'rgrp': True, 'xgrp': True, 'woth': False, 'roth': True, 'xoth': True, 'isuid': False, 'isgid': False, 'blocks': 4, 'block_size': 2048, 'device_type': 0, 'readable': True, 'writeable': True, 'executable': True, 'pw_name': 'root', 'gr_name': 'root', 'mimetype': 'inode/directory', 'charset': 'binary', 'version': None, 'attributes': [], 'attr_flags': ''}, 'invocation': {'module_args': {'path': '/boot/efi/EFI/redhat', 'follow': False, 'get_md5': False, 'get_checksum': True, 'get_mime': True, 'get_attributes': True, 'checksum_algorithm': 'sha1'}}, 'failed': False, 'item': '/boot/efi/EFI/redhat', 'ansible_loop_var': 'item'} | result={

Is there anything missing from the deployment that I am not accounting for? The environment can be provided if necessary.

Version-Release number of selected component (if applicable):
(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
RHOS-17.0-RHEL-9-20220316.n.1

How reproducible:
100%

Steps to Reproduce:
1. Deploy a UEFI environment with KernelArgs set

Actual results:
KernelArgs are not reflected on computes

Expected results:
KernelArgs should be applied to the computes


Additional info:

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1974507
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1987092
[3] https://review.opendev.org/c/openstack/tripleo-ansible/+/799132/2/tripleo_ansible/roles/tripleo_kernel/tasks/kernelargs.yml#55

Comment 1 Brendan Shephard 2022-04-08 01:17:14 UTC
Hi,

So you showed that it isn't present in /proc/cmdline, but is in the grub.cfg file [1]. Has this system been rebooted? It won't appear in /proc/cmdline until it has been rebooted with those kernel args.

[1]
"""
##############################
# Not found in /proc/cmdline #
##############################
[heat-admin@computesriov-1 ~]$ cat /proc/cmdline 
BOOT_IMAGE=(lvmid/2CjQYD-AyLy-vFfp-fn9F-CK5p-Xnz6-s4xWac/QMfrKG-g4Mv-YiYk-2swb-0OYp-KzGR-zbTPCd)/boot/vmlinuz-5.14.0-63.el9.x86_64 root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb

#########################################
# THT params present in redhat and BOOT #
#########################################
[heat-admin@computesriov-1 ~]$ sudo grep iommu /boot/efi/EFI/redhat/grub.cfg
  set kernelopts="root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check  crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M  default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13  console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb"

[heat-admin@computesriov-1 ~]$ sudo grep iommu /boot/efi/EFI/BOOT/grub.cfg 
  set kernelopts="root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check  crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M  default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2,3,4,5,6,7,8,9,10,11,12,13  console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb"
"""

Maybe the issue you're reporting is that the node wasn't automatically rebooted?
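
One way to tell "configured but not yet booted" apart from "not configured" is to diff the kernelopts line in grub.cfg against the running command line. The sketch below is only an illustration: the function names are hypothetical, and it assumes grub.cfg contains a single-line set kernelopts="..." entry like the one quoted above.

```shell
#!/bin/bash
# Print the value of the first `set kernelopts="..."` line in a grub.cfg file.
grub_kernelopts() {
  sed -n 's/^[[:space:]]*set kernelopts="\(.*\)"[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Print each kernelopts token from the grub.cfg file ($1) that is missing
# from the kernel command line string ($2).
unbooted_args() {
  local opts cmdline="$2" arg
  local -a tokens
  opts="$(grub_kernelopts "$1")"
  read -ra tokens <<< "$opts"
  for arg in "${tokens[@]}"; do
    # Pad with spaces so only whole tokens match.
    case " $cmdline " in
      *" $arg "*) ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}
```

Run as root on the node, for example: unbooted_args /boot/efi/EFI/redhat/grub.cfg "$(cat /proc/cmdline)". If arguments are still listed after a reboot, the bootloader is probably not taking its options from that file.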

Comment 2 Brendan Shephard 2022-04-08 03:20:19 UTC
Was this block skipped in your Ansible log?
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_kernel/tasks/kernelargs.yml#L196

It should automatically reboot the nodes after making these changes unless those conditions aren't met:
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_kernel/tasks/kernelargs.yml#L198-L200


I would first check whether rebooting results in the args being added to /proc/cmdline. Then check the Ansible log to see whether the node skipped the reboot step. There may be something set in the templates to defer the reboot, such as KernelArgsDeferReboot:
https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/deployment/kernel/kernel-boot-params-baremetal-ansible.yaml#L38-L50
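
A quick way to check the custom environment files for a deferred reboot is a recursive grep. This is a small sketch: find_defer_reboot is a hypothetical wrapper name, and the --include option assumes GNU grep.

```shell
#!/bin/bash
# List every occurrence of KernelArgsDeferReboot in the YAML files under
# directory $1. Returns non-zero when the parameter is not set anywhere.
find_defer_reboot() {
  grep -rn --include='*.yaml' 'KernelArgsDeferReboot' "$1"
}
```

For the deployment in this report that would be: find_defer_reboot /home/stack/titan10_17.0_mixed_feature_deployment_files. No output means none of those environment files set the parameter.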

Comment 4 Steve Baker 2022-04-12 20:06:57 UTC
I think this is a duplicate of bug #2035325: the existing /boot/loader/entries file values are not being pulled in because the filenames don't match the new /etc/machine-id. I've got a series of changes to fix this:
https://review.opendev.org/c/openstack/diskimage-builder/+/837251 
https://review.opendev.org/c/openstack/tripleo-image-elements/+/837430 
https://review.opendev.org/c/openstack/tripleo-common/+/837431

You could re-attempt the manual grub2-mkconfig calls in comment #3 after manually renaming the /boot/loader/entries files to start with the value of /etc/machine-id. If that works then it is definitely a duplicate.
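
The rename can be previewed before touching anything. This is a rough dry-run sketch, not a tested fix: plan_entry_renames is a hypothetical name, and it assumes each entry filename has the form <32 hex characters>-<rest>.conf. It only prints the mv commands that would make each entry start with the given machine ID.

```shell
#!/bin/bash
# Print the mv commands that would rename the boot loader entries in
# directory $1 so that each filename starts with machine ID $2.
# Nothing is renamed; review the output before running any of it.
plan_entry_renames() {
  local dir="$1" machine_id="$2" path name suffix
  for path in "$dir"/*.conf; do
    [ -e "$path" ] || continue                 # no .conf files at all
    name="${path##*/}"
    # Assumption: entry names look like <32 hex chars>-<rest>.conf
    [[ "$name" =~ ^[0-9a-f]{32}-(.+)$ ]] || continue
    suffix="${BASH_REMATCH[1]}"
    # Skip entries that already carry the wanted machine ID.
    [ "$name" = "$machine_id-$suffix" ] && continue
    printf 'mv %q %q\n' "$path" "$dir/$machine_id-$suffix"
  done
}
```

On the node, the call would be: plan_entry_renames /boot/loader/entries "$(cat /etc/machine-id)". If the printed commands look right, run them as root and then retry the grub2-mkconfig calls.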

Comment 5 Bogdan Dobrelya 2022-04-13 07:35:19 UTC
James, would you please re-verify with the patches mentioned above?

Comment 7 James Parker 2022-04-13 15:07:37 UTC

*** This bug has been marked as a duplicate of bug 2035325 ***