Bug 2107896

Summary: [RHOSP17.0] Missing Reboot To Apply Tuned Parameters To Kernel Boot Arguments
Product: Red Hat OpenStack
Reporter: Vadim Khitrin <vkhitrin>
Component: tripleo-ansible
Assignee: Vijayalakshmi Candappa <vcandapp>
Status: CLOSED CURRENTRELEASE
QA Contact: Joe H. Rahme <jhakimra>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 17.0 (Wallaby)
CC: bshephar, gregraka, hakhande, igallagh, jamsmith, jparoly, jschluet, mburns, mgeary, oblaut, owalsh, rhayakaw, rheslop, sbaker, stchen, supadhya, vcandapp
Target Milestone: z1
Keywords: AutomationBlocker, Bugfix, Regression, TestOnly, Triaged
Target Release: 17.0
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: tripleo-ansible-3.3.1-0.20221208161842.fa5422f.el9ost
Doc Type: Known Issue
Doc Text:
There is currently a known issue that causes tuned kernel configurations to not be applied after initial provisioning.
+
Workaround: You can use the following custom playbook to ensure that the tuned kernel command line arguments are applied. Save the following playbook as `/usr/share/ansible/tripleo-playbooks/cli-overcloud-node-reset-blscfg.yaml` on the undercloud node:
+
----
- name: Reset BLSCFG of compute node(s) meant for NFV deployments
  hosts: allovercloud
  any_errors_fatal: true
  gather_facts: true

  pre_tasks:
    - name: Wait for provisioned nodes to boot
      wait_for_connection:
        timeout: 600
        delay: 10

  tasks:
    - name: Reset BLSCFG flag in grub file, if it is enabled
      become: true
      lineinfile:
        path: /etc/default/grub
        line: "GRUB_ENABLE_BLSCFG=false"
        regexp: "^GRUB_ENABLE_BLSCFG=.*"
        insertafter: '^GRUB_DISABLE_RECOVERY.*'
----
+
Configure the role in the node definition file, `overcloud-baremetal-deploy.yaml`, to run the `cli-overcloud-node-reset-blscfg.yaml` playbook before the playbook that sets the `kernelargs`:
+
----
- name: ComputeOvsDpdkSriov
  count: 2
  hostname_format: computeovsdpdksriov-%index%
  defaults:
    networks:
    - network: internal_api
      subnet: internal_api_subnet
    - network: tenant
      subnet: tenant_subnet
    - network: storage
      subnet: storage_subnet
    network_config:
      template: /home/stack/osp17_ref/nic-configs/computeovsdpdksriov.j2
    config_drive:
      cloud_config:
        ssh_pwauth: true
        disable_root: false
        chpasswd:
          list: |-
            root:12345678
          expire: False
  ansible_playbooks:
    - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-reset-blscfg.yaml
    - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-kernelargs.yaml
      extra_vars:
        reboot_wait_timeout: 600
        kernel_args: 'default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1-11,13-23'
        tuned_profile: 'cpu-partitioning'
        tuned_isolated_cores: '1-11,13-23'
    - playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-openvswitch-dpdk.yaml
      extra_vars:
        memory_channels: '4'
        lcore: '0,12'
        pmd: '1,13,2,14,3,15'
        socket_mem: '4096'
        disable_emc: false
        enable_tso: false
        revalidator: ''
        handler: ''
        pmd_auto_lb: false
        pmd_load_threshold: ''
        pmd_improvement_threshold: ''
        pmd_rebal_interval: ''
        nova_postcopy: true
----
Last Closed: 2023-04-05 10:34:53 UTC
Type: Bug
Bug Blocks: 2122968    
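
The `lineinfile` task in the workaround playbook from the Doc Text above can be read as a small edit on the contents of /etc/default/grub. The following is a simplified Python sketch of that single task, assuming the usual `lineinfile` semantics (replace the last line matching `regexp`; otherwise insert after the last line matching `insertafter`; otherwise append at end of file). The function name `reset_blscfg` is hypothetical, and this is not Ansible's implementation.

```python
import re


def reset_blscfg(grub_text):
    """Return grub_text with GRUB_ENABLE_BLSCFG forced to false.

    Simplified emulation of the playbook's lineinfile task (assumed semantics):
    replace the last matching line, else insert after GRUB_DISABLE_RECOVERY,
    else append at the end of the file.
    """
    line = "GRUB_ENABLE_BLSCFG=false"
    regexp = re.compile(r"^GRUB_ENABLE_BLSCFG=.*")
    insertafter = re.compile(r"^GRUB_DISABLE_RECOVERY.*")

    lines = grub_text.splitlines()
    matches = [i for i, text in enumerate(lines) if regexp.search(text)]
    if matches:
        # An existing setting is overwritten in place.
        lines[matches[-1]] = line
    else:
        anchors = [i for i, text in enumerate(lines) if insertafter.search(text)]
        if anchors:
            # No existing setting: place the line right after the anchor.
            lines.insert(anchors[-1] + 1, line)
        else:
            # Neither pattern matched: fall back to end of file.
            lines.append(line)
    return "\n".join(lines) + "\n"
```

Calling `reset_blscfg` on text that already contains `GRUB_ENABLE_BLSCFG=true` rewrites that line; on text without the setting, the line is added after `GRUB_DISABLE_RECOVERY`.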

Description Vadim Khitrin 2022-07-17 11:40:18 UTC
Description of problem:
This was observed in several DPDK-based deployments (that use the cpu-partitioning tuned profile).

After the baremetal nodes are provisioned and the tuned profile is applied, the additional reboot needed to apply the generated tuned parameters does not occur.
Content of /proc/cmdline before reboot:
BOOT_IMAGE=(lvmid/2HnUIZ-jxEy-VuLG-rwud-Js0d-bckB-OpxLmk/0dLWXT-8y2y-YIq1-o9G1-aHW2-fZiy-sE21Hf)/boot/vmlinuz-5.14.0-70.17.1.el9_0.x86_64 root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on tsx=off isolcpus=2-19,22-39 console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb
We can see that tuned was activated and generated/updated files:
[root@computeovsdpdksriov-1 ~]# ls /boot/tuned-initrd.img
/boot/tuned-initrd.img
[root@computeovsdpdksriov-1 ~]# grep -iR 'tuned' /boot/grub2/grub.cfg
### BEGIN /etc/grub.d/00_tuned ###
set tuned_params="skew_tick=1 nohz=on nohz_full=2-19,22-39 rcu_nocbs=2-19,22-39 tuned.non_isolcpus=00300003 intel_pstate=disable nosoftlockup"
set tuned_initrd="(lvmid/2HnUIZ-jxEy-VuLG-rwud-Js0d-bckB-OpxLmk/0dLWXT-8y2y-YIq1-o9G1-aHW2-fZiy-sE21Hf)/boot/tuned-initrd.img"
### END /etc/grub.d/00_tuned ###
[root@computeovsdpdksriov-1 ~]# grep -iR 'tuned' /boot/grub2/grubenv
tuned_params=skew_tick=1 nohz=on nohz_full=2-19,22-39 rcu_nocbs=2-19,22-39 tuned.non_isolcpus=00300003 intel_pstate=disable nosoftlockup
tuned_initrd=(lvmid/2HnUIZ-jxEy-VuLG-rwud-Js0d-bckB-OpxLmk/0dLWXT-8y2y-YIq1-o9G1-aHW2-fZiy-sE21Hf)/boot/tuned-initrd.img
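
As a side note on the output above, the `tuned.non_isolcpus=00300003` value is consistent with `nohz_full=2-19,22-39`: it is a hexadecimal bitmask of the CPUs that are not isolated. The sketch below reproduces the value under the assumption that the node has 40 logical CPUs (0-39), which the report does not state. The helper names are hypothetical, and the 8-digit formatting is a simplification that only covers masks fitting in 32 bits.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '2-19,22-39' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def non_isolated_mask(isolated_spec, total_cpus):
    """Return a hex bitmask with one bit set per CPU that is NOT isolated.

    total_cpus is an assumption supplied by the caller; the formatting
    assumes the mask fits in 32 bits.
    """
    isolated = parse_cpu_list(isolated_spec)
    mask = 0
    for cpu in range(total_cpus):
        if cpu not in isolated:
            mask |= 1 << cpu
    return f"{mask:08x}"
```

With 40 CPUs and isolated cores 2-19,22-39, the remaining CPUs are 0, 1, 20 and 21, so `non_isolated_mask("2-19,22-39", 40)` returns `00300003`.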

After performing a simple reboot (without regenerating initramfs), all the required kernel boot arguments are applied:
BOOT_IMAGE=(lvmid/2HnUIZ-jxEy-VuLG-rwud-Js0d-bckB-OpxLmk/0dLWXT-8y2y-YIq1-o9G1-aHW2-fZiy-sE21Hf)/boot/vmlinuz-5.14.0-70.17.1.el9_0.x86_64 root=LABEL=img-rootfs ro console=ttyS0 console=ttyS0,115200n81 no_timer_check crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M default_hugepagesz=1GB hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on tsx=off isolcpus=2-19,22-39 console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset vga=normal console=tty0 console=ttyS0,115200 audit=1 nousb skew_tick=1 nohz=on nohz_full=2-19,22-39 rcu_nocbs=2-19,22-39 tuned.non_isolcpus=00300003 intel_pstate=disable nosoftlockup
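
The manual comparison above (tuned arguments recorded in /boot/grub2/grubenv versus the arguments of the running kernel in /proc/cmdline) can be sketched as a small check. This is an illustrative Python sketch, not part of tripleo-ansible or tuned; the function names are hypothetical, and it assumes grubenv stores the arguments on a single `tuned_params=` line as shown in this report.

```python
def parse_grubenv_tuned_params(grubenv_text):
    """Return the arguments stored on the tuned_params= line, or an empty list."""
    prefix = "tuned_params="
    for line in grubenv_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].split()
    return []


def missing_tuned_params(cmdline, grubenv_text):
    """Return tuned arguments listed in grubenv but absent from the kernel cmdline."""
    active = set(cmdline.split())
    return [arg for arg in parse_grubenv_tuned_params(grubenv_text)
            if arg not in active]


def check_running_host(grubenv_path="/boot/grub2/grubenv"):
    """Run the check on the local node (reading grubenv may require root)."""
    with open("/proc/cmdline") as cmdline_file:
        cmdline = cmdline_file.read()
    with open(grubenv_path) as grubenv_file:
        grubenv_text = grubenv_file.read()
    return missing_tuned_params(cmdline, grubenv_text)
```

On a node in the state described before the reboot, `check_running_host()` would return the full list of tuned arguments; after the reboot it would return an empty list.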

Version-Release number of selected component (if applicable):
Compose: RHOS-17.0-RHEL-9-20220711.n.1
puppet-tripleo-14.2.3-0.20220706000441.41752a3.el9ost.noarch
ansible-tripleo-ipsec-11.0.1-0.20210910011424.b5559c8.el9ost.noarch
ansible-tripleo-ipa-0.2.3-0.20220301190449.6b0ed82.el9ost.noarch
ansible-role-tripleo-modify-image-1.3.1-0.20220216001439.30d23d5.el9ost.noarch
python3-tripleo-common-15.4.1-0.20220705010405.51f6577.el9ost.noarch
tripleo-ansible-3.3.1-0.20220706140824.fa5422f.el9ost.noarch
openstack-tripleo-validations-14.2.2-0.20220701081823.37bfae3.el9ost.noarch
openstack-tripleo-common-containers-15.4.1-0.20220705010405.51f6577.el9ost.noarch
openstack-tripleo-common-15.4.1-0.20220705010405.51f6577.el9ost.noarch
openstack-tripleo-heat-templates-14.3.1-0.20220706080800.feca772.el9ost.noarch
python3-tripleoclient-16.4.1-0.20220705111519.23dbe54.el9ost.noarch

How reproducible:
Always, in all of our DPDK-based environments.

Steps to Reproduce:
1. Deploy product (DPDK environment).
2. Check kernel boot arguments on compute nodes.

Actual results:
Tuned parameters are not applied in kernel boot arguments.

Expected results:
Tuned parameters are applied in kernel boot arguments.

Additional info:
We are unsure whether we missed something in our templates or whether a reboot is missing during or after baremetal provisioning.

Will share templates and installation logs.

Comment 42 Vadim Khitrin 2023-03-30 10:47:31 UTC
Patch is present in 17.0.1.

In 17.1 a reboot is also not required, marking as VERIFIED.

Comment 43 Lon Hohberger 2023-04-05 10:34:53 UTC
According to our records, this should be resolved by tripleo-ansible-3.3.1-0.20221208161843.fa5422f.el9ost.  This build is available now.