Bug 2043588 - [RFE] THT parameters to allow for multiple hugepage sizes per compute
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 18.0 (Zed)
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ga
Target Release: ---
Assignee: OSP Team
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks: 2076498
Reported: 2022-01-21 15:10 UTC by James Parker
Modified: 2023-09-07 19:52 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-14.3.1-0.20220607161058.ced328c.el9ost tripleo-ansible-3.3.1-0.20220607162207.ae139c3.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2049796 2076498
Environment:
Last Closed: 2023-09-07 19:52:02 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 828776 | 0 | None | master: MERGED | tripleo-ansible: Adding tripleo_kernel_hugepages (I64eb1daad095a989a86b45b98ea10276bd0e0a9a) | 2022-06-13 19:46:03 UTC
OpenStack gerrit | 828782 | 0 | None | master: MERGED | tripleo-heat-templates: Adding Hugepages role parameter (I1e05a5ea17c858a86acc170cfb91288884664b05) | 2022-06-13 19:46:08 UTC
Red Hat Issue Tracker | OSP-12241 | 0 | None | None | None | 2022-01-21 15:14:30 UTC

Description James Parker 2022-01-21 15:10:35 UTC
Description of problem:
Currently it is possible to configure multiple hugepage sizes on each OpenStack compute node via the THT parameter KernelArgs, e.g.:

KernelArgs: "default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024 iommu=pt intel_iommu=on"

While this results in both 1G and 2M hugepages being configured on the compute node and reported to libvirt, the only mount that gets set up is for the default hugepage size.

[root@computeamdsev-0 openstack]# mount | grep hugepages
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=1024M)

Without another mount for the non-default hugepage size, it is only possible to schedule guests with 1G hugepages in their respective flavors.
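
To spot this condition on a node, the kernel arguments can be compared against the hugetlbfs mounts. The following is only a minimal sketch, not part of any TripleO tooling: the function names are made up for illustration, and it assumes sizes are written with an M or G suffix as in the KernelArgs example above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative only).

# Convert a size such as 1G or 2M to megabytes so that 1G and 1024M compare equal.
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print every distinct hugepagesz= value in a kernel command line.
# default_hugepagesz= is deliberately not matched.
configured_sizes() {
    for arg in $1; do
        case "$arg" in
            hugepagesz=*) echo "${arg#hugepagesz=}" ;;
        esac
    done | sort -u
}

# Read `mount` output on stdin and print the page size (in MB) of each hugetlbfs mount.
mounted_sizes_mb() {
    sed -n 's/.* type hugetlbfs .*pagesize=\([0-9]*[MG]\).*/\1/p' |
    while read -r size; do
        to_mb "$size"
    done
}

# Print configured hugepage sizes that have no matching hugetlbfs mount.
# $1: kernel command line, $2: output of `mount`
unmounted_sizes() {
    mounted=$(printf '%s\n' "$2" | mounted_sizes_mb)
    for size in $(configured_sizes "$1"); do
        mb=$(to_mb "$size") || continue
        if ! printf '%s\n' "$mounted" | grep -qx "$mb"; then
            echo "$size"
        fi
    done
}

# On a real compute node this could be called as:
#   unmounted_sizes "$(cat /proc/cmdline)" "$(mount)"
```

With the KernelArgs and the mount output shown above as input, the only size reported as missing a mount is 2M.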

This type of deployment could benefit ovs-dpdk environments where operators want to deploy guests with a different hugepage size in the flavor versus the hugepage size allocated to ovs-dpdk.

Comment 5 David Vallee Delisle 2022-01-31 15:58:43 UTC
As far as documentation is concerned, this is how it can be done using firstboot [1].

I still believe the hugepage (HP) process could benefit from some automation, because the current process can at best be described as hacky: we need to define the pages in KernelArgs, then calculate some for the Nova reserved ones and for OVS, and now we also need to generate systemd unit files to support multiple page sizes. Also, this MOP is probably not tested by QE. I believe operators don't want the hassle of thinking about this, as it is prone to human error. We should probably have some kind of TripleO parameter where operators can define, on a per-role basis, the number of hugepages they want for each size. From there, we would make sure the kernel args are present, that Nova has some pages reserved, and that the mounts are present. I guess we could defer this to HardProv if this is something they are interested in implementing, while we document the process for OSP16/17.
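
To illustrate the idea of deriving the kernel arguments from a per-size definition, here is a rough sketch. It is not the parameter that was eventually implemented: the function name and the `size:count` input format are assumptions made purely for this example.

```shell
#!/bin/sh
# Hypothetical helper: build the hugepage part of KernelArgs from a default
# size and a list of size:count pairs. The input format is an assumption.
hugepage_kernel_args() {
    default=$1
    shift
    args="default_hugepagesz=${default}"
    for spec in "$@"; do
        size=${spec%%:*}    # text before the first colon, e.g. 1G
        count=${spec#*:}    # text after the first colon, e.g. 4
        args="${args} hugepagesz=${size} hugepages=${count}"
    done
    echo "$args"
}

# Example: reproduce the hugepage arguments from the bug description.
#   hugepage_kernel_args 1G 1G:4 2M:1024
```

The same list of sizes could then also drive the creation of one mount per size, which is the part the firstboot template in [1] handles by hand.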


[1]
~~~
(undercloud) [stack@undercloud-0 ~]$ cat 16.2_deployment_files/hugepages.yaml
heat_template_version: queens

description: >
  Hugepages configuration

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: hugepages_config}

  hugepages_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        systemctl mask dev-hugepages.mount || true
        for pagesize in 2M 1G;do
          if ! [ -d "/dev/hugepages${pagesize}" ]; then
            mkdir -p "/dev/hugepages${pagesize}"
            cat << EOF > /etc/systemd/system/dev-hugepages${pagesize}.mount
        [Unit]
        Description=${pagesize} Huge Pages File System
        Documentation=https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
        Documentation=https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
        DefaultDependencies=no
        Before=sysinit.target
        ConditionPathExists=/sys/kernel/mm/hugepages
        ConditionCapability=CAP_SYS_ADMIN
        ConditionVirtualization=!private-users
        
        [Mount]
        What=hugetlbfs
        Where=/dev/hugepages${pagesize}
        Type=hugetlbfs
        Options=pagesize=${pagesize}
        EOF
          fi
        done
        systemctl daemon-reload
        for pagesize in 2M 1G;do
          systemctl start dev-hugepages${pagesize}.mount
        done

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
(undercloud) [stack@undercloud-0 ~]$ cat 16.2_deployment_files/firstboot.yaml
resource_registry:
  OS::TripleO::ComputeHugepage::NodeUserData: ./hugepages.yaml
~~~

Comment 6 David Vallee Delisle 2022-01-31 18:51:18 UTC
The template shown in comment#5 probably won't work on pre-deployed nodes, as it's part of the firstboot. In this case, I guess it would be the operator's responsibility to create the mounts. We should document this.
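
For pre-deployed nodes, the manual steps could look roughly like the sketch below. It only prints the commands (a dry run) so they can be reviewed first. The function name is made up for this example, the mount points follow the /dev/hugepages${pagesize} naming from comment#5, and mounts created this way do not persist across reboots unless a unit file or fstab entry is added as well.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the commands an operator would run as
# root to mount hugetlbfs once per page size.
manual_hugepage_mount_commands() {
    for size in "$@"; do
        dir="/dev/hugepages${size}"
        echo "mkdir -p ${dir}"
        echo "mount -t hugetlbfs -o pagesize=${size} hugetlbfs ${dir}"
    done
}

# Example: review the commands for both sizes, then pipe them to sh as root.
#   manual_hugepage_mount_commands 2M 1G
#   manual_hugepage_mount_commands 2M 1G | sh
```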

Comment 9 OSP Team 2022-09-22 10:41:03 UTC
According to our records, this should be resolved by openstack-tripleo-heat-templates-14.3.1-0.20220719171727.feca772.el9ost.  This build is available now.

Comment 10 OSP Team 2022-09-22 10:41:06 UTC
According to our records, this should be resolved by tripleo-ansible-3.3.1-0.20220720020866.fa5422f.el9ost.  This build is available now.

