Description of problem:

Currently it is possible to configure multiple hugepage sizes in OpenStack per compute role via the THT parameter KernelArgs, e.g.:

~~~
KernelArgs: "default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024 iommu=pt intel_iommu=on"
~~~

While this results in both 1G and 2M hugepages being configured on the compute node, which will be reported to libvirt, the only mount set up is for the default hugepage size:

~~~
[root@computeamdsev-0 openstack]# mount | grep hugepages
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel,pagesize=1024M)
~~~

Without another mount for the non-default hugepage size, it is only possible to schedule guests with 1G hugepages in their respective flavor. This type of deployment could benefit OVS-DPDK environments where operators want to deploy guests with a hugepage size in the flavor that differs from the hugepage size allocated to OVS-DPDK.
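As a sanity check on the example KernelArgs above, the memory permanently reserved at boot is 4 x 1G plus 1024 x 2M, i.e. 6 GiB per compute node:

```shell
# Memory reserved by the example KernelArgs (4 x 1G + 1024 x 2M), in MiB.
total_mib=$(( 4 * 1024 + 1024 * 2 ))
echo "${total_mib} MiB"   # 6144 MiB
```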
So as far as documentation is concerned, this is how it can be done [1] using firstboot. I still believe the hugepage process could benefit from some automation, because the current process is, at best, hacky: we need to define the sizes in KernelArgs, then calculate the Nova-reserved ones and the OVS allocation, and now we also need to generate systemd unit files to support multiple page sizes. Also, this MOP is probably not tested by QE. I believe that operators don't want the hassle of thinking about this, as it's prone to human error. We should probably have some kind of TripleO parameter where operators can define, on a per-role basis, the number of hugepages they want for each size. From there, we make sure the kernel args are present, Nova has some reserved, and that the mounts are present. I guess we could defer this to HardProv if this is something they are interested in implementing, while we could document the process for OSP16/17.

[1]
~~~
(undercloud) [stack@undercloud-0 ~]$ cat 16.2_deployment_files/hugepages.yaml
heat_template_version: queens

description: >
  Hugepages configuration

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
      - config: {get_resource: hugepages_config}

  hugepages_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        systemctl mask dev-hugepages.mount || true
        for pagesize in 2M 1G; do
          if ! [ -d "/dev/hugepages${pagesize}" ]; then
            mkdir -p "/dev/hugepages${pagesize}"
            cat << EOF > /etc/systemd/system/dev-hugepages${pagesize}.mount
        [Unit]
        Description=${pagesize} Huge Pages File System
        Documentation=https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
        Documentation=https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
        DefaultDependencies=no
        Before=sysinit.target
        ConditionPathExists=/sys/kernel/mm/hugepages
        ConditionCapability=CAP_SYS_ADMIN
        ConditionVirtualization=!private-users

        [Mount]
        What=hugetlbfs
        Where=/dev/hugepages${pagesize}
        Type=hugetlbfs
        Options=pagesize=${pagesize}
        EOF
          fi
        done
        systemctl daemon-reload
        for pagesize in 2M 1G; do
          systemctl start dev-hugepages${pagesize}.mount
        done

outputs:
  OS::stack_id:
    value: {get_resource: userdata}

(undercloud) [stack@undercloud-0 ~]$ cat 16.2_deployment_files/firstboot.yaml
resource_registry:
  OS::TripleO::ComputeHugepage::NodeUserData: ./hugepages.yaml
~~~
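The core of the firstboot script above is the generated systemd mount unit. A minimal, root-free way to see what it produces is to write the unit file into a scratch directory instead of /etc/systemd/system (same heredoc, same variables):

```shell
# Sketch: generate one mount unit into a temp directory for inspection,
# instead of /etc/systemd/system, so no root is required.
tmpdir=$(mktemp -d)
pagesize=2M
cat << EOF > "${tmpdir}/dev-hugepages${pagesize}.mount"
[Mount]
What=hugetlbfs
Where=/dev/hugepages${pagesize}
Type=hugetlbfs
Options=pagesize=${pagesize}
EOF
cat "${tmpdir}/dev-hugepages${pagesize}.mount"
```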
The template shown in comment#5 probably won't work on pre-deployed nodes, as it is part of the firstboot configuration. In this case, I guess it would be the operator's responsibility to create the mounts. We should document this.
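For pre-deployed nodes, one manual alternative (a sketch, not a tested procedure; paths are assumptions) is to create the mount point by hand and persist it via /etc/fstab instead of a firstboot-generated unit:

```
# One-off, as root, per non-default page size:
#   mkdir -p /dev/hugepages2M
#   mount -t hugetlbfs -o pagesize=2M hugetlbfs /dev/hugepages2M
#
# Persistent /etc/fstab entry (sketch):
hugetlbfs  /dev/hugepages2M  hugetlbfs  pagesize=2M  0  0
```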
According to our records, this should be resolved by openstack-tripleo-heat-templates-14.3.1-0.20220719171727.feca772.el9ost. This build is available now.
According to our records, this should be resolved by tripleo-ansible-3.3.1-0.20220720020866.fa5422f.el9ost. This build is available now.