Description of problem:
Failed to create a VM with the flavor settings cpu_period, cpu_quota and cpu_shares, even though the kernel supports the cpu controller.

Version-Release number of selected component (if applicable):
openstack-nova-compute-23.2.2-0.20220720130412.7074ac0.el9ost.noarch
libvirt-daemon-driver-qemu-8.0.0-8.1.el9_0.x86_64
kernel-5.14.0-70.17.1.el9_0.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up the OSP17 & RHEL9.0 env by running the job: custom-17.0_compact-director-rhel-9.0-virthost-3cont_2comp_3ceph-ipv4-geneve-ceph #7

2. Create the image, network and flavor with cpu quota settings
# openstack flavor create flavor_cpu --id 101 --ram 2048 --disk 10 --vcpus 2
# openstack flavor set flavor_cpu --property hw:boot_menu='true' --property quota:cpu_period='1000000' --property quota:cpu_quota='1000000000' --property quota:cpu_shares='2048'
# openstack network create asb-net1
# openstack subnet create subasb-net1 --network asb-net1 --subnet-range 192.168.32.0/22
# openstack image create asb-qcow2 --disk-format qcow2 --container-format bare --file /tmp/RHEL-9.0.0-20220429.1-x86_64.qcow2

[stack@undercloud-0 ~]$ openstack flavor list
+--------------------------------------+------------+------+------+-----------+-
| ID                                   | Name       | RAM  | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+------------+------+------+-----------+-
| 100                                  | asb-m2     |  512 |   10 |         0 |     1 | True      |
| 101                                  | flavor_cpu | 2048 |   10 |         0 |     2 | True      |

[stack@undercloud-0 ~]$ openstack image list
+--------------------------------------+----------------------------------+-----
| ID                                   | Name                             | Status |
+--------------------------------------+----------------------------------+-----
| 28482783-206d-4e7b-8fe3-07eef56d447c | asb-qcow2                        | active |

3. Try to create a VM; it fails with the error "Requested CPU control policy not supported by host"
[stack@undercloud-0 ~]$ openstack server create --flavor flavor_cpu --image asb-qcow2 --nic net-id=a0f5494e-d027-48f4-84b4-3de0ecfe402a --availability-zone nova:compute-0.redhat.local vm-r9-qcow2

[stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+------------------+--------+-------------------------+--------------------------+------------+
| ID                                   | Name             | Status | Networks                | Image                    | Flavor     |
+--------------------------------------+------------------+--------+-------------------------+--------------------------+------------+
| 7e752bb9-6793-42fa-b641-6a153fc292a6 | vm-r9-qcow2      | ERROR  |                         | asb-qcow2                | flavor_cpu |

[stack@undercloud-0 ~]$ openstack server show vm-r9-qcow2
......
| fault | {'code': 500, 'created': '2022-08-17T08:24:38Z', 'message': 'Requested CPU control policy not supported by host', 'details': 'Traceback (most recent call last):\n File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2409, in _build_and_run_instance\n self.driver.spawn(context, instance, image_meta,\n File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 4189, in spawn\n xml = self._get_guest_xml(context, instance, network_info,\n File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 7038, in _get_guest_xml\n conf = self._get_guest_config(instance, network_info, image_meta,\n File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 6627, in _get_guest_config\n self._update_guest_cputune(guest, flavor)\n File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 5490, in _update_guest_cputune\n raise exception.UnsupportedHostCPUControlPolicy()\nnova.exception.UnsupportedHostCPUControlPolicy: Requested CPU control policy not supported by host\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2232, in _do_build_and_run_instance\n self._build_and_run_instance(context, instance, image,\n File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2505, in _build_and_run_instance\n raise exception.RescheduledException(\nnova.exception.RescheduledException: Build of instance 7e752bb9-6793-42fa-b641-6a153fc292a6 was re-scheduled: Requested CPU control policy not supported by host\n'} |
| flavor | disk='10', ephemeral='0', extra_specs.hw:boot_menu='true', extra_specs.quota:cpu_period='1000000', extra_specs.quota:cpu_quota='1000000000', extra_specs.quota:cpu_shares='2048', original_name='flavor_cpu', ram='2048', swap='0', vcpus='2'

4. Create a VM with another flavor that does not have the cpu settings "--property quota:cpu_period='1000000' --property quota:cpu_quota='1000000000' --property quota:cpu_shares='2048'". The VM is created successfully.

5. Checked the nova code: it looks for "cpu" in the mount options listed in "/proc/self/mounts".
https://github.com/openstack/nova/blob/0b0fa8ac315ed497abfa4248ba5d8b0bb145d9b3/nova/virt/libvirt/driver.py
line: 5687-5695
------------------------------------------------------------------
    def _update_guest_cputune(self, guest, flavor):
        is_able = self._host.is_cpu_control_policy_capable()

        cputuning = ['shares', 'period', 'quota']
        wants_cputune = any([k for k in cputuning
            if "quota:cpu_" + k in flavor.extra_specs.keys()])

        if wants_cputune and not is_able:
            raise exception.UnsupportedHostCPUControlPolicy()
------------------------------------------------------------------
https://github.com/openstack/nova/blob/0b0fa8ac315ed497abfa4248ba5d8b0bb145d9b3/nova/virt/libvirt/host.py#L1608
line: 1608-1623
--------------------------------------------------------------
    def is_cpu_control_policy_capable(self):
        """Returns whether kernel configuration CGROUP_SCHED is enabled

        CONFIG_CGROUP_SCHED may be disabled in some kernel configs to
        improve scheduler latency.
        """
        try:
            with open("/proc/self/mounts", "r") as fd:    -> It checks the /proc/self/mounts
                for line in fd.readlines():
                    # mount options and split options
                    bits = line.split()[3].split(",")
                    if "cpu" in bits:
                        return True
                return False
        except IOError:
            return False
--------------------------------------------------------------

6. RHEL9, however, uses cgroup v2 by default, so we need to check for "cpu" in /sys/fs/cgroup/cgroup.controllers instead. The current compute node does support the cpu controller. Thus, we may need to change the code in "nova/virt/libvirt/host.py".
---------------------------------------------------------------------
[heat-admin@compute-0 ~]$ cat /sys/fs/cgroup/cgroup.controllers | grep cpu
cpuset cpu io memory hugetlb pids rdma misc
[heat-admin@compute-0 ~]$ cat /sys/fs/cgroup/machine.slice/cgroup.controllers | grep cpu
cpuset cpu io memory hugetlb pids
[heat-admin@compute-0 ~]$ cat /proc/self/mounts| grep cpu    -> This way is for checking with cgroupv1
No output
-------------------------------------------------------------------

Actual results:
Failed to create the guest with the flavor settings cpu_period, cpu_quota and cpu_shares, even though the kernel supports the cpu controller.

Expected results:
The guest is created successfully with the flavor settings cpu_period, cpu_quota and cpu_shares when the kernel supports the cpu controller.

Additional info:
Bug 1513930 - RFE: rewrite cgroups code to support v2 subsystem
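The check proposed in step 6 could look roughly like the sketch below. It is only an illustration of the idea, not the actual nova patch: the helper names `_has_cgroupsv1_cpu` and `_has_cgroupsv2_cpu` and the path parameters are made up for this example (the parameters exist so the logic can be exercised against fixture files). The v1 branch keeps the existing /proc/self/mounts test; the v2 branch reads /sys/fs/cgroup/cgroup.controllers and compares whole tokens, so "cpuset" alone does not count as "cpu".

```python
def _has_cgroupsv1_cpu(mounts_path):
    """cgroup v1: look for a 'cpu' mount option, as the current code does."""
    try:
        with open(mounts_path, "r") as fd:
            for line in fd:
                fields = line.split()
                # The 4th field of each mounts entry holds the mount options.
                if len(fields) >= 4 and "cpu" in fields[3].split(","):
                    return True
    except OSError:
        pass
    return False


def _has_cgroupsv2_cpu(controllers_path):
    """cgroup v2: look for 'cpu' in the space-separated controller list."""
    try:
        with open(controllers_path, "r") as fd:
            # Split into tokens so that 'cpuset' is not mistaken for 'cpu'.
            return "cpu" in fd.read().split()
    except OSError:
        return False


def is_cpu_control_policy_capable(
    mounts_path="/proc/self/mounts",
    controllers_path="/sys/fs/cgroup/cgroup.controllers",
):
    """Return True if the cpu controller is visible via cgroup v1 or v2."""
    return _has_cgroupsv1_cpu(mounts_path) or _has_cgroupsv2_cpu(controllers_path)
```

On the compute node shown above, the v1 check would find nothing in /proc/self/mounts, while the v2 check would find "cpu" in the controller list, so the combined function would report the host as capable.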
On the master branch, the code is at the same lines:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py line 5687-5695
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py line: 1608-1623
I'm removing the Triaged keyword so that we can have a discussion about this. 17.0.1 is now blockers-only, and we're not sure if there's going to be a 17.0.2. Since this is a regression, we want to decide whether it's bad enough that we ask for a blocker flag on this. In any case, we'll copy this to 17.1 so that we can fix it there.
Conclusion:
1. File a known issue for 17.0.
2. Fix only the host support detection in 17.1.
3. Document that the values are host and virt driver dependent, and if you're upgrading to 17.1 you need to:
   a. Make sure the values in your extra specs are supported by cgroups v2 on RHEL 9.
   b. Create new flavors and resize your instances if they're not.
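For point 3a, a pre-upgrade check of flavor extra specs might look something like the sketch below. It makes two assumptions that are not stated in this bug: that quota:cpu_shares ends up as the cgroup v2 cpu.weight value unchanged, and that the relevant limit is therefore the kernel's documented cpu.weight range of 1 to 10000. The function name `cpu_shares_fits_cgroup_v2` is hypothetical, and the check deliberately ignores quota:cpu_period and quota:cpu_quota because their limits are not given here.

```python
# Range of cpu.weight under cgroup v2, per the kernel cgroup-v2 documentation.
# Assumption: quota:cpu_shares is applied as cpu.weight without rescaling.
CGROUP_V2_CPU_WEIGHT_MIN = 1
CGROUP_V2_CPU_WEIGHT_MAX = 10000


def cpu_shares_fits_cgroup_v2(extra_specs):
    """Return True if quota:cpu_shares is absent or within the v2 range."""
    raw = extra_specs.get("quota:cpu_shares")
    if raw is None:
        # Nothing requested, so nothing to validate.
        return True
    try:
        shares = int(raw)
    except (TypeError, ValueError):
        # Non-numeric values cannot be valid weights.
        return False
    return CGROUP_V2_CPU_WEIGHT_MIN <= shares <= CGROUP_V2_CPU_WEIGHT_MAX
```

Flavors for which this returns False would be candidates for step 3b (create a new flavor and resize the instances).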
> 1. File a known issue for 17.0.

https://bugzilla.redhat.com/show_bug.cgi?id=2153815

> 2. Fix only the host support detection in 17.1.

Retargeted this BZ to 17.1.

> 3. Document that the values are host and virt driver dependent, and if
> you're upgrading to 17.1 you need to:
> a. Make sure the values in your extra specs are supported by cgroups v2
> on RHEL 9.
> b. Create new flavors and resize your instances if they're not.

Added a note to the same BZ (https://bugzilla.redhat.com/show_bug.cgi?id=2153815)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:4577