Bug 2120392

Summary: [17.0 ga tech preview] Unable to schedule guest with mixed cpu policy and hw.numa_nodes=2
Product: Red Hat OpenStack
Reporter: Artom Lifshitz <alifshit>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED CURRENTRELEASE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium
Docs Contact:
Priority: medium
Version: 17.0 (Wallaby)
CC: dasmith, eglynn, igallagh, jhakimra, kchamart, sbauza, sgordon, vromanso
Target Milestone: ga
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Technology Preview
Doc Text:
In Red Hat OpenStack Platform 17.0, a technology preview is available for creating single NUMA node instances that have both pinned and floating CPUs.
Last Closed: 2022-09-12 15:21:07 UTC

Description Artom Lifshitz 2022-08-22 18:51:03 UTC
This bug was initially created as a copy of Bug #2109670

I am copying this bug because: 

We can only support the mixed CPU policy for single-NUMA guests, and only as a Technology Preview.

Description of problem: Unable to schedule guests with a mixed CPU policy and hw:numa_nodes=2 on either SMT or non-SMT computes. The deployments are set up so that the shared pCPUs come from one NUMA node and the dedicated pCPUs come from a second NUMA node. Scheduling is not an issue if both the dedicated and the shared pCPUs come from the same NUMA node. The host configuration is:


[heat-admin@compute-0 ~]$ lscpu | grep NUMA
NUMA node(s):                    2
NUMA node0 CPU(s):               0-3
NUMA node1 CPU(s):               4-7
 
[heat-admin@compute-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
0-3
[heat-admin@compute-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
4-5
 
[heat-admin@compute-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
4,5,6,7
[heat-admin@compute-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
0,1
     
(overcloud) [stack@undercloud-0 ~]$ openstack resource provider list
+--------------------------------------+------------------------+------------+
| uuid                                 | name                   | generation |
+--------------------------------------+------------------------+------------+
| 5f87e1c7-953d-41a4-959d-77b332121378 | compute-0.redhat.local |         39 |
| aa58a24e-4e37-411a-b14f-13f5648a0663 | compute-1.redhat.local |         73 |
+--------------------------------------+------------------------+------------+
(overcloud) [stack@undercloud-0 ~]$ openstack --os-placement-api-version 1.17 resource provider inventory list  5f87e1c7-953d-41a4-959d-77b332121378
+----------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+-------+
| MEMORY_MB      |              1.0 |        1 |    31775 |     4096 |         1 | 31775 |
| DISK_GB        |              1.0 |        1 |       24 |        0 |         1 |    24 |
| PCPU           |              1.0 |        1 |        4 |        0 |         1 |     4 |
| VCPU           |             16.0 |        1 |        2 |        0 |         1 |     2 |
+----------------+------------------+----------+----------+----------+-----------+-------+
(overcloud) [stack@undercloud-0 ~]$ openstack --os-placement-api-version 1.17 resource provider inventory list aa58a24e-4e37-411a-b14f-13f5648a0663
+----------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+-------+
| MEMORY_MB      |              1.0 |        1 |    31775 |     4096 |         1 | 31775 |
| DISK_GB        |              1.0 |        1 |       24 |        0 |         1 |    24 |
| PCPU           |              1.0 |        1 |        4 |        0 |         1 |     4 |
| VCPU           |             16.0 |        1 |        2 |        0 |         1 |     2 |
+----------------+------------------+----------+----------+----------+-----------+-------+
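To make the host layout above easier to read, here is a minimal Python sketch that expands the cpu_dedicated_set and cpu_shared_set values and maps them onto the NUMA nodes reported by lscpu. The helper names (parse_cpu_spec, nodes_for, summarize) are hypothetical and not part of Nova, and the sketch assumes compute-1 has the same NUMA layout as compute-0, since lscpu output is only shown for compute-0.

```python
def parse_cpu_spec(spec: str) -> set:
    """Expand a CPU list such as "0-3" or "4,5,6,7" into a set of CPU IDs."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


# NUMA layout from the lscpu output above (assumed identical on compute-1).
NUMA_NODES = {0: parse_cpu_spec("0-3"), 1: parse_cpu_spec("4-7")}

# cpu_dedicated_set / cpu_shared_set values reported by crudini above.
HOSTS = {
    "compute-0": {"dedicated": "0-3", "shared": "4-5"},
    "compute-1": {"dedicated": "4,5,6,7", "shared": "0,1"},
}


def nodes_for(cpus: set) -> set:
    """Return the NUMA node IDs that contain at least one of the given CPUs."""
    return {node for node, node_cpus in NUMA_NODES.items() if cpus & node_cpus}


def summarize(host: str) -> dict:
    """Count dedicated and shared CPUs and list the NUMA nodes they live on."""
    dedicated = parse_cpu_spec(HOSTS[host]["dedicated"])
    shared = parse_cpu_spec(HOSTS[host]["shared"])
    return {
        "pcpu_total": len(dedicated),
        "vcpu_total": len(shared),
        "dedicated_nodes": nodes_for(dedicated),
        "shared_nodes": nodes_for(shared),
    }


for host in HOSTS:
    print(host, summarize(host))
```

On both hosts the counts agree with the PCPU total of 4 and VCPU total of 2 in the placement inventories, and the dedicated and shared sets land on different NUMA nodes, which is the condition this bug is about.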

# Non-SMT with cpu_thread_policy isolate
(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 181110155
+----------------------------+--------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                  |
+----------------------------+--------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                      |
| access_project_ids         | None                                                                                                   |
| description                | None                                                                                                   |
| disk                       | 1                                                                                                      |
| id                         | 181110155                                                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1180652834                                                           |
| os-flavor-access:is_public | True                                                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:cpu_thread_policy='isolate', hw:numa_nodes='2' |
| ram                        | 64                                                                                                     |
| rxtx_factor                | 1.0                                                                                                    |
| swap                       |                                                                                                        |
| vcpus                      | 4                                                                                                      |
+----------------------------+--------------------------------------------------------------------------------------------------------+
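For readers unfamiliar with the flavor properties above, the following sketch shows how hw:cpu_dedicated_mask='^0-1' with 4 vCPUs and hw:numa_nodes='2' can be read. It is an illustration only, with hypothetical helper names, and it assumes the vCPUs are split evenly and in order across the guest NUMA nodes because the flavor does not set hw:numa_cpus.N. It is not Nova's code.

```python
def _expand(part: str) -> set:
    """Expand "2" or "0-1" into a set of vCPU IDs."""
    if "-" in part:
        low, high = part.split("-", 1)
        return set(range(int(low), int(high) + 1))
    return {int(part)}


def parse_dedicated_mask(mask: str, vcpus: int) -> set:
    """Return the dedicated guest vCPU IDs for a mask such as "^0-1" or "2-3".

    Entries prefixed with "^" are exclusions. If the mask contains only
    exclusions, they are subtracted from the full vCPU range.
    """
    included, excluded = set(), set()
    for part in mask.split(","):
        part = part.strip()
        if part.startswith("^"):
            excluded |= _expand(part[1:])
        elif part:
            included |= _expand(part)
    all_vcpus = set(range(vcpus))
    if not included:
        included = all_vcpus
    return (included - excluded) & all_vcpus


def split_across_nodes(vcpus: int, numa_nodes: int, dedicated: set) -> list:
    """Split vCPUs evenly, in order, across guest NUMA cells (assumption)."""
    per_node = vcpus // numa_nodes
    cells = []
    for node in range(numa_nodes):
        cell_cpus = set(range(node * per_node, (node + 1) * per_node))
        cells.append({
            "id": node,
            "cpuset": cell_cpus - dedicated,   # shared (floating) vCPUs
            "pcpuset": cell_cpus & dedicated,  # dedicated (pinned) vCPUs
        })
    return cells


dedicated = parse_dedicated_mask("^0-1", vcpus=4)
for cell in split_across_nodes(4, 2, dedicated):
    print(cell)
```

Under these assumptions guest cell 0 holds only the shared vCPUs 0 and 1 and guest cell 1 holds only the dedicated vCPUs 2 and 3. The cell 0 result is consistent with the InstanceNUMACell (cpuset=set([0,1]), pcpuset=set([])) in the scheduler log that follows.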


nova-scheduler.log:2022-07-21 17:32:37.122 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='isolate',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([0,1]),id=0,memory=15693,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([]),pinned_cpus=set([]),siblings=[set([0]),set([1])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Packing an instance onto a set of siblings:     host_cell_free_siblings: [set(), set()]    instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='isolate',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Built sibling_sets: defaultdict(<class 'list'>, {}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Requested 'isolate' thread policy for 2 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:791
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Host does not have any fully free thread sibling sets.It is not possible to emulate a non-SMT behavior for the isolate policy without this. _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:801
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049
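The long "Attempting to fit instance cell" lines are hard to scan by eye. As a reading aid, here is a small sketch that pulls the cpuset and pcpuset fields out of such a line for both the instance cell and the host cell. The function names are hypothetical, and SAMPLE is an abbreviated copy of the first log line above with unrelated fields dropped.

```python
import re

# Matches "cpuset=set([0,1])" and "pcpuset=set([])", but not
# "cpuset_reserved=None" or "pinned_cpus=set([])".
SET_FIELD = re.compile(r"\b(cpuset|pcpuset)=set\(\[([\d,]*)\]\)")


def cell_cpu_sets(text: str) -> dict:
    """Return the cpuset and pcpuset fields found in one cell's repr."""
    result = {}
    for name, body in SET_FIELD.findall(text):
        result[name] = {int(item) for item in body.split(",") if item}
    return result


def split_fit_line(line: str) -> tuple:
    """Split a fit-attempt log line into (instance cell, host cell) CPU sets."""
    instance_part, host_part = line.split(" on host_cell ", 1)
    return cell_cpu_sets(instance_part), cell_cpu_sets(host_part)


# Abbreviated from the first scheduler log line above.
SAMPLE = (
    "Attempting to fit instance cell InstanceNUMACell(cpu_policy='mixed',"
    "cpuset=set([0,1]),cpuset_reserved=None,id=0,pcpuset=set([])) "
    "on host_cell NUMACell(cpu_usage=0,cpuset=set([0,1]),id=0,"
    "pcpuset=set([]),pinned_cpus=set([]))"
)

instance_cell, host_cell = split_fit_line(SAMPLE)
print("instance:", instance_cell)
print("host:    ", host_cell)
```

Both cells carry shared CPUs 0 and 1 and an empty pcpuset, yet the log continues with "Instance has requested pinned CPUs" and then fails to find free sibling sets. One possible reading, which is an inference from the log and not confirmed in this report, is that the pinning path is chosen from the cell's cpu_policy rather than from whether the cell actually contains dedicated vCPUs.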

# Non-SMT with cpu_thread_policy not configured (prefer)
(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 802779656
/usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
  warnings.warn(
+----------------------------+------------------------------------------------------------------------+
| Field                      | Value                                                                  |
+----------------------------+------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                      |
| access_project_ids         | None                                                                   |
| description                | None                                                                   |
| disk                       | 1                                                                      |
| id                         | 802779656                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1106554476                           |
| os-flavor-access:is_public | True                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:numa_nodes='2' |
| ram                        | 64                                                                     |
| rxtx_factor                | 1.0                                                                    |
| swap                       |                                                                        |
| vcpus                      | 4                                                                      |
+----------------------------+------------------------------------------------------------------------+




/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.044 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([0,1]),id=0,memory=15693,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([]),pinned_cpus=set([]),siblings=[set([0]),set([1])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.044 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Packing an instance onto a set of siblings:     host_cell_free_siblings: [set(), set()]    instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Built sibling_sets: defaultdict(<class 'list'>, {}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] User did not specify a thread policy. Using default for 2 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:794
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 INFO nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Computed NUMA topology CPU pinning: usable pCPUs: [], vCPUs mapping: []
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.046 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049


# SMT with cpu_thread_policy require
(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 321056094
/usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
  warnings.warn(
+----------------------------+--------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                  |
+----------------------------+--------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                      |
| access_project_ids         | None                                                                                                   |
| description                | None                                                                                                   |
| disk                       | 3                                                                                                      |
| id                         | 321056094                                                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1514426045                                                           |
| os-flavor-access:is_public | True                                                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:cpu_thread_policy='require', hw:numa_nodes='2' |
| ram                        | 64                                                                                                     |
| rxtx_factor                | 1.0                                                                                                    |
| swap                       |                                                                                                        |
| vcpus                      | 4                                                                                                      |
+----------------------------+--------------------------------------------------------------------------------------------------------+


nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='require',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=1,memory=16118,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([4,5,6,7]),pinned_cpus=set([]),siblings=[set([6,7]),set([4,5])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] Not enough host cell CPUs to fit instance cell; required: 2, actual: 0 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1010
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.scheduler.filters.numa_topology_filter [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] [instance: af56faf4-9dc2-4c64-9099-19223fcc6fa4] compute-1.redhat.local, compute-1.redhat.local fails NUMA topology requirements. The instance does not fit on this host. host_passes /usr/lib/python3.9/site-packages/nova/scheduler/filters/numa_topology_filter.py:106
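The "required: 2, actual: 0" message above comes from trying guest cell 0, which has two shared vCPUs, on host cell 1, which has no shared CPUs. The toy model below checks whether any guest-cell to host-cell assignment would work on CPU counts alone. It is a deliberately simplified sketch, not Nova's fitting algorithm: it ignores allocation ratios, memory, page sizes and thread policy, the names are hypothetical, and guest cell 1 is assumed to hold the dedicated vCPUs 2 and 3 (derived from the flavor, not shown in the logs).

```python
from itertools import permutations


def cell_fits(guest: dict, host: dict) -> bool:
    """True if the host cell has enough shared and dedicated CPUs (counts only)."""
    return (len(guest["cpuset"]) <= len(host["cpuset"])
            and len(guest["pcpuset"]) <= len(host["pcpuset"]))


def find_mapping(guest_cells: list, host_cells: list):
    """Return the first (guest id, host id) assignment that fits, or None."""
    for candidate in permutations(host_cells, len(guest_cells)):
        if all(cell_fits(g, h) for g, h in zip(guest_cells, candidate)):
            return [(g["id"], h["id"]) for g, h in zip(guest_cells, candidate)]
    return None


# Guest cells for vcpus=4, hw:numa_nodes=2, hw:cpu_dedicated_mask='^0-1'.
GUEST = [
    {"id": 0, "cpuset": {0, 1}, "pcpuset": set()},
    {"id": 1, "cpuset": set(), "pcpuset": {2, 3}},  # assumption, see lead-in
]

# compute-1 host cells as they appear in the scheduler logs.
COMPUTE_1 = [
    {"id": 0, "cpuset": {0, 1}, "pcpuset": set()},
    {"id": 1, "cpuset": set(), "pcpuset": {4, 5, 6, 7}},
]

print(cell_fits(GUEST[0], COMPUTE_1[1]))  # the pairing rejected in the log
print(find_mapping(GUEST, COMPUTE_1))
```

In this model the pairing from the log is rejected, but assigning guest cell 0 to host cell 0 and guest cell 1 to host cell 1 fits on CPU counts, which is the placement described under "Expected results" below.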



Version-Release number of selected component (if applicable):
RHOS-17

How reproducible:
100%

Steps to Reproduce:
1. Deploy a guest with hw:cpu_policy='mixed' and hw:numa_nodes='2' on a compute host where the shared and dedicated pCPU pools come from different NUMA nodes

Actual results:
Guest fails to deploy

Expected results:
Guest deploys with its shared vCPUs floating over the shared pCPU set from one NUMA node and its dedicated vCPUs pinned to dedicated pCPUs from another NUMA node

Additional info:

Comment 1 Artom Lifshitz 2022-09-12 15:21:07 UTC
The release notes automation will pick up BZs with the right flags even if they're closed. With the doc text done and requires_doc_text set to +, we can close this.