This looks valid to me.
It looks like multi-NUMA hosts and some edge cases related to hyperthreading, etc., were not tested as part of implementing mixed CPU support.
I have not tried it, but we should be able to reproduce this fairly simply in the nova functional tests.
We currently have minimal testing of mixed CPUs there, and it should be simple enough to extend that since all the infra is in place.
This will need a release note (not sure whether enhancement or bug fix) to move this out of tech preview, since we have https://bugzilla.redhat.com/show_bug.cgi?id=2120392 tracking the tech preview release note.

Description of problem:
Unable to schedule guests on SMT-based and non-SMT-based computes. Deployments are set up so that shared pCPUs come from one NUMA node and dedicated pCPUs come from a second NUMA node. Scheduling is not an issue if both dedicated and shared come from the same NUMA node:

[heat-admin@compute-0 ~]$ lscpu | grep NUMA
NUMA node(s): 2
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s): 4-7

[heat-admin@compute-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
0-3
[heat-admin@compute-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
4-5

[heat-admin@compute-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
4,5,6,7
[heat-admin@compute-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
0,1

(overcloud) [stack@undercloud-0 ~]$ openstack resource provider list
+--------------------------------------+------------------------+------------+
| uuid                                 | name                   | generation |
+--------------------------------------+------------------------+------------+
| 5f87e1c7-953d-41a4-959d-77b332121378 | compute-0.redhat.local | 39         |
| aa58a24e-4e37-411a-b14f-13f5648a0663 | compute-1.redhat.local | 73         |
+--------------------------------------+------------------------+------------+

(overcloud) [stack@undercloud-0 ~]$ openstack --os-placement-api-version 1.17 resource provider list 5f87e1c7-953d-41a4-959d-77b332121378^C
(overcloud) [stack@undercloud-0 ~]$ openstack --os-placement-api-version 1.17 resource provider inventory list 5f87e1c7-953d-41a4-959d-77b332121378
+----------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+-------+
| MEMORY_MB      | 1.0              | 1        | 31775    | 4096     | 1         | 31775 |
| DISK_GB        | 1.0              | 1        | 24       | 0        | 1         | 24    |
| PCPU           | 1.0              | 1        | 4        | 0        | 1         | 4     |
| VCPU           | 16.0             | 1        | 2        | 0        | 1         | 2     |
+----------------+------------------+----------+----------+----------+-----------+-------+

(overcloud) [stack@undercloud-0 ~]$ openstack --os-placement-api-version 1.17 resource provider inventory list aa58a24e-4e37-411a-b14f-13f5648a0663
+----------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+-------+
| MEMORY_MB      | 1.0              | 1        | 31775    | 4096     | 1         | 31775 |
| DISK_GB        | 1.0              | 1        | 24       | 0        | 1         | 24    |
| PCPU           | 1.0              | 1        | 4        | 0        | 1         | 4     |
| VCPU           | 16.0             | 1        | 2        | 0        | 1         | 2     |
+----------------+------------------+----------+----------+----------+-----------+-------+

# Non-SMT with cpu_thread_policy isolate

(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 181110155
+----------------------------+--------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                  |
+----------------------------+--------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                      |
| access_project_ids         | None                                                                                                   |
| description                | None                                                                                                   |
| disk                       | 1                                                                                                      |
| id                         | 181110155                                                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1180652834                                                           |
| os-flavor-access:is_public | True                                                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:cpu_thread_policy='isolate', hw:numa_nodes='2' |
| ram                        | 64                                                                                                     |
| rxtx_factor                | 1.0                                                                                                    |
| swap                       |                                                                                                        |
| vcpus                      | 4                                                                                                      |
+----------------------------+--------------------------------------------------------------------------------------------------------+

nova-scheduler.log:2022-07-21 17:32:37.122 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='isolate',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([0,1]),id=0,memory=15693,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([]),pinned_cpus=set([]),siblings=[set([0]),set([1])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Packing an instance onto a set of siblings: host_cell_free_siblings: [set(), set()] instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='isolate',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) host_cell_id: 0 threads_per_core: 1 num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
nova-scheduler.log:2022-07-21 17:32:37.123 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Built sibling_sets: defaultdict(<class 'list'>, {}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Requested 'isolate' thread policy for 2 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:791
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Host does not have any fully free thread sibling sets.It is not possible to emulate a non-SMT behavior for the isolate policy without this. _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:801
nova-scheduler.log:2022-07-21 17:32:37.124 13 DEBUG nova.virt.hardware [req-d6ddfd2c-9e33-4182-a4eb-a31173482a40 57a9c26a969646d2aa41bd722f607574 f035ce2e4c0a4f07b15c41687ecc7837 - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049

# Non-SMT with cpu_thread_policy not configured (prefer)

(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 802779656
/usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
  warnings.warn(
+----------------------------+------------------------------------------------------------------------+
| Field                      | Value                                                                  |
+----------------------------+------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                      |
| access_project_ids         | None                                                                   |
| description                | None                                                                   |
| disk                       | 1                                                                      |
| id                         | 802779656                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1106554476                           |
| os-flavor-access:is_public | True                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:numa_nodes='2' |
| ram                        | 64                                                                     |
| rxtx_factor                | 1.0                                                                    |
| swap                       |                                                                        |
| vcpus                      | 4                                                                      |
+----------------------------+------------------------------------------------------------------------+

/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.044 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([0,1]),id=0,memory=15693,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([]),pinned_cpus=set([]),siblings=[set([0]),set([1])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.044 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Packing an instance onto a set of siblings: host_cell_free_siblings: [set(), set()] instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) host_cell_id: 0 threads_per_core: 1 num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Built sibling_sets: defaultdict(<class 'list'>, {}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] User did not specify a thread policy. Using default for 2 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:794
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.045 16 INFO nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Computed NUMA topology CPU pinning: usable pCPUs: [], vCPUs mapping: []
/var/log/containers/nova/nova-scheduler.log:2022-07-21 17:35:10.046 16 DEBUG nova.virt.hardware [req-43a35dbc-64ce-4561-93fb-bfba86cee5b0 80e3cf12caee4274be6c0a88b0bd1398 a5d6fa4bc7dc4415b64e7d1b72c0b314 - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049

# SMT with cpu_thread_policy require

(overcloud) [stack@undercloud-0 ~]$ openstack flavor show 321056094
/usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
  warnings.warn(
+----------------------------+--------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                  |
+----------------------------+--------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                      |
| access_project_ids         | None                                                                                                   |
| description                | None                                                                                                   |
| disk                       | 3                                                                                                      |
| id                         | 321056094                                                                                              |
| name                       | tempest-MixedCPUPolicyTest-flavor-1514426045                                                           |
| os-flavor-access:is_public | True                                                                                                   |
| properties                 | hw:cpu_dedicated_mask='^0-1', hw:cpu_policy='mixed', hw:cpu_thread_policy='require', hw:numa_nodes='2' |
| ram                        | 64                                                                                                     |
| rxtx_factor                | 1.0                                                                                                    |
| swap                       |                                                                                                        |
| vcpus                      | 4                                                                                                      |
+----------------------------+--------------------------------------------------------------------------------------------------------+

nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy='require',cpu_topology=<?>,cpuset=set([0,1]),cpuset_reserved=None,id=0,memory=32,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=1,memory=16118,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([4,5,6,7]),pinned_cpus=set([]),siblings=[set([6,7]),set([4,5])],socket=None) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.virt.hardware [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] Not enough host cell CPUs to fit instance cell; required: 2, actual: 0 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1010
nova/nova-scheduler.log:2022-07-21 17:39:41.849 15 DEBUG nova.scheduler.filters.numa_topology_filter [req-58c3c935-85c0-40c6-921f-2b5272b815cf bb512fcd84cb49b0a3e2fa4218435832 451c9bf68e374924b4457241d3f1630b - default default] [instance: af56faf4-9dc2-4c64-9099-19223fcc6fa4] compute-1.redhat.local, compute-1.redhat.local fails NUMA topology requirements. The instance does not fit on this host. host_passes /usr/lib/python3.9/site-packages/nova/scheduler/filters/numa_topology_filter.py:106

Version-Release number of selected component (if applicable):
RHOS-17

How reproducible:
100%

Steps to Reproduce:
1. Deploy a guest where the pool of shared and dedicated pCPUs come from different NUMA nodes
2.
3.

Actual results:
Guest fails to deploy

Expected results:
Guest deploys with vCPUs pinned to a shared set from one NUMA node and vCPUs pinned to dedicated pCPUs from another NUMA node

Additional info:
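
To make the expected results concrete, here is a minimal sketch in plain Python that models the flavor from this report (4 vCPUs, hw:cpu_dedicated_mask='^0-1', hw:numa_nodes='2') against the compute-0 configuration (NUMA node0 0-3, node1 4-7, cpu_dedicated_set 0-3, cpu_shared_set 4-5). It is not nova's algorithm and does not use nova code. The function names, the even split of vCPUs across guest NUMA cells, and the fit rule (each dedicated vCPU needs a dedicated pCPU, shared vCPUs only need the host cell to offer at least one shared pCPU) are assumptions made for illustration, and thread policies are ignored.

```python
from itertools import permutations


def parse_cpu_spec(spec):
    """Parse a CPU list such as '0-3' or '4,5,6,7' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def guest_cells(vcpus, numa_nodes, dedicated_mask):
    """Split vCPUs evenly across guest cells (assumption) and tag each vCPU.

    Only two simple mask forms are handled: '^0-1' (all vCPUs except the
    listed ones are dedicated) and '2-3' (the listed vCPUs are dedicated).
    """
    all_vcpus = set(range(vcpus))
    if dedicated_mask.startswith("^"):
        dedicated = all_vcpus - parse_cpu_spec(dedicated_mask[1:])
    else:
        dedicated = parse_cpu_spec(dedicated_mask)
    per_cell = vcpus // numa_nodes
    cells = []
    for cell_id in range(numa_nodes):
        cpus = set(range(cell_id * per_cell, (cell_id + 1) * per_cell))
        cells.append({"id": cell_id,
                      "shared": cpus - dedicated,
                      "dedicated": cpus & dedicated})
    return cells


def host_cells(numa_layout, dedicated_spec, shared_spec):
    """Describe each host NUMA node by its shared and dedicated pCPUs."""
    dedicated = parse_cpu_spec(dedicated_spec)
    shared = parse_cpu_spec(shared_spec)
    cells = []
    for cell_id, spec in enumerate(numa_layout):
        cpus = parse_cpu_spec(spec)
        cells.append({"id": cell_id,
                      "shared": cpus & shared,
                      "dedicated": cpus & dedicated})
    return cells


def cell_fits(guest, host):
    """Hypothetical fit rule; the host is assumed to be completely free."""
    if len(guest["dedicated"]) > len(host["dedicated"]):
        return False
    if guest["shared"] and not host["shared"]:
        return False
    return True


def find_placement(guests, hosts):
    """Return a guest-cell to host-cell mapping, or None if nothing fits."""
    for candidate in permutations(hosts, len(guests)):
        if all(cell_fits(g, h) for g, h in zip(guests, candidate)):
            return {g["id"]: h["id"] for g, h in zip(guests, candidate)}
    return None


if __name__ == "__main__":
    guests = guest_cells(vcpus=4, numa_nodes=2, dedicated_mask="^0-1")
    compute_0 = host_cells(["0-3", "4-7"], dedicated_spec="0-3", shared_spec="4-5")
    print(find_placement(guests, compute_0))
```

Under these assumptions the guest cell holding the shared vCPUs 0-1 lands on host node 1 (the shared pool) and the guest cell holding the dedicated vCPUs 2-3 lands on host node 0 (the dedicated pool), so the script prints {0: 1, 1: 0}. In the scheduler logs above, the guest cell with cpuset 0-1 and an empty pcpuset is instead sent down the pinning path and rejected.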