Bug 2107306

Summary: _numa_cells_support_network_metadata does not log any output even at debug
Product: Red Hat OpenStack
Reporter: Jean-Francois Beaudoin <jbeaudoi>
Component: openstack-nova
Assignee: melanie witt <mwitt>
Status: ON_DEV
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: low
Priority: low
Version: 16.2 (Train)
CC: alifshit, dasmith, eglynn, eolivare, hasingh, jhakimra, kchamart, mwitt, sbauza, sgordon, smooney, vromanso
Target Milestone: z2
Keywords: Patch, Triaged
Target Release: 17.1
Flags: ifrangs: needinfo? (mwitt)
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Type: Bug

Description Jean-Francois Beaudoin 2022-07-14 17:36:20 UTC
Description of problem:
Instance creation fails at the NUMATopologyFilter even though at least one NUMA node appears to have enough resources.


Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.2.2 (Train)

How reproducible:
Every time we try to spawn an instance using this flavor.

Steps to Reproduce:
1. Try to create a VM using the flavor.

Actual results:
Creation gets blocked at NUMATopologyFilter.

Expected results:
The VM is created on a NUMA node with available resources.

Additional info:
[stack@director ]$ openstack flavor show ovn-dpdk
+----------------------------+------------------------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                                  |
+----------------------------+------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                                  |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                      |
| access_project_ids         | None                                                                                                                   |
| description                | None                                                                                                                   |
| disk                       | 20                                                                                                                     |
| extra_specs                | {'hw:cpu_policy': 'dedicated', 'hw:emulator_threads_policy': 'isolate', 'hw:mem_page_size': '1GB', 'ovn-dpdk': 'true'} |
| name                       | ovn-dpdk                                                                                                               |
| os-flavor-access:is_public | True                                                                                                                   |
| properties                 | hw:cpu_policy='dedicated', hw:emulator_threads_policy='isolate', hw:mem_page_size='1GB', ovn-dpdk='true'               |
| ram                        | 4096                                                                                                                   |
| rxtx_factor                | 1.0                                                                                                                    |
| swap                       | 0                                                                                                                      |
| vcpus                      | 4                                                                                                                      |
+----------------------------+------------------------------------------------------------------------------------------------------------------------+
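
For reference, a rough sketch of what this flavor asks of a single host NUMA cell. It is not Nova code: `HostCell`, `flavor_demand`, `cells_that_fit` and the free-resource numbers are made up for illustration. It assumes that `hw:emulator_threads_policy=isolate` costs one extra dedicated host CPU on top of the 4 vCPUs, that 4096 MB of RAM with `hw:mem_page_size=1GB` needs four free 1 GiB pages, and that, with no `hw:numa_nodes` set, everything has to come from the same host cell.

```python
from dataclasses import dataclass


@dataclass
class HostCell:
    """Hypothetical view of the free resources in one host NUMA cell."""
    id: int
    free_pcpus: int
    free_1g_pages: int


def flavor_demand(vcpus, ram_mb, isolate_emulator_threads):
    """Return (dedicated pCPUs, 1 GiB pages) needed from a single cell."""
    # Assumption: the 'isolate' policy reserves one additional dedicated pCPU.
    pcpus = vcpus + (1 if isolate_emulator_threads else 0)
    # Round up to whole 1 GiB pages.
    pages = -(-ram_mb // 1024)
    return pcpus, pages


def cells_that_fit(cells, pcpus, pages):
    """IDs of cells that can supply both the pCPUs and the huge pages."""
    return [c.id for c in cells
            if c.free_pcpus >= pcpus and c.free_1g_pages >= pages]


# Values from the ovn-dpdk flavor above.
need_pcpus, need_pages = flavor_demand(4, 4096, True)

# Made-up host: each cell looks roomy on one axis but is short on the other.
cells = [HostCell(0, free_pcpus=4, free_1g_pages=8),
         HostCell(1, free_pcpus=6, free_1g_pages=2)]
fitting = cells_that_fit(cells, need_pcpus, need_pages)
print(need_pcpus, need_pages, fitting)
```

In this made-up case neither cell fits, even though the host as a whole has 10 free pCPUs and 10 free pages.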

Comment 7 smooney 2022-07-18 12:04:19 UTC
updating the title to reflect that this is being used to track improving logging.

tl;dr
the original bug report was invalid because the customer did not actually have enough space to boot all the vms they wanted on the host in question.
however, while debugging this we noticed that _numa_cells_support_network_metadata does not have any logging, so when it eliminates a host cell
because the numa-aware vswitch feature is in use there is no log to indicate that. As such it makes debugging scheduling issues related to numa-aware
vswitches very difficult without intimate knowledge of the code. we can improve this trivially by adding logging at debug and/or info level
when a cell is eliminated.
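
A minimal sketch of the kind of logging meant here. This is not the Nova implementation or the eventual patch: the function name `cells_supporting_networks`, its arguments and the `physnet_cells` mapping are hypothetical stand-ins. The point is only that each eliminated cell produces a debug line naming the cell and the reason.

```python
import logging

LOG = logging.getLogger("numa_sketch")


def cells_supporting_networks(cell_ids, required_physnets, physnet_cells):
    """Return the host cells that have affinity to every required physnet.

    physnet_cells is a hypothetical mapping of physnet name to the set of
    host NUMA cell IDs that physnet is affined to.
    """
    kept = []
    for cell_id in cell_ids:
        missing = sorted(p for p in required_physnets
                         if cell_id not in physnet_cells.get(p, set()))
        if missing:
            # The line that is missing today: say which cell was dropped
            # and which network affinity ruled it out.
            LOG.debug("Host NUMA cell %s eliminated: no affinity to "
                      "physnet(s) %s", cell_id, ", ".join(missing))
            continue
        kept.append(cell_id)
    return kept


logging.basicConfig(level=logging.DEBUG)
# Cell 0 has no affinity to 'dpdk-net', so it is dropped and logged.
kept = cells_supporting_networks([0, 1], {"dpdk-net"}, {"dpdk-net": {1}})
print(kept)
```

With debug enabled, an operator sees which cell was rejected and why, without having to read the filter code.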

Comment 9 Artom Lifshitz 2022-10-04 19:24:03 UTC
I'm going to convert this to a bug to improve logging in that area of the code, target 16.x because we'll need it for customer cases.

Comment 10 Jorge San Emeterio 2022-10-10 13:26:01 UTC
Upstream bug at:
https://bugs.launchpad.net/nova/+bug/1751784

Comment 12 Artom Lifshitz 2023-06-05 18:42:11 UTC
I think aiming for 16.2.6 with this is realistic, given how small the patch is.