Bug 1822841 - Numa popup window socket number do not match the hardware or actual numa node.
Summary: Numa popup window socket number do not match the hardware or actual numa node.
Keywords:
Status: CLOSED DUPLICATE of bug 1694711
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.3.7
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.4.4
Target Release: ---
Assignee: Lucia Jelinkova
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-10 03:58 UTC by Germano Veit Michel
Modified: 2023-09-07 22:45 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-20 16:56:32 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Links
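 System                            | ID        | Private | Priority | Status | Summary | Last Updated
-----------------------------------+-----------+---------+----------+--------+---------+-------------------------
 Red Hat Issue Tracker             | RHV-36679 | 0       | None     | None   | None    | 2022-07-27 14:42:02 UTC
 Red Hat Knowledge Base (Solution) | 4980551   | 0       | None     | None   | None    | 2020-04-12 21:04:57 UTC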

Description Germano Veit Michel 2020-04-10 03:58:24 UTC
Description of problem:

The customer reported that the NUMA window used to pin VMs to host NUMA nodes sometimes loads with incorrect socket-to-NUMA-node mappings. Rebooting the host changes the order (due to refresh caps).
From what I can gather, VDSM does not return any information to the engine in getCapabilities that identifies the socket number for each NUMA node, and the engine assumes the socket number from the index used to access a list.

Expected:
Socket 1 -> Numa 0
Socket 2 -> Numa 1
Socket 3 -> Numa 2
Socket 4 -> Numa 3

Shown (see screenshot)
Socket 0 -> Numa 0
Socket 3 -> Numa 2
Socket 1 -> Numa 3
Socket 2 -> Numa 1

engine=> select numa_node_id,vds_id,numa_node_index from numa_node where vds_id = 'cb368185-df85-4380-a8d7-7f9d2c67bcab';
             numa_node_id             |                vds_id                | numa_node_index 
--------------------------------------+--------------------------------------+-----------------
 8be568e4-9486-4dd7-883a-478c43d05902 | cb368185-df85-4380-a8d7-7f9d2c67bcab |               0
 44a6d21c-a216-4edb-b88a-9a2e698a47da | cb368185-df85-4380-a8d7-7f9d2c67bcab |               1
 91ec6c6a-05fd-4eb6-b40e-4231d536c6c8 | cb368185-df85-4380-a8d7-7f9d2c67bcab |               2
 b249c691-dc7c-4703-b501-82a8b39deadf | cb368185-df85-4380-a8d7-7f9d2c67bcab |               3
(4 rows)

engine=> select numa_node_id,vds_id,numa_node_index from numa_node_cpus_view where vds_id = 'cb368185-df85-4380-a8d7-7f9d2c67bcab';
             numa_node_id             |                vds_id                | numa_node_index 
--------------------------------------+--------------------------------------+-----------------
 8be568e4-9486-4dd7-883a-478c43d05902 | cb368185-df85-4380-a8d7-7f9d2c67bcab |               0
 b249c691-dc7c-4703-b501-82a8b39deadf | cb368185-df85-4380-a8d7-7f9d2c67bcab |               3
 44a6d21c-a216-4edb-b88a-9a2e698a47da | cb368185-df85-4380-a8d7-7f9d2c67bcab |               1
 91ec6c6a-05fd-4eb6-b40e-4231d536c6c8 | cb368185-df85-4380-a8d7-7f9d2c67bcab |               2
(4 rows)

I'm not entirely sure, but the engine seems to come up with the socket numbers here, based on the integer used to index a list:
https://github.com/oVirt/ovirt-engine/blob/master/frontend/webadmin/modules/gwt-common/src/main/java/org/ovirt/engine/ui/common/presenter/popup/numa/NumaSupportPopupPresenterWidget.java#L103

That 'i' seems to end up being the socket number. 
https://github.com/oVirt/ovirt-engine/blob/60d2e5b9bbb775242a9ea01aa88d6761faa2a08b/frontend/webadmin/modules/uicommonweb/src/main/java/org/ovirt/engine/ui/uicommonweb/models/hosts/numa/NumaSupportModel.java#L146

So it seems that the socket-to-NUMA-node association depends on the order of the list.
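
To illustrate the suspected mechanism, here is a minimal Java sketch. It is not the engine's code: the class NumaLabelSketch and both of its methods are hypothetical, and the only real input is the row order of numa_node_index returned by the numa_node_cpus_view query above (0, 3, 1, 2). It assumes the socket label is simply the loop counter, as the linked sources suggest but do not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from ovirt-engine.
class NumaLabelSketch {

    // Assumed behaviour: the loop counter i is used as the socket number,
    // so the label depends only on the node's position in the list.
    static List<String> labelByPosition(List<Integer> numaNodeIndexes) {
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < numaNodeIndexes.size(); i++) {
            labels.add("Socket " + i + " -> Numa " + numaNodeIndexes.get(i));
        }
        return labels;
    }

    // One possible mitigation (an assumption, not the actual fix):
    // sort by NUMA node index before deriving the labels.
    static List<String> labelSorted(List<Integer> numaNodeIndexes) {
        List<Integer> sorted = new ArrayList<>(numaNodeIndexes);
        sorted.sort(null); // null comparator = natural ascending order
        return labelByPosition(sorted);
    }

    public static void main(String[] args) {
        // Row order as returned by numa_node_cpus_view in this report.
        List<Integer> viewOrder = List.of(0, 3, 1, 2);

        // Prints Socket 0 -> Numa 0, Socket 1 -> Numa 3,
        //        Socket 2 -> Numa 1, Socket 3 -> Numa 2
        labelByPosition(viewOrder).forEach(System.out::println);

        // Prints Socket 0 -> Numa 0 through Socket 3 -> Numa 3
        labelSorted(viewOrder).forEach(System.out::println);
    }
}
```

With the unordered input, the sketch produces the same socket/NUMA pairs as the "Shown" list in this report. That is consistent with the theory that list order alone determines the labels, though it does not prove that this is what the engine does.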

Version-Release number of selected component (if applicable):
rhvm-4.3.7.2-0.1.el7.noarch

How reproducible:
Unclear; it apparently happens when the stored procedure returns the NUMA nodes unordered.

Comment 2 Arik 2020-10-20 16:56:32 UTC

*** This bug has been marked as a duplicate of bug 1694711 ***