Bug 1192360
| Summary: | [ppc] virsh nodeinfo show the wrong cpu cores and numa cells | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Luyao Huang <lhuang> |
| Component: | libvirt | Assignee: | Andrea Bolognani <abologna> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.1 | CC: | dgibson, dyuan, michen, mzhan, ngu, pkrempa, rbalakri, weizhan, xuhan, ypu |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | ppc64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-04-28 08:26:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Luyao Huang
2015-02-13 09:05:14 UTC
What you're seeing, while confusing, is actually the expected and documented
behavior.
From <libvirt/libvirt-host.h>:
struct _virNodeInfo {
    /* [...] */
    unsigned int nodes;   /* the number of NUMA cell, 1 for unusual NUMA
                             topologies or uniform memory access; check
                             capabilities XML for the actual NUMA topology */
    unsigned int sockets; /* number of CPU sockets per node if nodes > 1,
                             1 in case of unusual NUMA topology */
    unsigned int cores;   /* number of cores per socket, total number of
                             processors in case of unusual NUMA topology */
    unsigned int threads; /* number of threads per core, 1 in case of
                             unusual numa topology */
};
Here "unusual NUMA topology" really means any situation where the Linux kernel
is not exposing enough information for libvirt to figure out the complete NUMA
topology, which happens not only on PPC64 but also on other architectures
where the offlining of CPUs is supported.
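For illustration, here is a minimal C sketch of how an application reads these
fields. This is not from the bug report; it assumes the libvirt development
headers are installed and a local hypervisor connection is available (build
with something like: gcc nodeinfo.c -lvirt).

#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpenReadOnly(NULL); /* default URI */
    virNodeInfo info;

    if (conn == NULL)
        return 1;

    if (virNodeGetInfo(conn, &info) == 0) {
        printf("CPU(s):             %u\n", info.cpus);
        printf("CPU socket(s):      %u\n", info.sockets);
        printf("Core(s) per socket: %u\n", info.cores);
        printf("Thread(s) per core: %u\n", info.threads);
        printf("NUMA cell(s):       %u\n", info.nodes);
    }

    virConnectClose(conn);
    return 0;
}

Per the header comments above, whenever info.nodes comes back as 1 the other
fields may be placeholders rather than the real topology.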
Here's the situation reported by my laptop (dual-core Intel processor with
Hyperthreading support) when all CPUs are online:
[abologna@pandorica ~]$ virsh nodeinfo
CPU(s): 4
CPU socket(s): 1
Core(s) per socket: 2
Thread(s) per core: 2
NUMA cell(s): 1
[abologna@pandorica ~]$ lscpu
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Both virsh and lscpu report the correct topology information. I have edited
the output to remove information not relevant to the issue at hand.
If I take one thread per core offline, so as to reflect the configuration of
the PowerPC machine, this is what I get (the sysfs offlining step itself is
sketched after the output below):
[abologna@pandorica ~]$ virsh nodeinfo
CPU(s): 2
CPU socket(s): 1
Core(s) per socket: 4
Thread(s) per core: 1
NUMA cell(s): 1
[abologna@pandorica ~]$ lscpu
CPU(s): 4
On-line CPU(s) list: 0,2
Off-line CPU(s) list: 1,3
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
As you can see, both commands are now reporting incorrect topology
information: they're just lying in different ways :)
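For reference, the offlining goes through sysfs. A minimal sketch, assuming
root privileges and picking cpu3 (one of the CPUs offlined in the lscpu
output above); it is equivalent to echo 0 > /sys/devices/system/cpu/cpu3/online:

#include <stdio.h>

int main(void)
{
    /* Write "0" to take the CPU offline, "1" to bring it back online. */
    FILE *f = fopen("/sys/devices/system/cpu/cpu3/online", "w");

    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fputs("0\n", f);
    fclose(f);
    return 0;
}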
Closing the bug.
Given that numactl is able to get the right node information, I don't think this can really be CANTFIX, at least for the number of NUMA nodes. The NUMA information really shouldn't be dependent on whether CPUs are all active: AIUI the NUMA topology is tied to the sockets/cores/threads hierarchy on x86, but that's not the case on power.

David recommended adding the following in-depth information to the bug report
and closing it again as NOTABUG.
---
libvirt obtains the data it stores in a virNodeInfo object (the same data
that is eventually displayed to the user when virsh nodeinfo is called)
by looking at the contents of /sys/devices/system/node.
The topology information comes from the files in
/sys/devices/system/node/node*/cpu*/topology, but that data is not
available when a CPU is offline, which means that when SMT is off it
obtains the following information about the system:
nodes: 4
sockets: 1
cores: 5
threads: 1
cpus: 20
which is a decent approximation of the actual topology. However, near
the end of the linuxNodeInfoCPUPopulate() function, which is called by
nodeGetInfo(), we have the following code:
    /* Now check if the topology makes sense. There are machines that
     * don't expose their real number of nodes or for example the AMD
     * Bulldozer architecture that exposes their Clustered integer core
     * modules as both threads and cores. This approach throws off our
     * detection. Unfortunately the nodeinfo structure isn't designed to
     * carry the full topology so we're going to lie about the detected
     * topology to notify the user to check the host capabilities for
     * the actual topology. */
    if ((nodeinfo->nodes *
         nodeinfo->sockets *
         nodeinfo->cores *
         nodeinfo->threads) != (nodeinfo->cpus + offline)) {
        nodeinfo->nodes = 1;
        nodeinfo->sockets = 1;
        nodeinfo->cores = nodeinfo->cpus + offline;
        nodeinfo->threads = 1;
    }
In our case:
nodes * sockets * cores * threads == cpus + offline
4 * 1 * 5 * 1 == 20 + 140
that is, 20 on the left against 160 on the right, which obviously doesn't
add up, which in turn means the virNodeInfo object actually ends up looking
like this:
nodes: 1
sockets: 1
cores: 160
threads: 1
cpus: 20
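To make the arithmetic concrete, here is a standalone sketch (not libvirt's
actual code, just the quoted check re-run with the values detected on this
machine) showing why the fallback fires:

#include <stdio.h>

int main(void)
{
    /* Values detected with SMT off, as listed above. */
    unsigned int nodes = 4, sockets = 1, cores = 5, threads = 1;
    unsigned int cpus = 20, offline = 140;

    if (nodes * sockets * cores * threads != cpus + offline) {
        /* 20 != 160, so the topology gets flattened. */
        nodes = 1;
        sockets = 1;
        cores = cpus + offline;   /* 160 */
        threads = 1;
    }

    printf("nodes=%u sockets=%u cores=%u threads=%u cpus=%u\n",
           nodes, sockets, cores, threads, cpus);
    return 0;
}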
I talked to Jirka and he confirmed that this is expected and that
well-behaved, non-legacy applications are supposed to disregard the
information stored in virNodeInfo and look up the detailed topology
described in the capabilities XML whenever nodeinfo->nodes == 1.
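A minimal sketch of that recommended pattern, assuming a read-only connection
(the XML parsing a real client would do is left out):

#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpenReadOnly(NULL);
    virNodeInfo info;

    if (conn == NULL)
        return 1;

    if (virNodeGetInfo(conn, &info) < 0) {
        virConnectClose(conn);
        return 1;
    }

    if (info.nodes > 1) {
        printf("trusting virNodeInfo: %u NUMA cell(s)\n", info.nodes);
    } else {
        /* nodes == 1 may mean "unusual topology": consult the
         * <topology> data in the capabilities XML instead. */
        char *caps = virConnectGetCapabilities(conn);
        if (caps != NULL) {
            puts(caps);
            free(caps);
        }
    }

    virConnectClose(conn);
    return 0;
}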