Bug 1192360 - [ppc] virsh nodeinfo shows the wrong CPU cores and NUMA cells
Summary: [ppc] virsh nodeinfo shows the wrong CPU cores and NUMA cells
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: ppc64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Andrea Bolognani
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-02-13 09:05 UTC by Luyao Huang
Modified: 2015-05-22 13:25 UTC
CC: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-28 08:26:05 UTC
Target Upstream Version:
Embargoed:




Links
System ID: Red Hat Bugzilla 1181465
Status: CLOSED
Priority: medium
Summary: too little cpu information when use virsh capabilities in ppc64
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1181465

Description Luyao Huang 2015-02-13 09:05:14 UTC
Description of problem:
virsh nodeinfo shows the wrong CPU cores and NUMA cells

Version-Release number of selected component (if applicable):
For PowerKVM:
libvirt-1.2.5-1.1.pkvm2_1_1.20.33.ppc64
For RHEL7:
libvirt-1.2.8-16.el7.ppc64

How reproducible:
100%

Steps to Reproduce:
1. # numactl --hard
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32
node 0 size: 65536 MB
node 0 free: 61255 MB
node 1 cpus: 40 48 56 64 72
node 1 size: 65536 MB
node 1 free: 62835 MB
node 16 cpus: 80 88 96 104 112
node 16 size: 65536 MB
node 16 free: 63189 MB
node 17 cpus: 120 128 136 144 152
node 17 size: 65536 MB
node 17 free: 62661 MB
node distances:
node   0   1  16  17 
  0:  10  20  40  40 
  1:  20  10  40  40 
 16:  40  40  10  20 
 17:  40  40  20  10 

2. # ppc64_cpu --info
Core   0:    0*    1     2     3     4     5     6     7  
Core   1:    8*    9    10    11    12    13    14    15  
Core   2:   16*   17    18    19    20    21    22    23  
Core   3:   24*   25    26    27    28    29    30    31  
Core   4:   32*   33    34    35    36    37    38    39  
Core   5:   40*   41    42    43    44    45    46    47  
Core   6:   48*   49    50    51    52    53    54    55  
Core   7:   56*   57    58    59    60    61    62    63  
Core   8:   64*   65    66    67    68    69    70    71  
Core   9:   72*   73    74    75    76    77    78    79  
Core  10:   80*   81    82    83    84    85    86    87  
Core  11:   88*   89    90    91    92    93    94    95  
Core  12:   96*   97    98    99   100   101   102   103  
Core  13:  104*  105   106   107   108   109   110   111  
Core  14:  112*  113   114   115   116   117   118   119  
Core  15:  120*  121   122   123   124   125   126   127  
Core  16:  128*  129   130   131   132   133   134   135  
Core  17:  136*  137   138   139   140   141   142   143  
Core  18:  144*  145   146   147   148   149   150   151  
Core  19:  152*  153   154   155   156   157   158   159 

3. # virsh nodeinfo
CPU model:           ppc64
CPU(s):              20
CPU frequency:       2061 MHz
CPU socket(s):       1
Core(s) per socket:  160
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         267652032 KiB


Actual results:

libvirt shows the wrong information here:
CPU socket(s):       1
Core(s) per socket:  160
Thread(s) per core:  1
NUMA cell(s):        1

Expected results:
virsh nodeinfo should report information that matches the actual topology
(4 NUMA cells and 20 cores, as reported by numactl and ppc64_cpu above)

Additional info:

Comment 1 Andrea Bolognani 2015-04-24 13:10:39 UTC
What you're seeing, while confusing, is actually the expected and documented
behavior.

From <libvirt/libvirt-host.h>:

  struct _virNodeInfo {
    /* [...] */
    unsigned int nodes;   /* the number of NUMA cell, 1 for unusual NUMA
                             topologies or uniform memory access; check
                             capabilities XML for the actual NUMA topology */
    unsigned int sockets; /* number of CPU sockets per node if nodes > 1,
                             1 in case of unusual NUMA topology */
    unsigned int cores;   /* number of cores per socket, total number of
                             processors in case of unusual NUMA topology*/
    unsigned int threads; /* number of threads per core, 1 in case of
                             unusual numa topology */
  };

Here "unusual NUMA topology" really means any situation where the Linux kernel
is not exposing enough information for libvirt to figure out the complete NUMA
topology, which happens not only on PPC64 but also on other architectures
where the offlining of CPUs is supported.
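
As an illustration (a minimal sketch, not part of the original report), the
fallback can be observed through the public API: virNodeGetInfo() fills the
virNodeInfo struct quoted above, and nodes == 1 is the documented signal to
distrust the remaining fields. The sketch assumes a local hypervisor
reachable via the default URI:

  /* compile with: gcc nodeinfo.c $(pkg-config --cflags --libs libvirt) */
  #include <stdio.h>
  #include <libvirt/libvirt.h>

  int main(void)
  {
      /* NULL selects the default hypervisor URI */
      virConnectPtr conn = virConnectOpenReadOnly(NULL);
      virNodeInfo info;

      if (!conn || virNodeGetInfo(conn, &info) < 0) {
          fprintf(stderr, "failed to query node info\n");
          return 1;
      }

      printf("nodes=%u sockets=%u cores=%u threads=%u cpus=%u\n",
             info.nodes, info.sockets, info.cores, info.threads,
             info.cpus);

      /* nodes == 1 is the documented hint that the other topology
       * fields may be the fallback values described above */
      if (info.nodes == 1)
          printf("possible fallback topology; check capabilities XML\n");

      virConnectClose(conn);
      return 0;
  }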

Here's the situation reported by my laptop (dual-core Intel processor with
Hyperthreading support) when all CPUs are online:

  [abologna@pandorica ~]$ virsh nodeinfo
  CPU(s):              4
  CPU socket(s):       1
  Core(s) per socket:  2
  Thread(s) per core:  2
  NUMA cell(s):        1

  [abologna@pandorica ~]$ lscpu
  CPU(s):                4
  On-line CPU(s) list:   0-3
  Thread(s) per core:    2
  Core(s) per socket:    2
  Socket(s):             1
  NUMA node(s):          1

Both virsh and lscpu report the correct topology information. I have edited
the output to remove information not relevant to the issue at hand.

If I take one thread per core offline, so as to reflect the configuration of
the PowerPC machine, this is what I get:

  [abologna@pandorica ~]$ virsh nodeinfo
  CPU(s):              2
  CPU socket(s):       1
  Core(s) per socket:  4
  Thread(s) per core:  1
  NUMA cell(s):        1

  [abologna@pandorica ~]$ lscpu
  CPU(s):                4
  On-line CPU(s) list:   0,2
  Off-line CPU(s) list:  1,3
  Thread(s) per core:    1
  Core(s) per socket:    2
  Socket(s):             1
  NUMA node(s):          1

As you can see, both commands are now reporting incorrect topology
information: they're just lying in different ways :)

Closing the bug.

Comment 2 David Gibson 2015-04-27 02:09:15 UTC
Given that numactl is able to get the right node information, I don't think this can really be CANTFIX, at least for the # of NUMA nodes.

The NUMA information really shouldn't depend on whether all CPUs are active. AIUI, NUMA topology is tied to the sockets/cores/threads hierarchy on x86, but that's not the case on Power.

Comment 3 Andrea Bolognani 2015-04-28 08:26:05 UTC
David recommended adding the following in-depth information to the bug
report and closing it again as NOTABUG.

---

libvirt obtains the data stored in a virNodeInfo object, the same data
that is eventually displayed to the user when virsh nodeinfo is called,
by looking at the contents of /sys/devices/system/node.
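
For illustration (a hypothetical sketch, not from the original comment),
counting the NUMA nodes the way numactl and libvirt effectively do comes
down to scanning that directory for node<N> entries:

    #include <stdio.h>
    #include <dirent.h>

    int main(void)
    {
        DIR *dir = opendir("/sys/devices/system/node");
        struct dirent *ent;
        int nodes = 0;

        if (!dir) {
            perror("/sys/devices/system/node");
            return 1;
        }

        while ((ent = readdir(dir)) != NULL) {
            unsigned int num;
            /* count entries named node<N>, e.g. node0, node16 */
            if (sscanf(ent->d_name, "node%u", &num) == 1)
                nodes++;
        }
        closedir(dir);

        printf("NUMA nodes visible in sysfs: %d\n", nodes);
        return 0;
    }

On the machine from the report this yields 4 (node0, node1, node16 and
node17), matching numactl; the problem is not the node count but the
per-CPU topology data below each node, as described next.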

The topology information comes from the files in
/sys/devices/system/node/node*/cpu*/topology, but that data is not
available when a CPU is offline, which means that when SMT is off it
obtains the following information about the system:

    nodes:   4
    sockets: 1
    cores:   5
    threads: 1
    cpus:    20

which is a decent approximation of the actual topology. However, near
the end of the linuxNodeInfoCPUPopulate() function, which is called by
nodeGetInfo(), we have the following code:

    /* Now check if the topology makes sense. There are machines that
     * don't expose their real number of nodes or for example the AMD
     * Bulldozer architecture that exposes their Clustered integer core
     * modules as both threads and cores. This approach throws off our
     * detection. Unfortunately the nodeinfo structure isn't designed to
     * carry the full topology so we're going to lie about the detected
     * topology to notify the user to check the host capabilities for
     * the actual topology. */
    if ((nodeinfo->nodes *
         nodeinfo->sockets *
         nodeinfo->cores *
         nodeinfo->threads) != (nodeinfo->cpus + offline)) {
        nodeinfo->nodes = 1;
        nodeinfo->sockets = 1;
        nodeinfo->cores = nodeinfo->cpus + offline;
        nodeinfo->threads = 1;
    }

In our case:

    nodes * sockets * cores * threads  vs.  cpus + offline
    4     * 1       * 5     * 1  = 20  vs.  20 + 140 = 160

which obviously doesn't add up (20 != 160), which in turn means the
virNodeInfo object actually ends up looking like this:

    nodes:   1
    sockets: 1
    cores:   160
    threads: 1
    cpus:    20

I talked to Jirka and he confirmed that this is expected and that
well-behaved, non-legacy applications are supposed to disregard the
information stored in virNodeInfo and look up the detailed topology
described in the capabilities XML whenever nodeinfo->nodes == 1.
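
A minimal sketch of that recommended lookup (illustrative; it assumes the
same default local connection as the sketch above): fetch the capabilities
XML with virConnectGetCapabilities() and read the host topology from the
<host><topology><cells> element instead of the virNodeInfo fields:

    #include <stdio.h>
    #include <stdlib.h>
    #include <libvirt/libvirt.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpenReadOnly(NULL);
        char *caps;

        if (!conn) {
            fprintf(stderr, "failed to connect\n");
            return 1;
        }

        /* same document that virsh capabilities prints; the per-node
         * CPU layout lives under <host><topology><cells> */
        caps = virConnectGetCapabilities(conn);
        if (caps) {
            printf("%s\n", caps);
            free(caps);  /* the caller owns the returned string */
        }

        virConnectClose(conn);
        return 0;
    }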

