Bug 1274567

Summary: HMP doesn't reflect the correct numa topology after hot plugging vCPU
Product: Red Hat Enterprise Linux 7 Reporter: Chao Yang <chayang>
Component: qemu-kvm-rhevAssignee: Eduardo Habkost <ehabkost>
Status: CLOSED ERRATA QA Contact: Guo, Zhiyi <zhguo>
Severity: low Docs Contact:
Priority: low    
Version: 7.2CC: ailan, dgilbert, ehabkost, huding, imammedo, jinzhao, juzhang, mrezanin, virt-maint, xfu, yuhuang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.9.0-8.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 23:29:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Chao Yang 2015-10-23 02:02:10 UTC
Description of problem:
Start a qemu-kvm instance with:
1. -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 
2. -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 
   -numa node,nodeid=1,cpus=2-3,memdev=ram-node1


Check numa topology:

(qemu) info numa 
2 nodes
node 0 cpus: 0 1
node 0 size: 256 MB
node 1 cpus:
node 1 size: 256 MB


then after guest boots up, hot plug two vCPUs to guest and re-check numa
 
(qemu) info numa 
2 nodes
node 0 cpus: 0 1 2 3 <------ vcpu 2 and vcpu 3 are showed in node 0
node 0 size: 256 MB
node 1 cpus:         <------ they should be showed here
node 1 size: 256 MB


In guest the topology of numa is correct even after hot adding vCPUs

This bug is not a regression it is still reproducible on RHEL 7.1.z qemu-kvm-rhev

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

slice of qemu-kvm cli:

-m 512 -realtime mlock=off 
-smp 2,maxcpus=4,sockets=4,cores=1,threads=1 

-object memory-backend-file,prealloc=yes,mem-path=/mnt/hugepage,size=256M,id=ram-node0,host-nodes=1,policy=bind 
-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 

-object memory-backend-file,prealloc=yes,mem-path=/mnt/hugepage,size=256M,id=ram-node1,host-nodes=1,policy=bind 
-numa node,nodeid=1,cpus=2-3,memdev=ram-node1

Comment 4 Eduardo Habkost 2017-01-24 17:23:14 UTC
Upstream fix to be in QEMU 2.9:

commit 9c6703fe82b29909cf2cf35b192892327841f71a
Author: Dou Liyang <douly.fnst.com>
Date:   Tue Jan 17 22:42:31 2017 +0800

    vl: Ensure the numa_post_machine_init func in the appropriate location
    
    In the numa_post_machine_init(), we use CPU_FOREACH macro to set all
    CPUs' namu_node. So, we should make sure that we call it after Qemu
    has already initialied all the CPUs.
    
    As we all know, the CPUs can be created by "-smp"(pc_new_cpu) or
    "-device"(qdev_device_add) command. But, before the device init,
    Qemu execute the numa_post_machine_init earlier. It makes the mapping
    of NUMA nodes and CPUs incorrect.
    
    The patch move the numa_post_machine_init func in the appropriate
    location.
    
    Signed-off-by: Dou Liyang <douly.fnst.com>
    Message-Id: <1484664152-24446-2-git-send-email-douly.fnst.com>
    Reviewed-by: Eduardo Habkost <ehabkost>
    Signed-off-by: Eduardo Habkost <ehabkost>

Comment 6 Yumei Huang 2017-05-04 09:35:40 UTC
Still hit the issue with qemu-kvm-rhev-2.9.0-2.el7.

Details:
qemu-kvm-rhev-2.9.0-2.el7
kernel-3.10.0-661.el7.x86_64


# /usr/libexec/qemu-kvm -m 4G,maxmem=20G,slots=30  \

-cpu Haswell-noTSX -smp 4,maxcpus=8 \

-numa node,mem=2G,cpus=0-3 \

-numa node,mem=2G,cpus=4-7 \

-monitor stdio  -vnc :0 -qmp tcp:0:4444,server,nowait rhel74-1.qcow2


After hotplug 2 vcpus, guest numa topology is correct, but 'info numa' in hmp shows different:

(qemu) info numa
2 nodes
node 0 cpus: 0 1 2 3 5 4
node 0 size: 2048 MB
node 1 cpus:
node 1 size: 2048 MB

Comment 7 jingzhao 2017-05-08 06:37:04 UTC
Changed to assign status according to comment6

Comment 8 Eduardo Habkost 2017-05-08 15:13:09 UTC
This one-line patch should fix it:

diff --git a/monitor.c b/monitor.c
index d4fbc07c5e..11eb48b230 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1704,7 +1704,7 @@ static void hmp_info_numa(Monitor *mon, const QDict *qdict)
     for (i = 0; i < nb_numa_nodes; i++) {
         monitor_printf(mon, "node %d cpus:", i);
         CPU_FOREACH(cpu) {
-            if (cpu->numa_node == i) {
+            if (numa_get_node_for_cpu(cpu->cpu_index) == i) {
                 monitor_printf(mon, " %d", cpu->cpu_index);
             }
         }


But the existing "-numa cpu" series from Igor would also address that. Igor, do you think we are going to backport your series once it's accepted upstream? In this case, we don't need to do anything and just wait for Igor's series.

If we don't backport Igor's series, then we could apply the above one-line fix downstream-only, because the code will be very different upstream.

Comment 10 Igor Mammedov 2017-05-09 12:18:01 UTC
(In reply to Eduardo Habkost from comment #8)
> This one-line patch should fix it:
> 
> diff --git a/monitor.c b/monitor.c
> index d4fbc07c5e..11eb48b230 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -1704,7 +1704,7 @@ static void hmp_info_numa(Monitor *mon, const QDict
> *qdict)
>      for (i = 0; i < nb_numa_nodes; i++) {
>          monitor_printf(mon, "node %d cpus:", i);
>          CPU_FOREACH(cpu) {
> -            if (cpu->numa_node == i) {
> +            if (numa_get_node_for_cpu(cpu->cpu_index) == i) {
>                  monitor_printf(mon, " %d", cpu->cpu_index);
>              }
>          }
> 
> 
> But the existing "-numa cpu" series from Igor would also address that. Igor,
> do you think we are going to backport your series once it's accepted
> upstream? In this case, we don't need to do anything and just wait for
> Igor's series.
> 
> If we don't backport Igor's series, then we could apply the above one-line
> fix downstream-only, because the code will be very different upstream.

It's too late for backport, 
current plan its that, it will inherited with rebase in 7.5

Comment 12 Miroslav Rezanina 2017-06-06 08:53:39 UTC
Fix included in qemu-kvm-rhev-2.9.0-8.el7

Comment 16 errata-xmlrpc 2017-08-01 23:29:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 17 errata-xmlrpc 2017-08-02 01:07:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 18 errata-xmlrpc 2017-08-02 01:59:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 19 errata-xmlrpc 2017-08-02 02:40:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 20 errata-xmlrpc 2017-08-02 03:04:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392

Comment 21 errata-xmlrpc 2017-08-02 03:24:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392