Hide Forgot
Description of problem: Start a qemu-kvm instance with: 1. -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 2. -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 Check numa topology: (qemu) info numa 2 nodes node 0 cpus: 0 1 node 0 size: 256 MB node 1 cpus: node 1 size: 256 MB then after guest boots up, hot plug two vCPUs to guest and re-check numa (qemu) info numa 2 nodes node 0 cpus: 0 1 2 3 <------ vcpu 2 and vcpu 3 are showed in node 0 node 0 size: 256 MB node 1 cpus: <------ they should be showed here node 1 size: 256 MB In guest the topology of numa is correct even after hot adding vCPUs This bug is not a regression it is still reproducible on RHEL 7.1.z qemu-kvm-rhev Version-Release number of selected component (if applicable): How reproducible: 100% Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: slice of qemu-kvm cli: -m 512 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -object memory-backend-file,prealloc=yes,mem-path=/mnt/hugepage,size=256M,id=ram-node0,host-nodes=1,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,prealloc=yes,mem-path=/mnt/hugepage,size=256M,id=ram-node1,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1
Upstream fix to be in QEMU 2.9: commit 9c6703fe82b29909cf2cf35b192892327841f71a Author: Dou Liyang <douly.fnst@cn.fujitsu.com> Date: Tue Jan 17 22:42:31 2017 +0800 vl: Ensure the numa_post_machine_init func in the appropriate location In the numa_post_machine_init(), we use CPU_FOREACH macro to set all CPUs' namu_node. So, we should make sure that we call it after Qemu has already initialied all the CPUs. As we all know, the CPUs can be created by "-smp"(pc_new_cpu) or "-device"(qdev_device_add) command. But, before the device init, Qemu execute the numa_post_machine_init earlier. It makes the mapping of NUMA nodes and CPUs incorrect. The patch move the numa_post_machine_init func in the appropriate location. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Message-Id: <1484664152-24446-2-git-send-email-douly.fnst@cn.fujitsu.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Still hit the issue with qemu-kvm-rhev-2.9.0-2.el7. Details: qemu-kvm-rhev-2.9.0-2.el7 kernel-3.10.0-661.el7.x86_64 # /usr/libexec/qemu-kvm -m 4G,maxmem=20G,slots=30 \ -cpu Haswell-noTSX -smp 4,maxcpus=8 \ -numa node,mem=2G,cpus=0-3 \ -numa node,mem=2G,cpus=4-7 \ -monitor stdio -vnc :0 -qmp tcp:0:4444,server,nowait rhel74-1.qcow2 After hotplug 2 vcpus, guest numa topology is correct, but 'info numa' in hmp shows different: (qemu) info numa 2 nodes node 0 cpus: 0 1 2 3 5 4 node 0 size: 2048 MB node 1 cpus: node 1 size: 2048 MB
Changed to assign status according to comment6
This one-line patch should fix it: diff --git a/monitor.c b/monitor.c index d4fbc07c5e..11eb48b230 100644 --- a/monitor.c +++ b/monitor.c @@ -1704,7 +1704,7 @@ static void hmp_info_numa(Monitor *mon, const QDict *qdict) for (i = 0; i < nb_numa_nodes; i++) { monitor_printf(mon, "node %d cpus:", i); CPU_FOREACH(cpu) { - if (cpu->numa_node == i) { + if (numa_get_node_for_cpu(cpu->cpu_index) == i) { monitor_printf(mon, " %d", cpu->cpu_index); } } But the existing "-numa cpu" series from Igor would also address that. Igor, do you think we are going to backport your series once it's accepted upstream? In this case, we don't need to do anything and just wait for Igor's series. If we don't backport Igor's series, then we could apply the above one-line fix downstream-only, because the code will be very different upstream.
(In reply to Eduardo Habkost from comment #8) > This one-line patch should fix it: > > diff --git a/monitor.c b/monitor.c > index d4fbc07c5e..11eb48b230 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -1704,7 +1704,7 @@ static void hmp_info_numa(Monitor *mon, const QDict > *qdict) > for (i = 0; i < nb_numa_nodes; i++) { > monitor_printf(mon, "node %d cpus:", i); > CPU_FOREACH(cpu) { > - if (cpu->numa_node == i) { > + if (numa_get_node_for_cpu(cpu->cpu_index) == i) { > monitor_printf(mon, " %d", cpu->cpu_index); > } > } > > > But the existing "-numa cpu" series from Igor would also address that. Igor, > do you think we are going to backport your series once it's accepted > upstream? In this case, we don't need to do anything and just wait for > Igor's series. > > If we don't backport Igor's series, then we could apply the above one-line > fix downstream-only, because the code will be very different upstream. It's too late for backport, current plan its that, it will inherited with rebase in 7.5
Fix included in qemu-kvm-rhev-2.9.0-8.el7
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:2392