Bug 1188200 - hotplugged vcpu is not consistent with guest NUMA topology
Summary: hotplugged vcpu is not consistent with guest NUMA topology
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Igor Mammedov
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1162080
Blocks: 1188205
 
Reported: 2015-02-02 10:28 UTC by Jincheng Miao
Modified: 2015-12-04 16:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1188205 (view as bug list)
Environment:
Last Closed: 2015-12-04 16:26:39 UTC
Target Upstream Version:


Attachments
SRAT table (29.42 KB, text/x-csrc)
2015-02-02 10:28 UTC, Jincheng Miao
srat dump 1 (6.82 KB, text/x-csrc)
2015-09-15 01:53 UTC, Luyao Huang


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2546 normal SHIPPED_LIVE qemu-kvm-rhev bug fix and enhancement update 2015-12-04 21:11:56 UTC

Description Jincheng Miao 2015-02-02 10:28:27 UTC
Created attachment 987053 [details]
SRAT table

Description of problem:
Prepare a guest that has a NUMA topology, and set the current vcpu count lower than maxcpus.
Hotplugged vcpus are not placed consistently with the guest NUMA topology that the user specified on the cmdline.

version:
libvirt-1.2.8-15.el7.x86_64
qemu-kvm-rhev-2.1.2-21.el7.x86_64
3.10.0-223.el7.x86_64

How reproducible:
100%

Steps to reproduce:
1. setup guest NUMA topology: 2 nodes, each node has 2 vcpus
# virsh edit rhel7
...
  <vcpu placement='auto'>4</vcpu>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1048576'/>
      <cell id='1' cpus='2-3' memory='1048576'/>
    </numa>
  </cpu>

the qemu cmdline is:
/usr/libexec/qemu-kvm -name rhel7 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages1G/libvirt/qemu,size=1024M,id=ram-node0,host-nodes=1,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages2M/libvirt/qemu,size=1024M,id=ram-node1,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid 1edfafc5-a55a-4396-9595-46e590bfc79a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/jmiao/r71.img,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:ab:7e:68,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on


2. check in guest
<guest> # numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 1023 MB
node 0 free: 653 MB
node 1 cpus:
node 1 size: 1023 MB
node 1 free: 995 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

3. hotplug vcpu2 and vcpu3
# virsh setvcpus rhel7 4

the QMP commands libvirtd used are:
 53.925 > 0x7fa8202381e0 {"execute":"cpu-add","arguments":{"id":2},"id":"libvirt-10"}
 53.943 < 0x7fa8202381e0 {"return": {}, "id": "libvirt-10"}
 53.944 > 0x7fa8202381e0 {"execute":"cpu-add","arguments":{"id":3},"id":"libvirt-11"}
 53.961 < 0x7fa8202381e0 {"return": {}, "id": "libvirt-11"}
 53.962 > 0x7fa8202381e0 {"execute":"query-cpus","id":"libvirt-12"}
 53.971 < 0x7fa8202381e0 {"return": [{"current": true, "CPU": 0, "pc": -2130073539, "halted": false, "thread_id": 23454}, {"current": false, "CPU": 1, "pc": -2127686014, "halted": false, "thread_id": 23456}, {"current": false, "CPU": 2, "pc": 4294967280, "halted": false, "thread_id": 23491}, {"current": false, "CPU": 3, "pc": 4294967280, "halted": false, "thread_id": 23492}], "id": "libvirt-12"}

4. checking in guest
<guest> # numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2
node 0 size: 1023 MB
node 0 free: 644 MB
node 1 cpus: 3
node 1 size: 1023 MB
node 1 free: 993 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

As we can see, vcpu2 is not located in guest NUMA node1.
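The manual comparison in steps 1 and 4 can be automated. The following is a minimal sketch, not part of the original report: it parses the "node N cpus:" lines of the numactl output and reports every vcpu that sits on a different node than the one requested. The helper names parse_numactl_cpus and misplaced_cpus are made up for illustration.

```python
import re

def parse_numactl_cpus(output):
    """Return {node_id: [cpu, ...]} from `numactl --hardware` text."""
    nodes = {}
    for line in output.splitlines():
        m = re.match(r"node (\d+) cpus:(.*)$", line.strip())
        if m:
            nodes[int(m.group(1))] = [int(c) for c in m.group(2).split()]
    return nodes

def misplaced_cpus(expected, observed):
    """Return {cpu: (expected_node, observed_node)} for CPUs on the wrong node."""
    where = {cpu: node for node, cpus in observed.items() for cpu in cpus}
    wrong = {}
    for node, cpus in expected.items():
        for cpu in cpus:
            if where.get(cpu) != node:
                wrong[cpu] = (node, where.get(cpu))
    return wrong

# Topology requested on the qemu command line in step 1.
EXPECTED = {0: [0, 1], 1: [2, 3]}

# CPU and size lines copied from the guest output in step 4.
STEP4_OUTPUT = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2
node 0 size: 1023 MB
node 1 cpus: 3
node 1 size: 1023 MB
"""

if __name__ == "__main__":
    print(misplaced_cpus(EXPECTED, parse_numactl_cpus(STEP4_OUTPUT)))
```

With the step 4 output, the only entry reported is vcpu 2, expected on node 1 but observed on node 0.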


Additional info:
The SRAT table is attached, and is generated by:
[root@localhost ~]# acpidump > acpidump.bin

[root@localhost ~]# acpixtract ./acpidump.bin 

Intel ACPI Component Architecture
ACPI Binary Table Extraction Utility version 20140926-64 [Sep 29 2014]
Copyright (c) 2000 - 2014 Intel Corporation

Acpi table [DSDT] - 2807 bytes written to dsdt.dat
Acpi table [SSDT] - 3356 bytes written to ssdt.dat

[root@localhost ~]# iasl -e ssdt.dat -d dsdt.dat 

Intel ACPI Component Architecture
ASL Optimizing Compiler version 20140926-64 [Sep 29 2014]
Copyright (c) 2000 - 2014 Intel Corporation

Loading Acpi table from file   dsdt.dat - Length 00002807 (000AF7)
ACPI: DSDT 0x0000000000000000 000AF7 (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
Acpi table [DSDT] successfully installed and loaded
Loading Acpi table from file   ssdt.dat - Length 00003356 (000D1C)
ACPI: SSDT 0x0000000000000000 000D1C (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
Acpi table [SSDT] successfully installed and loaded
Pass 1 parse of [SSDT]
Pass 2 parse of [SSDT]
Pass 1 parse of [DSDT]
Pass 2 parse of [DSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)

Parsing completed

Found 3 external control methods, reparsing with new information
Pass 1 parse of [DSDT]
Pass 2 parse of [DSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)

Parsing completed
Disassembly completed
ASL Output:    dsdt.dsl - 30129 bytes

Comment 1 Eduardo Habkost 2015-02-03 14:51:41 UTC
Can you please check if the fix for bug 1162080 at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8682975 affects this bug too?

Comment 2 Jincheng Miao 2015-02-04 03:09:08 UTC
(In reply to Eduardo Habkost from comment #1)
> Can you please check if the fix for bug 1162080 at
> http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8682975 affects this
> bug too?

Hi Eduardo,

I tested the package, and the problem does not occur with it:

1. check NUMA topology
<in guest># numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 1023 MB
node 0 free: 634 MB
node 1 cpus:
node 1 size: 1023 MB
node 1 free: 996 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

2. hotplug 2 vcpus
# virsh setvcpus rhel7 4

3. recheck NUMA topology
<in guest># numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 1023 MB
node 0 free: 633 MB
node 1 cpus: 2 3
node 1 size: 1023 MB
node 1 free: 993 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10

Comment 3 Jincheng Miao 2015-02-04 03:13:03 UTC
One more question,

if I specify overlapping cpus for the nodes, qemu also exports a wrong NUMA topology:

guest NUMA node 0: 0-3,8-11,17
guest NUMA node 1: 4-7,12-17

the qemu cmdline looks like:
# /usr/libexec/qemu-kvm -name rhel7 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,maxcpus=19,sockets=19,cores=1,threads=1 -object memory-backend-ram,size=1024M,id=ram-node0,host-nodes=0,policy=preferred -numa node,nodeid=0,cpus=0-3,cpus=8-11,cpus=17,memdev=ram-node0 -object memory-backend-ram,size=1024M,id=ram-node1,host-nodes=0,policy=preferred -numa node,nodeid=1,cpus=4-7,cpus=12-17,memdev=ram-node1 -uuid 1edfafc5-a55a-4396-9595-46e590bfc79a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/jmiao/r71.img,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:ab:7e:68,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on

then I hotplugged all 19 vcpus, and checked in the guest:
<in guest># numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 8 9 10 11
node 0 size: 1023 MB
node 0 free: 593 MB
node 1 cpus: 4 5 6 7 12 13 14
node 1 size: 1023 MB
node 1 free: 983 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 


We can see that vcpus #15, #16 and #17 are missing.
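A quick way to see what the command line in this comment actually requests is to expand the -numa options. This is a rough sketch under the assumption that only the -smp and -numa parts of the command line matter here; the function names are hypothetical and the parser handles only the "cpus=N" and "cpus=N-M" forms used in this report.

```python
import re
from collections import defaultdict

def expand(spec):
    """Expand a cpus= value such as '0-3' or '17' into a list of ints."""
    first, _, last = spec.partition("-")
    return list(range(int(first), int(last or first) + 1))

def numa_assignments(cmdline):
    """Map each CPU index to the list of node IDs that claim it."""
    claims = defaultdict(list)
    for opts in re.findall(r"-numa node,(\S+)", cmdline):
        # Split "key=value" pairs; partition() tolerates options without "=".
        fields = [kv.partition("=")[::2] for kv in opts.split(",")]
        node = next(int(v) for k, v in fields if k == "nodeid")
        for k, v in fields:
            if k == "cpus":
                for cpu in expand(v):
                    claims[cpu].append(node)
    return claims

# Only the -smp and -numa options from the command line in this comment.
CMDLINE = ("-smp 2,maxcpus=19,sockets=19,cores=1,threads=1 "
           "-numa node,nodeid=0,cpus=0-3,cpus=8-11,cpus=17,memdev=ram-node0 "
           "-numa node,nodeid=1,cpus=4-7,cpus=12-17,memdev=ram-node1")
MAXCPUS = 19

claims = numa_assignments(CMDLINE)
overlapping = sorted(c for c, nodes in claims.items() if len(nodes) > 1)
unassigned = [c for c in range(MAXCPUS) if c not in claims]
print("claimed by more than one node:", overlapping)
print("not assigned to any node:", unassigned)
```

For this command line, vcpu 17 is claimed by both nodes and vcpu 18 is not assigned to any node, so at least those two have no unambiguous expected placement.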

Comment 4 Eduardo Habkost 2015-02-04 14:20:04 UTC
(In reply to Jincheng Miao from comment #3)
> we could see vcpu #15 #16 #17 are gone.

Is this using the latest official build, or the scratch build from comment #1?

Comment 5 Jincheng Miao 2015-02-04 23:51:44 UTC
(In reply to Eduardo Habkost from comment #4)
> (In reply to Jincheng Miao from comment #3)
> > we could see vcpu #15 #16 #17 are gone.
> 
> Is this using the latest official build, or the scratch build from comment
> #1?

I tested it on scratch build.

Comment 6 Igor Mammedov 2015-02-11 14:43:14 UTC
Confirming that the fix for bug 1162080 should also fix the issue reported here.

In linux kernel:

acpi_numa_processor_affinity_init()
  ...
  if ((pa->flags & ACPI_SRAT_CPU_ENABLED) == 0)
                return;

so it will bail out early, without initializing the NUMA mapping for that CPU, if the CPU is marked as disabled in the SRAT.
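To make the effect of that early return concrete, here is a small model in Python. It is only a sketch: the 16-byte entry layout follows the ACPI "Processor Local APIC/SAPIC Affinity Structure", the table contents are invented test data (they are not taken from the attached SRAT), and the proximity domain is used directly as the node number, which simplifies what the kernel really does.

```python
import struct

ACPI_SRAT_CPU_ENABLED = 1  # bit 0 of the Flags field

# Type, Length, Proximity Domain [7:0], APIC ID, Flags,
# Local SAPIC EID, Proximity Domain [31:8], Clock Domain
ENTRY = struct.Struct("<BBBBIB3sI")

def make_entry(apic_id, node, enabled):
    """Build a 16-byte processor affinity entry (test data only)."""
    flags = ACPI_SRAT_CPU_ENABLED if enabled else 0
    return ENTRY.pack(0, ENTRY.size, node & 0xFF, apic_id, flags, 0,
                      (node >> 8).to_bytes(3, "little"), 0)

def apic_to_node(entries):
    """Mimic the early return: entries without the enabled flag are skipped."""
    mapping = {}
    for raw in entries:
        _type, _len, pxm_lo, apic_id, flags, _eid, pxm_hi, _clk = ENTRY.unpack(raw)
        if (flags & ACPI_SRAT_CPU_ENABLED) == 0:
            continue  # same condition as in the kernel snippet above
        mapping[apic_id] = pxm_lo | (int.from_bytes(pxm_hi, "little") << 8)
    return mapping

# Assumed scenario: 2 vcpus present at boot, 2 more hotpluggable and
# marked as disabled in the SRAT.
srat = [make_entry(0, 0, True), make_entry(1, 0, True),
        make_entry(2, 1, False), make_entry(3, 1, False)]
print(apic_to_node(srat))
```

In this model the two disabled entries never reach the mapping, so nothing records that APIC IDs 2 and 3 belong to node 1 by the time those vcpus are hotplugged.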

Comment 7 Luyao Huang 2015-03-04 04:56:44 UTC
I also found an issue like this with machine types rhel6.6.0 and rhel6.5.0:

1. prepare a healthy VM with settings like this:
...
  <vcpu placement='static' current='2'>4</vcpu>
...
  <os>
    <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0,2' memory='512000'/>
      <cell id='1' cpus='1,3' memory='512000'/>
    </numa>
  </cpu>

2. start it and check the numa structure in vm:
# virsh start test3
Domain test3 started

IN GUEST:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 499 MB
node 0 free: 410 MB
node 1 cpus: 1
node 1 size: 499 MB
node 1 free: 443 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

3. hot-plug vcpu to 3:
# virsh setvcpus test3 3


IN GUEST:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 499 MB
node 0 free: 400 MB
node 1 cpus: 1 2
node 1 size: 499 MB
node 1 free: 442 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

4.hot-plug vcpu to 4:
# virsh setvcpus test3 4

IN GUEST:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 499 MB
node 0 free: 400 MB
node 1 cpus: 1 2 3
node 1 size: 499 MB
node 1 free: 442 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

I am noting it here to track whether it is fixed once bug 1162080 is fixed.
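The expected placement for this interleaved layout can be derived from the domain XML rather than worked out by hand. The sketch below is an illustration only: it reads the cell elements from step 1 with the standard xml.etree module and compares them with the placement that numactl reported in step 4. The function names are hypothetical, and cpuset exclusions are not handled.

```python
import xml.etree.ElementTree as ET

def expand_cpus(spec):
    """Expand a libvirt cpus attribute such as '0,2' or '0-1' into ints."""
    cpus = []
    for part in spec.split(","):
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus

def expected_nodes(cpu_xml):
    """Return {cpu: node} from a <cpu><numa><cell .../></numa></cpu> fragment."""
    root = ET.fromstring(cpu_xml)
    return {cpu: int(cell.get("id"))
            for cell in root.iter("cell")
            for cpu in expand_cpus(cell.get("cpus"))}

# <cpu> fragment from step 1 of this comment.
CPU_XML = """
<cpu>
  <numa>
    <cell id='0' cpus='0,2' memory='512000'/>
    <cell id='1' cpus='1,3' memory='512000'/>
  </numa>
</cpu>
"""

# Placement reported by numactl after hot-plugging to 4 vcpus (step 4).
OBSERVED = {0: 0, 1: 1, 2: 1, 3: 1}

expected = expected_nodes(CPU_XML)
mismatches = {cpu: (node, OBSERVED[cpu])
              for cpu, node in expected.items() if OBSERVED[cpu] != node}
print(mismatches)
```

Here only vcpu 2 is reported: the XML places it in cell 0, while the guest shows it in node 1.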

Comment 8 Igor Mammedov 2015-03-11 16:12:39 UTC
(In reply to Luyao Huang from comment #7)
> and write here to track if it will be fixed after bug 1162080 fixed.

bug 1162080 has been fixed, please retest.

Comment 9 Luyao Huang 2015-03-12 03:06:04 UTC
(In reply to Igor Mammedov from comment #8)
> (In reply to Luyao Huang from comment #7)
> > and write here to track if it will be fixed after bug 1162080 fixed.
> 
> bug 1162080 has been fixed, please retest.

Retested with qemu-kvm-rhev-2.2.0-5.el7.x86_64 and still got the same result.

Comment 10 Igor Mammedov 2015-03-17 16:59:50 UTC
(In reply to Luyao Huang from comment #9)
> (In reply to Igor Mammedov from comment #8)
> > (In reply to Luyao Huang from comment #7)
> > > and write here to track if it will be fixed after bug 1162080 fixed.
> > 
> > bug 1162080 has been fixed, please retest.
> 
> retest with qemu-kvm-rhev-2.2.0-5.el7.x86_64 and still get the same result

Could you provide the qemu command line that was used?

Comment 11 Eduardo Habkost 2015-04-06 16:18:14 UTC
Still waiting for information requested on comment #10.

Comment 12 Luyao Huang 2015-04-07 01:26:27 UTC
(In reply to Eduardo Habkost from comment #11)
> Still waiting for information requested on comment #10.

Sorry for the delay, I think I missed this comment while checking my mailbox.
This is my VM cmdline:

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name test3 -S -machine rhel6.5.0,accel=kvm,usb=off -m 1000 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0,cpus=2,mem=500 -numa node,nodeid=1,cpus=1,cpus=3,mem=500 -uuid 7347d748-f7ce-448f-8d49-3d29c9bcac30 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test3.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -device usb-ccid,id=ccid0 -drive file=/var/lib/libvirt/images/r7_latest.img,if=none,id=drive-ide0-0-0,format=raw -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=24,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:1a:cb:3f,bus=pci.0,addr=0x8 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/r6.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=9,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5900,addr=0.0.0.0,disable-ticketing,seamless-migration=on -k en-us -device qxl-vga,id=video0,ram_size=134217728,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on

Comment 14 Igor Mammedov 2015-09-03 15:04:12 UTC
It works for me with the latest 7.2 qemu-kvm-rhev-2.3.0, where we got the fix via rebase, and with the 7.1-z stream qemu-kvm-rhev-2.1.2-23, where it was backported via bug 1191385.

So please retest and close the bug.

Comment 17 huiqingding 2015-09-11 07:45:05 UTC
Hi, Igor:

Tested the latest qemu-kvm-rhev, qemu-kvm-rhev-2.3.0-22.el7.x86_64, with kernel 3.10.0-313.el7.x86_64.

Running the same tests as in comment #0 and comment #3, the result is OK.
But the test from comment #7 fails.

Best regards
Huiqing

The detailed steps of the failure are as follows:
1. boot guest with "-machine rhel6.5.0"
/usr/libexec/qemu-kvm -name test3 -S -machine rhel6.5.0,accel=kvm,usb=off -m 1000 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0,cpus=2,mem=500 -numa node,nodeid=1,cpus=1,cpus=3,mem=500 -uuid 7347d748-f7ce-448f-8d49-3d29c9bcac30 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/test3.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -device usb-ccid,id=ccid0 -drive file=/home/RHEL-Server-7.2-64-virtio.qcow2,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/r6.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=9,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice port=5901,disable-ticketing,seamless-migration=on -k en-us -device qxl-vga,id=video0,ram_size=134217728,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on -qmp tcp:0:4445,server,nowait -monitor stdio
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=4e:63:28:bc:b1:25 

2. check numa topology inside guest:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 499 MB
node 0 free: 99 MB
node 1 cpus: 1
node 1 size: 499 MB
node 1 free: 97 MB
node distances:
node   0   1 
  0:  10  20 

3. hotplug two vcpus:
{"execute":"cpu-add","arguments":{"id":2},"id":"libvirt-10"}
{"return": {}, "id": "libvirt-10"}
{"execute":"cpu-add","arguments":{"id":3},"id":"libvirt-11"}
{"return": {}, "id": "libvirt-11"}

4. check numa topology inside guest:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0
node 0 size: 499 MB
node 0 free: 114 MB
node 1 cpus: 1 2 3
node 1 size: 499 MB
node 1 free: 81 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

After step 4, vcpus 0 and 2 should be in node0 and vcpus 1 and 3 should be in node1.
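The hotplug in step 3 of this comment can also be scripted against the QMP socket that the command line opens with "-qmp tcp:0:4445,server,nowait". What follows is a rough sketch and not a supported tool: it assumes the script runs on the same host as qemu, it negotiates capabilities first (QMP requires a qmp_capabilities command before anything else is accepted), and it skips any asynchronous event that arrives between replies. All function names are made up for illustration.

```python
import json
import socket

def cpu_add_messages(cpu_ids):
    """Build the QMP messages: capability negotiation, then one cpu-add per ID."""
    msgs = [{"execute": "qmp_capabilities"}]
    msgs += [{"execute": "cpu-add", "arguments": {"id": i}} for i in cpu_ids]
    return msgs

def read_reply(stream):
    """Read lines until a command reply arrives, skipping asynchronous events."""
    while True:
        msg = json.loads(stream.readline())
        if "return" in msg or "error" in msg:
            return msg

def hotplug(host, port, cpu_ids):
    """Send the messages to a QMP TCP socket and return the decoded replies."""
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # greeting banner sent by qemu
        for msg in cpu_add_messages(cpu_ids):
            stream.write(json.dumps(msg) + "\n")
            stream.flush()
            replies.append(read_reply(stream))
    return replies

# Show what would be sent for vcpus 2 and 3; nothing is connected here.
for message in cpu_add_messages([2, 3]):
    print(json.dumps(message))
```

Against the guest from step 1, a call such as hotplug("localhost", 4445, [2, 3]) would be the equivalent of the two cpu-add commands shown in step 3, with the host name being an assumption about where the script runs.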

Comment 18 Igor Mammedov 2015-09-11 14:17:27 UTC
(In reply to huiqingding from comment #17)
> Hi, Igor:
> 
> Test latest qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-22.el7.x86_64 and the kernel
> is 3.10.0-313.el7.x86_64.
> 
> Do test same as comment #0 and comment #3, the result is ok.
> But test comment #7, the result is failed.
It works for me as expected with qemu-kvm-rhev-2.3.0-22.el7.x86_64 and a 7.2 guest:

/usr/libexec/qemu-kvm  -enable-kvm -m 4G -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0,cpus=2 -numa node,nodeid=1,cpus=1,cpus=3 /dev/slow/rhel72 -monitor stdio -machine rhel6.5.0,accel=kvm,usb=off


> 
> Best regards
> Huiqing
> 
> The detailed steps of failed as following:
> 1. boot guest with "-machine rhel6.5.0"
> /usr/libexec/qemu-kvm -name test3 -S -machine rhel6.5.0,accel=kvm,usb=off -m
> 1000 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -numa
> node,nodeid=0,cpus=0,cpus=2,mem=500 -numa
> node,nodeid=1,cpus=1,cpus=3,mem=500 -uuid
> 7347d748-f7ce-448f-8d49-3d29c9bcac30 -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/test3.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown
> -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x6 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -device
> usb-ccid,id=ccid0 -drive
> file=/home/RHEL-Server-7.2-64-virtio.qcow2,if=none,id=drive-ide0-0-0,
> format=qcow2 -device
> ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
> -chardev pty,id=charserial0 -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/r6.agent,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel0,id=channel0,
> name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent
> -device
> virtserialport,bus=virtio-serial0.0,nr=9,chardev=charchannel1,id=channel1,
> name=com.redhat.spice.0 -device usb-tablet,id=input0 -spice
> port=5901,disable-ticketing,seamless-migration=on -k en-us -device
> qxl-vga,id=video0,ram_size=134217728,vram_size=67108864,vgamem_mb=16,bus=pci.
> 0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg
> timestamp=on -qmp tcp:0:4445,server,nowait -monitor stdio
> -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device
> virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=4e:63:28:bc:b1:25 
> 
> 2. check numa topology inside guest:
> # numactl --hard
> available: 2 nodes (0-1)
> node 0 cpus: 0
> node 0 size: 499 MB
> node 0 free: 99 MB
> node 1 cpus: 1
> node 1 size: 499 MB
> node 1 free: 97 MB
> node distances:
> node   0   1 
>   0:  10  20 
> 
> 3. hotplug two vcpus:
> {"execute":"cpu-add","arguments":{"id":2},"id":"libvirt-10"}
> {"return": {}, "id": "libvirt-10"}
> {"execute":"cpu-add","arguments":{"id":3},"id":"libvirt-11"}
> {"return": {}, "id": "libvirt-11"}
> 
> 4. check numa topology inside guest:
> # numactl --hard
> available: 2 nodes (0-1)
> node 0 cpus: 0
> node 0 size: 499 MB
> node 0 free: 114 MB
> node 1 cpus: 1 2 3
> node 1 size: 499 MB
> node 1 free: 81 MB
> node distances:
> node   0   1 
>   0:  10  20 
>   1:  20  10 
> 
> after step4, vcpu 0 and 2 should be in node0 and vcpu 1 and 3 should be
> node1.

Comment 19 Igor Mammedov 2015-09-14 09:13:43 UTC
huiqingding,

Could you dump the SRAT ACPI table from the guest for the case where it doesn't work for you, and attach the complete guest log?

Comment 20 Luyao Huang 2015-09-15 01:52:20 UTC
I tested it again and can still reproduce it with qemu-kvm-rhev-2.3.0-21.el7.x86_64 and libvirt-1.2.17-8.el7.x86_64; the guest kernel is 3.10.0-314.el7.x86_64:

1.
# virsh dumpxml rhel7.0-rhel
...
  <vcpu placement='auto' current='3'>4</vcpu>
...
  <os>
    <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Opteron_G5</model>
    <numa>
      <cell id='0' cpus='0,2' memory='2024448' unit='KiB'/>
      <cell id='1' cpus='1,3' memory='2024448' unit='KiB'/>
    </numa>
  </cpu>
...


2. start guest:

# virsh start rhel7.0-rhel
Domain rhel7.0-rhel started

3. hot-plug cpu:

# virsh setvcpus rhel7.0-rhel 4


4. check in guest:
IN GUEST:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 2 3
node 0 size: 1976 MB
node 0 free: 1420 MB
node 1 cpus: 1
node 1 size: 1976 MB
node 1 free: 1691 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

5. I also noticed this in the guest dmesg:

[   20.508940] smpboot: Booting Node 1 Processor 3 APIC 0x3

6. I will attach the srat.dsl.

7. guest cmdline:

/usr/libexec/qemu-kvm -name rhel7.0-rhel -S -machine rhel6.6.0,accel=kvm,usb=off -cpu Opteron_G5 -m 3954 -realtime mlock=off -smp 3,maxcpus=4,sockets=4,cores=1,threads=1 -object iothread,id=iothread1 -numa node,nodeid=0,cpus=0,cpus=2,mem=1977 -numa node,nodeid=1,cpus=1,cpus=3,mem=1977 -uuid 67c7a123-5415-4136-af62-a2ee098ba6cd -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-rhel7.0-rhel/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/fs/r7_ext4.raw,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:af:19:fb,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/r6.agent,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -device 
intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev pty,id=charredir0 -device usb-redir,chardev=charredir0,id=redir0 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
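The dmesg line in step 5 says the kernel booted processor 3 on node 1, while numactl in step 4 lists vcpu 3 under node 0. A small sketch for pulling that information out of a dmesg capture is given here; it matches only the message format quoted in this comment, which may differ on other kernel versions, and the names are hypothetical.

```python
import re

BOOT_RE = re.compile(
    r"smpboot: Booting Node (\d+) Processor (\d+) APIC (0x[0-9a-fA-F]+)")

def booted_cpus(dmesg_text):
    """Return {cpu: (node, apic_id)} for every matching boot message."""
    result = {}
    for m in BOOT_RE.finditer(dmesg_text):
        node, cpu, apic = int(m.group(1)), int(m.group(2)), int(m.group(3), 16)
        result[cpu] = (node, apic)
    return result

# Line copied from step 5.
DMESG = "[   20.508940] smpboot: Booting Node 1 Processor 3 APIC 0x3\n"

# Step 4 lists vcpu 3 under node 0.
NUMACTL_NODE_OF_CPU3 = 0

info = booted_cpus(DMESG)
print("dmesg:", info)
print("agrees with numactl:", info[3][0] == NUMACTL_NODE_OF_CPU3)
```

For the captured line the two sources disagree, which is one more hint that the node assignment is lost somewhere between CPU bring-up and the topology the guest finally reports.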

Comment 21 Luyao Huang 2015-09-15 01:53:28 UTC
Created attachment 1073470 [details]
srat dump 1

Comment 22 huiqingding 2015-09-15 05:12:27 UTC
(In reply to Igor Mammedov from comment #19)
> huiqingding,
> 
> Could you dump SRAT ACPI table from guest for the case where it doesn't work
> for you and attach complete guest's log.

After hotplugging two vcpus as in comment #17, inside the guest:
# acpidump > acpidump.bin
[root@dhcp-10-17 ~]# acpixtract ./acpidump.bin 

Intel ACPI Component Architecture
ACPI Binary Table Extraction Utility version 20150619-64
Copyright (c) 2000 - 2015 Intel Corporation

Acpi table [DSDT] - 4405 bytes written to dsdt.dat
Acpi table [SSDT] - 2364 bytes written to ssdt.dat
[root@dhcp-10-17 ~]# iasl -e ssdt.dat -d dsdt.dat

Intel ACPI Component Architecture
ASL+ Optimizing Compiler version 20150619-64
Copyright (c) 2000 - 2015 Intel Corporation

Reading ACPI table from file   dsdt.dat - Length 00004405 (0x001135)
ACPI: DSDT 0x0000000000000000 001135 (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
Acpi table [DSDT] successfully installed and loaded
Reading ACPI table from file   ssdt.dat - Length 00002364 (0x00093C)
ACPI: SSDT 0x0000000000000000 00093C (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
Acpi table [SSDT] successfully installed and loaded
Pass 1 parse of [SSDT]
Pass 2 parse of [SSDT]
Pass 1 parse of [DSDT]
Pass 2 parse of [DSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)

Parsing completed

Found 2 external control methods, reparsing with new information
Pass 1 parse of [DSDT]
Pass 2 parse of [DSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)

Parsing completed
Disassembly completed
ASL Output:    dsdt.dsl - 48923 bytes
[root@dhcp-10-17 ~]# 

PS: the host info is as follows:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            21
Model:                 1
Model name:            AMD Opteron(TM) Processor 6272
Stepping:              2
CPU MHz:               2100.093
BogoMIPS:              4199.78
Virtualization:        AMD-V
L1d cache:             16K
L1i cache:             64K
L2 cache:              2048K
L3 cache:              6144K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
NUMA node1 CPU(s):     16,18,20,22,24,26,28,30
NUMA node2 CPU(s):     1,3,5,7,9,11,13,15
NUMA node3 CPU(s):     17,19,21,23,25,27,29,31

Comment 24 huiqingding 2015-09-17 09:34:47 UTC
Reproduced this bug using:
kernel-3.10.0-316.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7.x86_64

The detailed steps of the failure are as follows:
1. boot guest with two numa nodes
/usr/libexec/qemu-kvm -name rhel7 -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 2,maxcpus=4,sockets=4,cores=1,threads=1 -object memory-backend-file,prealloc=yes,mem-path=/home/kvm_hugepage,size=1024M,id=ram-node0,host-nodes=1,policy=bind -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,prealloc=yes,mem-path=/home/kvm_hugepage,size=1024M,id=ram-node1,host-nodes=1,policy=bind -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid 1edfafc5-a55a-4396-9595-46e590bfc79a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/mnt/rhel7.2.raw,if=none,id=drive-ide0-0-0,format=raw -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1   -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc :0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=4e:63:28:bc:b1:25 -monitor stdio  -qmp tcp:0:4445,server,nowait

2. check numa topology inside guest:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 1023 MB
node 0 free: 605 MB
node 1 cpus:
node 1 size: 1023 MB
node 1 free: 996 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 


3. hotplug two vcpus:
{"execute":"cpu-add","arguments":{"id":2},"id":"libvirt-10"}
{"return": {}, "id": "libvirt-10"}
{"execute":"cpu-add","arguments":{"id":3},"id":"libvirt-11"}
{"return": {}, "id": "libvirt-11"}

4. check numa topology inside guest:
# numactl --hard
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 1023 MB
node 0 free: 605 MB
node 1 cpus:
node 1 size: 1023 MB
node 1 free: 996 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

After step 4, vcpus 2 and 3 should be in node1.

Comment 25 huiqingding 2015-09-17 09:45:53 UTC
Tested this bug using:
kernel-3.10.0-316.el7.x86_64
qemu-kvm-rhev-2.3.0-23.el7.x86_64

Tested machine types "-rhel7.1.0" and "-rhel7.2.0". The test steps are the same as in comment #24, and the tests pass. After step 4, vcpus 2 and 3 are in node1:
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 1023 MB
node 0 free: 604 MB
node 1 cpus: 2 3
node 1 size: 1023 MB
node 1 free: 993 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10

Comment 26 huiqingding 2015-09-18 01:50:25 UTC
Based on comment #22 and comment #25, I think this bug has been fixed. Thanks.

Comment 27 juzhang 2015-09-18 01:52:46 UTC
Set this bz as verified according to comment #24 through comment #26.

Comment 29 errata-xmlrpc 2015-12-04 16:26:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html

