Bug 1377063 - Guest numa topology not correct after hot plug-unplug-plug vcpus
Summary: Guest numa topology not correct after hot plug-unplug-plug vcpus
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Igor Mammedov
QA Contact: Yumei Huang
Reported: 2016-09-18 06:54 UTC by Luyao Huang
Modified: 2017-08-02 03:29 UTC (History)
Last Closed: 2017-08-01 23:34:44 UTC




External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:2392 normal SHIPPED_LIVE Important: qemu-kvm-rhev security, bug fix, and enhancement update 2017-08-01 20:04:36 UTC

Description Luyao Huang 2016-09-18 06:54:05 UTC
Description of problem:
Guest numa topology not correct after hot plug-unplug-plug vcpus

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-25.el7.x86_64

Guest:
kernel-3.10.0-505.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. prepare a guest xml like this:

# virsh dumpxml r7
<domain type='kvm'>
  <name>r7</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>1179648</memory>
  <currentMemory unit='KiB'>1179648</currentMemory>
  <vcpu placement='static' current='3'>10</vcpu>
  <vcpus>
    <vcpu id='0' enabled='yes' hotpluggable='no' order='1'/>
    <vcpu id='1' enabled='yes' hotpluggable='yes' order='3'/>
    <vcpu id='2' enabled='yes' hotpluggable='yes' order='2'/>
    <vcpu id='3' enabled='no' hotpluggable='yes'/>
    <vcpu id='4' enabled='no' hotpluggable='yes'/>
    <vcpu id='5' enabled='no' hotpluggable='yes'/>
    <vcpu id='6' enabled='no' hotpluggable='yes'/>
    <vcpu id='7' enabled='no' hotpluggable='yes'/>
    <vcpu id='8' enabled='no' hotpluggable='yes'/>
    <vcpu id='9' enabled='no' hotpluggable='yes'/>
  </vcpus>

2. start guest:
# virsh start r7
Domain r7 started

3. check qemu cmdline and qmp:

# ps aux|grep qemu
qemu     26636  124  1.2 1914384 400948 ?      Sl   02:30   0:21 /usr/libexec/qemu-kvm -name guest=r7,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-25-r7/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off -m size=1048576k,slots=16,maxmem=15243264k -realtime mlock=off -smp 1,maxcpus=10,sockets=10,cores=1,threads=1 -numa node,nodeid=0,cpus=0,cpus=2-5,mem=512 -numa node,nodeid=1,cpus=1,cpus=6-9,mem=512 -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0 -uuid 67c7a123-5415-4136-af62-a2ee098ba6cd -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-25-r7/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -boot strict=on -device pxb,bus_nr=254,id=pci.1,numa_node=1,bus=pci.0,addr=0xc -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/RHEL-7.3-latest.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.1,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:af:19:fb,bus=pci.1,addr=0xb -chardev pty,id=charserial0 -device pci-serial,chardev=charserial0,id=serial0,bus=pci.0,addr=0x9 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/123.sock,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -device virtio-keyboard-pci,id=input1,bus=pci.0,addr=0x3 -device virtio-mouse-pci,id=input2,bus=pci.0,addr=0xa -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev pty,id=charredir0 -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on


# stap qemu-monitor.stp
  0.000 begin
...
17316.760 > 0x7fc3e8002140 {"execute":"device_add","arguments":{"driver":"qemu64-x86_64-cpu","id":"vcpu2","socket-id":2,"core-id":0,"thread-id":0},"id":"libvirt-8"}
17316.768 < 0x7fc3e8002140 {"return": {}, "id": "libvirt-8"}
17316.769 > 0x7fc3e8002140 {"execute":"device_add","arguments":{"driver":"qemu64-x86_64-cpu","id":"vcpu1","socket-id":1,"core-id":0,"thread-id":0},"id":"libvirt-9"}
17316.776 < 0x7fc3e8002140 {"return": {}, "id": "libvirt-9"}

4. check guest numa topology:

IN GUEST:
# numactl --har
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 639 MB
node 0 free: 476 MB
node 1 cpus: 1
node 1 size: 511 MB
node 1 free: 351 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

5. hot-plug vcpus:

# virsh setvcpus r7 10

IN GUEST:
# numactl --har
available: 2 nodes (0-1)
node 0 cpus: 0 2 3 4 5
node 0 size: 639 MB
node 0 free: 467 MB
node 1 cpus: 1 6 7 8 9
node 1 size: 511 MB
node 1 free: 348 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

6. hot-unplug vcpus:
# virsh setvcpus r7 3

IN GUEST:

# numactl --har
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 639 MB
node 0 free: 443 MB
node 1 cpus: 1
node 1 size: 511 MB
node 1 free: 382 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 

7. hot-plug vcpus:

# virsh setvcpus r7 10

IN GUEST:
# numactl --har
available: 2 nodes (0-1)
node 0 cpus: 0 2 3 4 5 6 7 8 9
node 0 size: 639 MB
node 0 free: 441 MB
node 1 cpus: 1
node 1 size: 511 MB
node 1 free: 382 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 


Actual results:

In step 7, the guest NUMA topology does not match the NUMA topology in the guest XML (or on the QEMU command line): vCPUs 6-9 show up on node 0 instead of node 1.

Expected results:

In step 7:

IN GUEST:
# numactl --har
available: 2 nodes (0-1)
node 0 cpus: 0 2 3 4 5
node 0 size: 639 MB
node 0 free: 467 MB
node 1 cpus: 1 6 7 8 9
node 1 size: 511 MB
node 1 free: 348 MB
node distances:
node   0   1 
  0:  10  20 
  1:  20  10 
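
The comparison between the expected and actual results can be automated. The following is a minimal sketch, not part of the original report: it derives the expected CPU-to-node mapping from the `-numa node,...` options on the QEMU command line, parses the `node N cpus:` lines of the numactl output, and lists the CPUs that are online on the wrong node. All function names are hypothetical helpers introduced for illustration.

```python
import re


def parse_cpu_list(spec):
    """Expand a cpus= value such as '0' or '2-5' into a set of CPU numbers."""
    first, _, last = spec.partition("-")
    return set(range(int(first), int(last or first) + 1))


def expected_topology(cmdline):
    """Map node id -> set of CPUs, read from the '-numa node,...' options."""
    topology = {}
    for opts in re.findall(r"-numa\s+node,(\S+)", cmdline):
        node, cpus = None, set()
        for opt in opts.split(","):
            key, _, value = opt.partition("=")
            if key == "nodeid":
                node = int(value)
            elif key == "cpus":
                cpus |= parse_cpu_list(value)
        topology[node] = cpus
    return topology


def guest_topology(numactl_output):
    """Map node id -> set of online CPUs, read from 'node N cpus:' lines."""
    topology = {}
    pattern = r"^node (\d+) cpus:(.*)$"
    for match in re.finditer(pattern, numactl_output, re.MULTILINE):
        topology[int(match.group(1))] = {int(c) for c in match.group(2).split()}
    return topology


def misplaced_cpus(expected, actual):
    """Return {cpu: (expected_node, actual_node)} for CPUs on the wrong node."""
    want = {cpu: node for node, cpus in expected.items() for cpu in cpus}
    wrong = {}
    for node, cpus in actual.items():
        for cpu in cpus:
            if want.get(cpu) != node:
                wrong[cpu] = (want.get(cpu), node)
    return wrong


CMDLINE = ("-numa node,nodeid=0,cpus=0,cpus=2-5,mem=512 "
           "-numa node,nodeid=1,cpus=1,cpus=6-9,mem=512")
STEP7 = "node 0 cpus: 0 2 3 4 5 6 7 8 9\nnode 0 size: 639 MB\nnode 1 cpus: 1\n"

print(misplaced_cpus(expected_topology(CMDLINE), guest_topology(STEP7)))
```

An empty result would mean that every online vCPU sits on the node assigned to it on the command line. CPUs that are currently offline are simply not checked.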

Additional info:

Comment 2 Igor Mammedov 2016-09-19 12:00:51 UTC
A fix was posted during the 2.7 merge window but somehow fell through the cracks.
I'll rebase it and try to get it merged into 2.8.

Moving this to 7.4: it is too late for 7.3, and it is not a regression, as it has always behaved this way.

More precisely, this is a guest kernel issue: the kernel forgets the mapping info from SRAT.
Since that is not fixable in old guests, we can work around it on the QEMU side
by providing a dynamic mapping via a per-CPU _PXM method.
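
The difference between a boot-time SRAT mapping and a per-CPU _PXM lookup can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel or QEMU code: the class `ToyGuest`, the helper `plug_unplug_plug`, and the fallback to node 0 for a forgotten CPU are all hypothetical and were chosen to reproduce the outcome observed in step 7.

```python
class ToyGuest:
    """Toy model of a guest's CPU-to-node bookkeeping (not kernel code)."""

    def __init__(self, srat, pxm=None):
        self.boot_map = dict(srat)  # mapping parsed once from SRAT at boot
        self.pxm = pxm              # optional per-CPU _PXM lookup (workaround)
        self.online = {}            # cpu -> node as currently seen by the guest

    def plug(self, cpu):
        if self.pxm is not None:
            # With _PXM, the node is queried again on every hot plug.
            node = self.pxm(cpu)
        else:
            # Assumption: a CPU with no remembered mapping lands on node 0.
            node = self.boot_map.get(cpu, 0)
        self.online[cpu] = node

    def unplug(self, cpu):
        self.online.pop(cpu)
        # Models the guest forgetting the SRAT mapping on unplug.
        self.boot_map.pop(cpu, None)


def plug_unplug_plug(guest, cpus):
    """Run the plug, unplug, plug cycle and return the final cpu -> node map."""
    for cpu in cpus:
        guest.plug(cpu)
    for cpu in cpus:
        guest.unplug(cpu)
    for cpu in cpus:
        guest.plug(cpu)
    return {cpu: guest.online[cpu] for cpu in cpus}


# Mapping taken from the -numa options in the report.
SRAT = {0: 0, 2: 0, 3: 0, 4: 0, 5: 0, 1: 1, 6: 1, 7: 1, 8: 1, 9: 1}

print(plug_unplug_plug(ToyGuest(SRAT), range(3, 10)))
print(plug_unplug_plug(ToyGuest(SRAT, pxm=SRAT.__getitem__), range(3, 10)))
```

In this model the first guest ends up with vCPUs 6-9 on node 0 after the second plug, while the guest that consults the _PXM lookup keeps them on node 1.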

Comment 3 Igor Mammedov 2017-01-02 09:24:48 UTC
Fixed upstream in qemu-2.8:
 commit 271119313 acpi: provide _PXM method for CPU devices if QEMU is started numa enabled.

Please retest once the rebase to 2.8 is complete.

Comment 7 jingzhao 2017-06-02 08:36:53 UTC
[1] qemu command

/usr/libexec/qemu-kvm \
-machine pc \
-nodefaults -rtc base=utc \
-m size=1048576k,slots=16,maxmem=15243264k \
-realtime mlock=off \
-smp 1,maxcpus=10,sockets=10,cores=1,threads=1 \
-numa node,nodeid=0,cpus=0,cpus=2-5,mem=512 \
-numa node,nodeid=1,cpus=1,cpus=6-9,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 \
-device pc-dimm,node=0,memdev=memdimm0,id=dimm0 \
-enable-kvm \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-qmp tcp:0:6666,server,nowait \
-vga qxl \
-chardev file,path=/home/seabios.log,id=seabios -device isa-debugcon,chardev=seabios,iobase=0x402 \
-drive file=/home/jinzhao/rhel7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6f -netdev tap,id=tap10 \
-monitor stdio \
-vnc :0
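
With the command above, the plug-unplug-plug cycle could be driven over the QMP socket on port 6666 instead of through virsh. The following is a rough sketch, not taken from the comment: the helper names are hypothetical, the `qemu64-x86_64-cpu` driver and the `vcpuN` id scheme are assumptions borrowed from the device_add calls in the description, and error handling is omitted.

```python
import json
import socket


def qmp_command(execute, **arguments):
    """Build a QMP command dictionary."""
    msg = {"execute": execute}
    if arguments:
        msg["arguments"] = arguments
    return msg


def cpu_add(socket_id, driver="qemu64-x86_64-cpu"):
    """device_add for one vCPU; the driver name is an assumption."""
    props = {"driver": driver, "id": f"vcpu{socket_id}",
             "socket-id": socket_id, "core-id": 0, "thread-id": 0}
    return qmp_command("device_add", **props)


def cpu_del(socket_id):
    """device_del for a vCPU previously added by cpu_add()."""
    return qmp_command("device_del", id=f"vcpu{socket_id}")


def run(commands, host="127.0.0.1", port=6666):
    """Send commands over a QMP TCP socket and return the replies."""
    replies = []
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        json.loads(stream.readline())  # QMP greeting banner
        for cmd in [qmp_command("qmp_capabilities"), *commands]:
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            while True:
                reply = json.loads(stream.readline())
                # Skip asynchronous events, keep the command reply.
                if "return" in reply or "error" in reply:
                    replies.append(reply)
                    break
    return replies


# Example (needs a running guest started with the command above):
#   run([cpu_add(n) for n in range(3, 10)])
#   run([cpu_del(n) for n in range(3, 10)])
#   run([cpu_add(n) for n in range(3, 10)])
```

Note that device_del only requests the removal; the guest has to cooperate before the vCPU is actually gone, so in practice the second plug should wait until the unplug has completed.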

Comment 11 errata-xmlrpc 2017-08-01 23:34:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
