Bug 1609234 - Win2016 guest can't recognize pc-dimm hotplugged to node 0
Summary: Win2016 guest can't recognize pc-dimm hotplugged to node 0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Igor Mammedov
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks: 1609235 1636015 1647736
 
Reported: 2018-07-27 10:43 UTC by Yumei Huang
Modified: 2018-11-08 09:27 UTC
CC List: 8 users

Fixed In Version: qemu-kvm-rhev-2.12.0-10.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1609235 1636015
Environment:
Last Closed: 2018-11-01 11:13:16 UTC
Target Upstream Version:




Links:
Red Hat Product Errata RHBA-2018:3443 (Last Updated: 2018-11-01 11:15:34 UTC)

Description Yumei Huang 2018-07-27 10:43:11 UTC
Description of problem:
Boot a win2016 guest with two pc-dimms assigned to different NUMA nodes, then hotplug one pc-dimm to node 0: the hotplugged dimm is not recognized by the guest. If a pc-dimm is hotplugged to node 1 instead, it is recognized.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7
3.10.0-918.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot win2016 guest, assign one pc-dimm to node 0 and one to node 1

# /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga std  \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win2016-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -m 4096,slots=4,maxmem=32G \
    -object memory-backend-ram,policy=default,size=1G,id=mem-mem1 \
    -device pc-dimm,node=1,id=dimm-mem1,memdev=mem-mem1 \
    -object memory-backend-ram,policy=default,size=1G,id=mem-mem2 \
    -device pc-dimm,node=0,id=dimm-mem2,memdev=mem-mem2 \
    -smp 16,maxcpus=16,cores=8,threads=1,sockets=2  \
    -numa node,nodeid=0  \
    -numa node,nodeid=1  \
    -cpu 'Opteron_G5',+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
    -device scsi-cd,id=cd1,drive=drive_cd1 \
    -vnc :0  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm \
    -monitor stdio
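
As a quick sanity check after step 1, the coldplugged layout can be inspected from the HMP monitor: "info numa" shows the per-node memory sizes and "info memory-devices" lists the two coldplugged dimms with their node assignments (shown here only as a reference, output omitted):

(qemu) info numa
(qemu) info memory-devices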


2. Hotplug pc-dimm to node 0

(qemu) object_add memory-backend-ram,id=m0,size=1G,policy=default
(qemu)  device_add pc-dimm,id=d0,memdev=m0,node=0

3. Hotplug pc-dimm to node 1

(qemu) object_add memory-backend-ram,id=m1,size=1G,policy=default
(qemu)  device_add pc-dimm,id=d1,memdev=m1,node=1
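
In both cases the hotplugged dimm should still show up on the QEMU side, so the monitor can be used to confirm that the failure is in what the guest recognizes rather than in the device_add itself (a reference check, output omitted; "info memory_size_summary" is available in this qemu version):

(qemu) info memory-devices
(qemu) info memory_size_summary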


Actual results:
After step 1, guest total memory is 6G.
After step 2, guest total memory is still 6G.
After step 3, guest total memory is 7G.

Expected results:
After step 2, guest total memory should be 7G.
After step 3, guest total memory should be 8G. 
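
One way to read the guest total on Windows (an assumption, not necessarily how it was measured here) is Task Manager, or from a cmd prompt inside the guest:

systeminfo | findstr /C:"Total Physical Memory"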

Additional info:
This is a regression; it cannot be reproduced with qemu-kvm-rhev-2.10.0-21.el7.

Comment 3 Igor Mammedov 2018-07-27 14:49:09 UTC
From my experiments, hotplug fails in these cases:
1: there is no coldplugged dimm in the last NUMA node
2: the order of dimms on the CLI is:
         1st plugged dimm in node 1
         2nd plugged dimm in node 0

This works in 2.10 and has been broken since 2.12 by commit
848a1cc1e (hw/acpi-build: build SRAT memory affinity structures for DIMM devices)
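
Since the offending commit changes the SRAT memory affinity structures built for DIMM devices, one way to inspect what the guest is actually handed (a sketch, assuming a Linux guest is booted with the same command line purely for inspection) is to dump and disassemble the SRAT inside that guest:

acpidump > acpi.out      # dump the ACPI tables exposed to the guest
acpixtract -a acpi.out   # split them into per-table files such as srat.dat
iasl -d srat.dat         # disassemble; srat.dsl lists the Memory Affinity entries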

Comment 5 Miroslav Rezanina 2018-08-10 10:51:32 UTC
Fix included in qemu-kvm-rhev-2.12.0-10.el7

Comment 7 Yumei Huang 2018-08-14 05:33:44 UTC
Verify:
qemu-kvm-rhev-2.12.0-10.el7
kernel-3.10.0-931.el7.x86_64

Guest: win2008sp2/win2008r2/win2012/win2012r2/win2016

Booted the guests with 1/2/3 NUMA nodes and coldplugged/hotplugged dimms covering the following scenarios; the guests work well and all dimms are plugged successfully.

1. one numa node, coldplug one dimm, hotplug one dimm

2. two numa nodes
----------------------------------------
|     |     coldplug    |    hotplug   |
|     |   dimm1  dimm2  | dimm3  dimm4 |
|node |     0      0    |   1      0   |
|node |     0      1    |   0      1   |
|node |     1      0    |   0      1   |
|node |     1      1    |   0      1   |
----------------------------------------

3. three numa nodes
a)
----------------------------------------------
|     |     coldplug    |       hotplug      |
|     |   dimm1  dimm2  | dimm3  dimm4  dimm5|
|node |     1      0    |   2      1      0  |
|node |     0      2    |   1      0      2  |
----------------------------------------------
b) 
-----------------------------------------------------
|     |         coldplug       |       hotplug      |
|     |   dimm1  dimm2  dimm3  | dimm4  dimm5  dimm6|
|node |     2      1       0   |  0      1       2  |
-----------------------------------------------------

Comment 8 Yumei Huang 2018-08-14 06:04:24 UTC
Hi Igor,
Could you please check whether the tests in comment 7 are sufficient? If so, I will create a test loop to cover the above scenarios and run it regularly.

Comment 9 Igor Mammedov 2018-08-15 08:50:42 UTC
(In reply to Yumei Huang from comment #8)

Assuming that the dimm numbering corresponds to the order in which they are plugged in:

Scenario #2 misses the case where the coldplugged dimms are in node 0 and the first hotplugged dimm goes to node 0 as well.

Comment 10 Yumei Huang 2018-08-16 07:49:56 UTC
Thanks Igor.

Tested two more scenarios in addition to comment 7; all passed.


Two numa nodes:
----------------------------------------
|     |     coldplug    |    hotplug   |
|     |   dimm1  dimm2  | dimm3  dimm4 |
|node |     0      0    |   0      1   |
|node |     1      1    |   1      0   |
----------------------------------------
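
One possible shape for the hotplug half of the test loop mentioned in comment 8, assuming the guests are run under libvirt so the monitor can be driven with virsh qemu-monitor-command --hmp (the domain name "win2016" and the id scheme are placeholders):

for node in 0 1; do
    virsh qemu-monitor-command win2016 --hmp \
        "object_add memory-backend-ram,id=hp-mem-$node,size=1G"
    virsh qemu-monitor-command win2016 --hmp \
        "device_add pc-dimm,id=hp-dimm-$node,memdev=hp-mem-$node,node=$node"
done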

Comment 11 errata-xmlrpc 2018-11-01 11:13:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443

