Bug 1361058

Summary: Memory is bound to wrong host node after migration
Product: Red Hat Enterprise Linux 7
Reporter: Yumei Huang <yuhuang>
Component: numad
Assignee: Jan Synacek <jsynacek>
Status: CLOSED DUPLICATE
QA Contact: qe-baseos-daemons
Severity: medium
Docs Contact:
Priority: unspecified
Version: 7.3
CC: amit.shah, chayang, dgilbert, dyuan, ehabkost, juzhang, knoel, lhuang, michen, quintela, qzhang, virt-maint, xuzhang, yuhuang
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-19 17:48:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yumei Huang 2016-07-28 09:33:39 UTC
Description of problem:
Boot a guest with two NUMA nodes and bind the RAM of each node to a different host node, then migrate the guest. After migration finishes, the RAM on the destination host is bound to the wrong host nodes.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-15.el7
kernel-3.10.0-478.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot the guest with two NUMA nodes, binding the RAM of each node to a different host node:
# /usr/libexec/qemu-kvm -name rhel73-2 -m 2G,slots=240,maxmem=20G -smp 16 \
-realtime mlock=off -no-user-config -nodefaults \
-drive file=/home/guest/7.3guest-1.qcow2,if=none,id=drive-disk,format=qcow2,cache=none -device virtio-scsi-pci,id=scsi0 -device scsi-hd,drive=drive-disk,bus=scsi0.0,id=scsi-hd0 \
-netdev tap,id=hostnet1 -device virtio-net-pci,mac=42:ce:a9:d2:4d:d9,id=idlbq7eA,netdev=hostnet1 -usb -device usb-tablet,id=input0 -vga qxl -spice port=5902,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
-object memory-backend-ram,size=1G,id=mem0,policy=bind,host-nodes=0 -numa node,nodeid=0,memdev=mem0 \
-object memory-backend-ram,size=1G,id=mem1,policy=bind,host-nodes=1 -numa node,nodeid=1,memdev=mem1 \
-monitor stdio

2. Check smaps and numa_maps on the src host (a consolidated check sketch follows step 5 below).

3. Boot the guest on the dst host with the same command line plus "-incoming tcp:0:5555".

4. Start the migration from the src host:
(qemu) migrate -d tcp:xxxx:5555

5. After migration finishes, check smaps and numa_maps on the dst host (same check as in step 2).
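
For convenience, a consolidated sketch of the checks in steps 2 and 5. It assumes a single qemu process on the host and uses the 1G backend size from the command line above to pick out the two guest-RAM mappings; the monitor's "info memdev" command should also report the policy and host nodes configured for each backend (its output format varies across QEMU versions):

(qemu) info memdev

# pid=`pgrep qemu`
# for addr in `grep -B1 "Size: *1048576 kB" /proc/$pid/smaps | awk -F- '/^[0-9a-f]/ {print $1}'`; do grep "^$addr " /proc/$pid/numa_maps; done

On the src host one region should show bind:0 and the other bind:1, matching mem0 and mem1; in the failing case both regions on the dst host show bind:0.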

Actual results:
Step 2: on the src host, the memory is bound to the correct nodes.
# grep -2 `expr 1024 \* 1024 ` /proc/`pgrep qemu`/smaps
VmFlags: rd wr mr mw me ac sd 
7f6190a00000-7f61d0a00000 rw-p 00000000 00:00 0 
Size:            1048576 kB
Rss:              256000 kB
Pss:              256000 kB
--
VmFlags: mr mw me sd 
7f61d0c00000-7f6210c00000 rw-p 00000000 00:00 0 
Size:            1048576 kB
Rss:              210944 kB
Pss:              210944 kB

# grep  7f6190a00000 /proc/`pgrep qemu`/numa_maps
7f6190a00000 bind:1 anon=72192 dirty=72192 N1=72192 kernelpagesize_kB=4

# grep  7f61d0c00000  /proc/`pgrep qemu`/numa_maps
7f61d0c00000 bind:0 anon=103424 dirty=103424 N0=103424 kernelpagesize_kB=4


Step 5: on the dst host, the memory is bound to the wrong nodes (both 1G regions end up on host node 0).
# pgrep qemu
57777

#  grep -2 `expr 1024 \* 1024 ` /proc/57777/smaps
VmFlags: rd wr mr mw me ac sd 
7ff4a4600000-7ff4e4600000 rw-p 00000000 00:00 0 
Size:            1048576 kB
Rss:              550912 kB
Pss:              550912 kB
--
VmFlags: mr mw me sd 
7ff4e4800000-7ff524800000 rw-p 00000000 00:00 0 
Size:            1048576 kB
Rss:              520192 kB
Pss:              520192 kB

# grep 7ff4a4600000 /proc/57777/numa_maps 
7ff4a4600000 bind:0 anon=137728 dirty=137728 N0=137728 kernelpagesize_kB=4

# grep 7ff4e4800000 /proc/57777/numa_maps 
7ff4e4800000 bind:0 anon=130048 dirty=130048 N0=130048 kernelpagesize_kB=4


Expected results:
After migration, the memory should remain bound to the host nodes configured for each backend (node 0 for mem0, node 1 for mem1).

Additional info:

Comment 2 Amit Shah 2016-08-03 08:19:14 UTC
Can you check the binding on the dest before and after migration?

Also, can you give me access to the hosts?

Thanks,

Comment 4 Amit Shah 2016-08-23 20:42:34 UTC
Is there a way we can get numad to not rebalance tasks that have some binding done?

Alternatively, a note in the docs mentioning not to mix numad and manual pinning?
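
For reference, numad's explicit exclusion list looks like the relevant mechanism here: per numad(8), the -x option adds a PID to the list of processes numad should ignore. A minimal sketch, assuming a single qemu process and a numad version that supports -x:

# numad -x `pgrep qemu`

This only keeps numad away from that one process going forward; every manually pinned guest would need the same treatment, so a docs note still seems worthwhile.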

Comment 5 Eduardo Habkost 2016-09-19 17:48:20 UTC
(In reply to Amit Shah from comment #4)
> Is there a way we can get numad to not rebalance tasks that have some
> binding done?
> 
> Alternatively, a note in the docs mentioning not to mix numad and manual
> pinning?

There is a Doc Text suggestion at bug 1360584.

Comment 6 Eduardo Habkost 2016-09-19 17:48:40 UTC

*** This bug has been marked as a duplicate of bug 1360584 ***