Bug 1211918

Summary: when booting a guest with mlock=on, the nr_mlock value in the host's /proc/vmstat does not match the memory size specified on the command line
Product: Red Hat Enterprise Linux 7
Reporter: Yanhui Ma <yama>
Component: qemu-kvm-rhev
Assignee: Luiz Capitulino <lcapitulino>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.2
CC: hhuang, juzhang, mazhang, mdeng, michen, virt-maint, xfu, ypu, yuhuang
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-09-24 18:01:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Yanhui Ma 2015-04-15 08:47:49 UTC
Description of problem:

When booting a guest with mlock=on, the nr_mlock value in the host's /proc/vmstat does not match the memory size specified on the command line; it changes irregularly.

Version-Release number of selected component (if applicable):

kernel:
3.10.0-237.el7.x86_64
qemu-kvm:
qemu-kvm-rhev-2.2.0-8.el7.x86_64


How reproducible:

100%

Steps to Reproduce:
1. Check the mlock-related page counters on the host before booting a guest:

# grep nr_mlock /proc/vmstat 
# grep nr_unevictable /proc/vmstat

2. Boot a guest with mlock=on:
cmd:

/usr/libexec/qemu-kvm -machine q35,accel=kvm -name rhel7.2 \
-m 1536 -cpu Opteron_G3,check -smp 4,sockets=2,cores=2,threads=1,maxcpus=32 \
-realtime mlock=on \
-object memory-backend-ram,host-nodes=0,policy=bind,id=mem-0,size=512M,prealloc=yes -numa node,memdev=mem-0 \
-object memory-backend-ram,host-nodes=1,policy=bind,id=mem-1,size=1024M,prealloc=yes -numa node,memdev=mem-1 \
-uuid 49a3438a-70a3-4ba8-92ce-3a05e0934608 -nodefaults -rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=discard -no-shutdown \
-global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot order=c,menu=on,strict=on \
-drive file=/media/RHEL-server-7.2-64.qcow2,if=none,id=drive-data-disk1,cache=writethrough,format=qcow2,aio=native,werror=stop,rerror=stop \
-device virtio-scsi-pci,id=scsi1,addr=0x13 \
-device scsi-hd,drive=drive-data-disk1,bus=scsi1.0,id=data-disk1,bootindex=0 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=de:e0:7a:d2:b4:56,bus=pcie.0,addr=0x14 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=8,bus=pcie.0,addr=0x11 \
-spice port=5000,password=redhat,seamless-migration=on,agent-mouse=off \
-usb -device usb-tablet,id=input0 -monitor stdio

3. Check the mlock-related page counters on the host again.



Actual results:
after step 1:

# grep nr_mlock /proc/vmstat
nr_mlock 0
# grep nr_unevictable /proc/vmstat
nr_unevictable 0

after step 3:

The nr_mlock value (which equals nr_unevictable) rises slowly from 331527 (fewer pages than 1536M would require) to 472146 (more than 1536M).
When compiling a kernel in the guest, nr_mlock on the host quickly drops to 3584, then climbs back to about 472146.
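For reference, the fluctuation described above can be observed with a simple sampling loop on the host (a sketch; the sample count and interval are arbitrary choices, not part of the original test):

```shell
#!/bin/sh
# Sample the host's nr_mlock counter a few times to observe fluctuation.
# Requires a Linux host with /proc/vmstat.
for i in 1 2 3; do
    awk '/^nr_mlock / {print $2}' /proc/vmstat
    sleep 1
done
```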


Expected results:
after step 3:
The nr_mlock should be 393216 pages (1536 MiB * 1024 KiB/MiB / 4 KiB per page), and it should not fluctuate significantly.

Additional info:
The memory page size is 4K.

Comment 2 Luiz Capitulino 2015-09-24 18:01:09 UTC
This is not a bug.

nr_mlock changing dynamically is OK because it also accounts for QEMU's stack and dynamically allocated memory, which grow and shrink all the time. It also accounts for other processes that may be created or destroyed while you run the test.

If you're checking nr_mlock as part of a test-case verification, you have to check for the minimum value it should have. This minimum value (discarding QEMU's own memory, which will change over time) is:

(guest-memory * 1024 * 1024) / 4096

In your case this gives 393216, which looks correct. I've also tested this with latest kernel and qemu-kvm-rhev, and it works as expected.
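The minimum-value check described above can be sketched as a small shell snippet (the GUEST_MEM_MB variable and the live comparison are illustrative additions, not part of any shipped test):

```shell
#!/bin/sh
# Compute the minimum nr_mlock expected for a fully mlocked guest,
# per the formula: (guest-memory-in-MB * 1024 * 1024) / page-size.
GUEST_MEM_MB=1536   # guest RAM from the -m option
PAGE_SIZE=4096      # 4K pages, as noted in the bug report
MIN_PAGES=$(( GUEST_MEM_MB * 1024 * 1024 / PAGE_SIZE ))
echo "expected minimum nr_mlock: $MIN_PAGES"
# On the host, the live counter should be at least this minimum:
# actual=$(awk '/^nr_mlock / {print $2}' /proc/vmstat)
# [ "$actual" -ge "$MIN_PAGES" ] && echo PASS || echo FAIL
```

For the 1536M guest in this report, the snippet prints 393216, matching the value above.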

PS: While testing this I found out that hugetlb pages are not accounted for, but we already have bug 1221443 for this. I'll add my findings there.