Bug 1135893

Summary: qemu-kvm should report an error message when host's free hugepage memory < domain's memory

Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: rc
Reporter: Hu Jianwei <jiahu>
Assignee: Luiz Capitulino <lcapitulino>
QA Contact: Virtualization Bugs <virt-bugs>
CC: ajia, coli, dyuan, hhuang, honzhang, huding, jiahu, juli, juzhang, lcapitulino, mrezanin, mzhan, virt-maint, xfu
Fixed In Version: qemu-kvm-rhev-2.1.0-5.el7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-03-05 09:54:10 UTC

Description Hu Jianwei 2014-09-01 06:29:55 UTC
Description of problem:
qemu-kvm should report an error message when the host's free hugepage memory < the domain's memory

Version-Release number of selected component (if applicable):
libvirt-1.2.7-2.el7.x86_64
qemu-kvm-rhev-2.1.0-3.el7.x86_64
kernel-3.10.0-138.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
[root@localhost libvirt-1.2.7]# sysctl -a |grep vm.nr_hugepages
vm.nr_hugepages = 512
vm.nr_hugepages_mempolicy = 512
[root@localhost libvirt-1.2.7]# cat /proc/meminfo | grep -i huge
AnonHugePages:     20480 kB
HugePages_Total:     512
HugePages_Free:      512
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

[root@localhost libvirt-1.2.7]# virsh dumpxml r7 | grep huge -b6
0-<domain type='kvm'>
20-  <name>r7</name>
38-  <uuid>628e1918-eb89-4ded-8c10-f81e93b8eb7c</uuid>
90-  <memory unit='KiB'>2097152</memory>
128-  <currentMemory unit='KiB'>2097152</currentMemory>
180-  <memoryBacking>
198:    <hugepages/>
215-  </memoryBacking>
234-  <vcpu placement='static'>1</vcpu>
270-  <os>
277-    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
342-    <boot dev='hd'/>
363-  </os>
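
The numbers above already show the mismatch: with HugePages_Free at 512 and a Hugepagesize of 2048 kB, the host has 512 × 2048 kB = 1,048,576 kB (1 GiB) of free hugepage memory, while the domain requests 2,097,152 KiB (2 GiB), twice what is available.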

[root@localhost ~]# virsh start r7
error: Failed to start domain r7
error: internal error: process exited while connecting to monitor: 

[root@localhost ~]# strace /usr/libexec/qemu-kvm -name r7 -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 2048 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 628e1918-eb89-4ded-8c10-f81e93b8eb7c -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/r7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/r7_latest.img,if=none,id=drive-ide0-0-0,format=raw,cache=none -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -msg timestamp=on

...clipped...

rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [BUS USR1 ALRM IO], 8) = 0
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f25f3966000
mprotect(0x7f25f3966000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f25f4165cb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f25f41669d0, tls=0x7f25f4166700, child_tidptr=0x7f25f41669d0) = 17564
rt_sigprocmask(SIG_SETMASK, [BUS USR1 ALRM IO], NULL, 8) = 0
futex(0x7f260167df60, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f260167dec4, FUTEX_WAIT_PRIVATE, 1, NULL) = 0
ioctl(8, KVM_CHECK_EXTENSION, 0x19)     = 1024
brk(0)                                  = 0x7f2601bcc000
brk(0x7f2601bef000)                     = 0x7f2601bef000
ioctl(8, KVM_CHECK_EXTENSION, 0x19)     = 1024
statfs("/dev/hugepages/libvirt/qemu", {f_type=0x958458f6, f_bsize=2097152, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=2097152}) = 0
ioctl(8, KVM_CHECK_EXTENSION, 0x10)     = 1
open("/dev/hugepages/libvirt/qemu/qemu_back_mem.pc.ram.plj0si", O_RDWR|O_CREAT|O_EXCL, 0600) = 12
unlink("/dev/hugepages/libvirt/qemu/qemu_back_mem.pc.ram.plj0si") = 0
ftruncate(12, 2147483648)               = 0
mmap(NULL, 2147483648, PROT_READ|PROT_WRITE, MAP_PRIVATE, 12, 0) = -1 ENOMEM (Cannot allocate memory)
close(12)                               = 0
exit_group(1)                           = ?
+++ exited with 1 +++

Actual results:
As the steps above show, libvirt cannot capture the error message because the qemu-kvm-rhev binary exits without printing any error at all.

Expected results:
Return a clear error message.
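
For illustration only (this is not the upstream patch and not QEMU source), here is a minimal standalone C sketch of the open/ftruncate/mmap sequence from the strace above, with the kind of error report that was missing. The file name demo_back_mem is made up for the sketch; the path, the 2 GiB size, and the message wording follow this report (the message matches the one later seen in comment 13).

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Path and size taken from the strace above; the file name is made up. */
    const char *path = "/dev/hugepages/libvirt/qemu/demo_back_mem";
    const size_t ram_size = 2147483648UL; /* 2 GiB of guest RAM */

    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        fprintf(stderr, "unable to create backing store: %s\n", strerror(errno));
        return 1;
    }
    unlink(path); /* as in the strace: keep the fd, drop the name */

    if (ftruncate(fd, ram_size) < 0) {
        fprintf(stderr, "unable to size backing store: %s\n", strerror(errno));
        close(fd);
        return 1;
    }

    void *area = mmap(NULL, ram_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (area == MAP_FAILED) {
        /* The message the user should see instead of a silent exit(1). */
        fprintf(stderr, "unable to map backing store for hugepages: %s\n",
                strerror(errno));
        close(fd);
        return 1;
    }

    munmap(area, ram_size);
    close(fd);
    return 0;
}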

Comment 3 Luiz Capitulino 2014-09-09 12:39:30 UTC
Patch posted upstream:

http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01558.html

Comment 4 FuXiangChun 2014-09-09 14:33:55 UTC
Luiz,
KVM QE wants to try the brew build below, but it has been closed. Could you please provide it to QE again? Thanks.
https://brewweb.devel.redhat.com/taskinfo?taskID=7929807

Comment 6 FuXiangChun 2014-09-10 05:30:35 UTC
This bug can be reproduced with qemu-kvm-rhev-2.1.0-3.el7.x86_64 directly.

# sysctl -a |grep vm.nr_hugepages
vm.nr_hugepages = 2048
vm.nr_hugepages_mempolicy = 2048

# cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Boot the qemu-kvm process with this command line:
/usr/libexec/qemu-kvm -M pc -m 5G -smp 4,maxcpus=160  -monitor stdio -mem-prealloc -mem-path /mnt/kvm_hugepage

Result:
qemu-kvm quits without printing any message.


I tested several scenarios:
S1. -m 5G   HugePages_Free: 2048   fails
S2. -m 4G   HugePages_Free: 2048   works
S3. -m 2G   HugePages_Free: 1024   works
S4. -m 2G   HugePages_Free: 512    fails

In other words, qemu-kvm fails whenever the -m size exceeds HugePages_Free multiplied by the hugepage size (2048 kB here).

Comment 8 Luiz Capitulino 2014-09-10 20:10:24 UTC
Yes, whenever QEMU is assigned more hugepages than the host has available, the problem will happen.
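
To make that rule concrete, here is a hypothetical standalone C helper (not part of QEMU or libvirt; the function name hugepages_fit is invented for this sketch) that reads HugePages_Free and Hugepagesize from /proc/meminfo and checks whether a requested guest size fits:

#include <stdio.h>

/* Returns 1 if guest_kb fits in free hugepage memory, 0 if not, -1 on error. */
static int hugepages_fit(unsigned long long guest_kb)
{
    FILE *f = fopen("/proc/meminfo", "r");
    unsigned long long free_pages = 0, page_kb = 0, v;
    char line[256];

    if (!f) {
        return -1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "HugePages_Free: %llu", &v) == 1) {
            free_pages = v;
        } else if (sscanf(line, "Hugepagesize: %llu kB", &v) == 1) {
            page_kb = v;
        }
    }
    fclose(f);
    return guest_kb <= free_pages * page_kb;
}

int main(void)
{
    /* -m 5G from comment 6, expressed in kB */
    unsigned long long guest_kb = 5ULL * 1024 * 1024;
    int fit = hugepages_fit(guest_kb);

    if (fit < 0) {
        perror("/proc/meminfo");
        return 1;
    }
    printf("-m 5G %s\n", fit ? "fits in free hugepage memory"
                             : "exceeds free hugepage memory");
    return 0;
}

With HugePages_Free = 2048 and 2048 kB pages, 5 GiB (5,242,880 kB) exceeds 2048 × 2048 kB = 4 GiB, matching the S1 failure in comment 6.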

Comment 11 Miroslav Rezanina 2014-09-24 09:20:26 UTC
Fix included in qemu-kvm-rhev-2.1.0-5.el7

Comment 13 Jun Li 2014-09-29 05:01:17 UTC
Verify:

Version of components:
qemu-kvm-rhev-2.1.0-5.el7.x86_64
---
# mkdir /media/kvm_hugepage
# mount -t hugetlbfs none /media/kvm_hugepage
# echo 2048 > /proc/sys/vm/nr_hugepages 
# sysctl -a |grep vm.nr_hugepages
vm.nr_hugepages = 254
vm.nr_hugepages_mempolicy = 254
# cat /proc/meminfo | grep -i huge
AnonHugePages:     75776 kB
HugePages_Total:     254
HugePages_Free:      254
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
# /usr/libexec/qemu-kvm -M pc -m 5G -smp 4,maxcpus=160  -monitor stdio -mem-prealloc -mem-path /media/kvm_hugepage
QEMU 2.1.0 monitor - type 'help' for more information
(qemu) qemu-kvm: unable to map backing store for hugepages: Cannot allocate memory

Based on the above testing, this bug has been verified.
==========================
Reproduce:

Version of components:
qemu-kvm-rhev-2.1.0-2.el7.x86_64
---
# sysctl -a |grep vm.nr_hugepages
vm.nr_hugepages = 254
vm.nr_hugepages_mempolicy = 254
# cat /proc/meminfo | grep -i huge
AnonHugePages:     75776 kB
HugePages_Total:     254
HugePages_Free:      254
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
# /usr/libexec/qemu-kvm -M pc -m 5G -smp 4,maxcpus=160  -monitor stdio -mem-prealloc -mem-path /media/kvm_hugepage
QEMU 2.1.0 monitor - type 'help' for more information
(qemu) 

Based on the above testing, this bug has been reproduced.

Comment 16 errata-xmlrpc 2015-03-05 09:54:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html