Description of problem:
=======================
The effect of using this option with -mem-path and a mounted hugetlbfs is directly opposite to its description in the --help output, to upstream qemu behaviour, and to the expectations of other software (e.g. libvirt). That is, guest memory is preallocated when this option is *not* specified, and vice versa.

Version-Release number of selected component (if applicable):
=============================================================
I used qemu-kvm-0.12.1.2-2.355.el6, but this should apply to all RHEL6 qemu-kvm packages.

How reproducible:
=================
100%

Steps to Reproduce:
===================
1. pre-allocate huge pages through sysctl
2. mount hugetlbfs
3. run qemu-kvm with -mem-path and without -mem-prealloc under strace
4. confirm that the mmap() system call for the vm address space had the MAP_POPULATE flag set
5. run qemu-kvm with -mem-path and with -mem-prealloc under strace
6. confirm that the mmap() system call for the vm address space had no MAP_POPULATE flag set

Actual results:
===============
qemu-kvm sets the MAP_POPULATE flag when mapping the address space for a vm without the -mem-prealloc option specified
qemu-kvm doesn't set the MAP_POPULATE flag when mapping the address space for a vm with the -mem-prealloc option specified

Expected results:
=================
qemu-kvm should set the MAP_POPULATE flag when mapping the address space for a vm with the -mem-prealloc option specified
qemu-kvm should not set the MAP_POPULATE flag when mapping the address space for a vm without the -mem-prealloc option specified

Additional info:
================
In the RHEL5.x kvm-*.src.rpm's [1] this option was added by the patch called kvm-kvm-qemu-preallocate-mem-path-memory-by-default.patch
Its behaviour, although a bit non-intuitive, was clearly documented. The default value is ON, and the cmdline option flips this value:

> +#ifdef MAP_POPULATE
> +int mem_prealloc = 1; /* force preallocation of physical target memory */
> +#endif
>
> ...
>
> +#ifdef MAP_POPULATE
> + "-mem-prealloc toggles preallocation of -mem-path backed physical memory\n"
> + " at startup. Default is enabled.\n"
> +#endif
>
> ...
>
> +#ifdef MAP_POPULATE
> + case QEMU_OPTION_mem_prealloc:
> + mem_prealloc = !mem_prealloc;
> + break;
> +#endif

All this perfectly describes its current (RHEL6.x) behaviour. The problem is that it's not properly documented anymore. The --help says:

-mem-prealloc preallocate guest memory (use with -mempath)

Upstream qemu has this option included [2] with the default value OFF, and an explicit switch to ON if it's on the command line:

> +#ifdef MAP_POPULATE
> +int mem_prealloc = 0; /* force preallocation of physical target memory */
> +#endif
>
> ...
>
> +#ifdef MAP_POPULATE
> + case QEMU_OPTION_mem_prealloc:
> + mem_prealloc = 1;
> + break;
> +#endif

Even libvirt expects preallocation to happen when this option is specified on qemu's cmdline [3].

It seems that 3 years ago some confused users had already been burned by this option's rather surprising description/behaviour (see bug #582282), but somehow the problem was not traced and fixed at the time. Although, I believe, in RHEL 5.x it should still have had the correct description in the --help output ;).

---
[1] RHEL 5 Server SRPMS repository
http://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/
[2] QEMU upstream commitdiff
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=c902760fb25f9c490af01e8f6bccaa8dd71cc224;hp=60e4c6317b8773d987729401aeca9d8c6b61b05f
[3] Libvirt support commitdiff
> http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=edcae5a7c4ce05dacd4e475e2b94cc65bb2beff8;hp=98ea78b6ee04733ca69c67df5a196c33dd4829ae
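To make the difference between the two option-handling styles concrete, here is a minimal sketch in C. It is not code from either qemu tree: the function names prealloc_toggle, prealloc_set and guest_ram_mmap_flags are hypothetical, and the choice of MAP_PRIVATE as the base mapping flag is an assumption made only for illustration. The sketch models a default of ON that each occurrence of the option flips (the RHEL patch quoted in the report) against a default of OFF that the option sets (the upstream patch), and then shows how the resulting value would decide whether MAP_POPULATE is requested.

```c
#define _GNU_SOURCE /* exposes MAP_POPULATE in <sys/mman.h> on glibc */
#include <assert.h>
#include <sys/mman.h>

/* RHEL-style handling: default ON, every -mem-prealloc occurrence flips it. */
int prealloc_toggle(int occurrences)
{
    assert(occurrences >= 0);
    int mem_prealloc = 1;
    for (int i = 0; i < occurrences; i++) {
        mem_prealloc = !mem_prealloc;
    }
    return mem_prealloc;
}

/* Upstream-style handling: default OFF, any occurrence switches it ON. */
int prealloc_set(int occurrences)
{
    assert(occurrences >= 0);
    int mem_prealloc = 0;
    if (occurrences > 0) {
        mem_prealloc = 1;
    }
    return mem_prealloc;
}

/*
 * Hypothetical helper (assumption): build mmap() flags for guest RAM.
 * MAP_PRIVATE as the base flag is illustrative only.
 */
int guest_ram_mmap_flags(int mem_prealloc)
{
    int flags = MAP_PRIVATE;
#ifdef MAP_POPULATE
    if (mem_prealloc) {
        flags |= MAP_POPULATE; /* ask the kernel to fault pages in up front */
    }
#endif
    return flags;
}
```

With the toggle style, passing the option once yields 0, so guest_ram_mmap_flags() would omit MAP_POPULATE, which matches the inverted behaviour described in the report. With the set style, passing the option once yields 1.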
Volodymyr, thanks for taking the time to enter a bug report with us and for all the references and patches. We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products. That being said, we're not able to guarantee the timeliness or suitability of a resolution for issues entered here because this is not a mechanism for requesting support. If this issue is critical or in any way time sensitive, please raise a ticket through your regular Red Hat support channels to make certain it receives the proper attention and prioritization to assure a timely resolution. For information on how to contact the Red Hat production support team, please visit: https://www.redhat.com/support/process/production/#howto
If I understand correctly, all that is required is to initialize mem_prealloc = 0. Upstream qemu doesn't use MAP_POPULATE anymore, but it preallocates only if -mem-prealloc is passed.

vl.c:int mem_prealloc = 0; /* force preallocation of physical target memory */
exec.c: if (mem_prealloc) {
exec.c: os_mem_prealloc(fd, area, memory);
vl.c: case QEMU_OPTION_mem_prealloc:
vl.c: mem_prealloc = 1;

(they forgot to update the comment)

I attached a patch that should fix the problem; can you test it? Otherwise I can provide an RPM next Monday (holiday here tomorrow).
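The body of os_mem_prealloc() is not shown in this report. As a rough illustration of how memory can be preallocated without MAP_POPULATE, the sketch below touches the first byte of every page in a mapped region so that the kernel has to back each page. The helper name touch_pages and its signature are hypothetical, and this is not a copy of upstream's implementation, which may differ (for example in error handling).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT qemu's os_mem_prealloc(): fault in every page of
 * [area, area + len) by reading and rewriting its first byte.
 * Returns the number of pages touched.
 */
size_t touch_pages(volatile unsigned char *area, size_t len, size_t page_size)
{
    assert(area != NULL);
    if (page_size == 0) {
        return 0;
    }
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page_size) {
        /* The write forces allocation; rewriting the same value keeps contents. */
        area[off] = area[off];
        touched++;
    }
    return touched;
}
```

A caller would pass the pointer returned by mmap(), the mapping length, and the huge page size of the backing hugetlbfs mount. If there are not enough free huge pages, touching a page of a hugetlbfs mapping can raise SIGBUS, so real code needs to handle that case.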
Created attachment 926919 [details] fix -mem-prealloc to match upstream semantics
Fix included in qemu-kvm-0.12.1.2-2.442.el6
Verified the bug on qemu-kvm-0.12.1.2-2.442.el6:

0) On host:
# mount -t hugetlbfs none /mnt/kvm_hugepage
# echo 2048 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo | grep -i hugepage
HugePages_Total: 2048
HugePages_Free: 2048
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

1) Boot up a guest with "-mem-prealloc"
/usr/libexec/qemu-kvm -M rhel6.6.0 -cpu SandyBridge -mem-prealloc -mem-path /mnt/kvm_hugepage -m 2G -smp 2,sockets=1,cores=2,threads=1,maxcpus=16 -enable-kvm -name rhel6.6 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 -k en-us -rtc base=localtime,clock=host,driftfix=slew -nodefaults -monitor stdio -qmp tcp:0:6666,server,nowait -boot menu=on -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor unix:/tmp/monitor-unix,nowait,server -drive file=/root/RHEL-Server-6.6-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2e:28:1c,bus=pci.0,addr=0x4 -vga std -vnc :10 -usb -device usb-tablet

Check on host:
# cat /proc/meminfo | grep -i hugepage
HugePages_Total: 2048
HugePages_Free: 1016
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

2) Shut down the guest and boot it up without "-mem-prealloc":
(Similar CLI as above, just remove the -mem-prealloc parameter)

Check on host:
# cat /proc/meminfo | grep -i hugepage
HugePages_Total: 2048
HugePages_Free: 2045
HugePages_Rsvd: 1029
HugePages_Surp: 0
Hugepagesize: 2048 kB

Based on the above, the bug is fixed.
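The two /proc/meminfo snapshots in this verification can be compared with a little arithmetic: pages already backed by physical memory are HugePages_Total minus HugePages_Free, and pages promised but not yet faulted in are HugePages_Rsvd. The sketch below computes both from meminfo-style text. The function names are hypothetical, and it assumes that the guest is the only consumer of huge pages on the host.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the integer following `key` in meminfo-style text, or -1 if absent. */
long meminfo_field(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    const char *p = strstr(text, key);
    if (p == NULL) {
        return -1;
    }
    return strtol(p + strlen(key), NULL, 10); /* strtol skips leading spaces */
}

/* Huge pages already faulted in: Total - Free. Returns -1 on missing fields. */
long hugepages_faulted(const char *text)
{
    long total = meminfo_field(text, "HugePages_Total:");
    long free_pages = meminfo_field(text, "HugePages_Free:");
    if (total < 0 || free_pages < 0) {
        return -1;
    }
    return total - free_pages;
}

/* Huge pages claimed overall: faulted in plus reserved. -1 on missing fields. */
long hugepages_committed(const char *text)
{
    long faulted = hugepages_faulted(text);
    long rsvd = meminfo_field(text, "HugePages_Rsvd:");
    if (faulted < 0 || rsvd < 0) {
        return -1;
    }
    return faulted + rsvd;
}
```

Applied to the figures above, the run with -mem-prealloc has 1032 pages faulted in and none reserved, while the run without it has 3 pages faulted in and 1029 reserved. Both add up to 1032 committed pages, which is consistent with the same guest memory being populated up front in one case and on demand in the other.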
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1490.html