Bug 1006027 - -mem-prealloc option behaviour is opposite to expected
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: x86_64 Linux
Priority: unspecified
Severity: low
Assigned To: Andrea Arcangeli
QA Contact: Virtualization Bugs
Reported: 2013-09-09 15:54 EDT by Volodymyr G. Lukiianyk
Modified: 2014-10-14 02:50 EDT
Fixed In Version: qemu-kvm-0.12.1.2-2.442.el6
Doc Type: Bug Fix
Last Closed: 2014-10-14 02:50:44 EDT
Type: Bug


Attachments
fix -mem-prealloc to match upstream semantics (1.01 KB, patch)
2014-08-14 17:40 EDT, Andrea Arcangeli
Description Volodymyr G. Lukiianyk 2013-09-09 15:54:20 EDT
Description of problem:
=======================

When used with -mem-path and a mounted hugetlbfs, this option has the opposite effect to what the --help output describes, what upstream qemu does, and what other software (e.g. libvirt) expects. That is, guest memory is preallocated when this option is *not* specified, and is not preallocated when it is.


Version-Release number of selected component (if applicable):
=============================================================
    I used qemu-kvm-0.12.1.2-2.355.el6, but all RHEL6 qemu-kvm packages
    should be affected.


How reproducible:
=================
    100%


Steps to Reproduce:
===================
1. pre-allocate huge pages through sysctl
2. mount hugetlbfs

3. run qemu-kvm with -mem-path and without -mem-prealloc under strace
4. confirm that mmap() system call for vm address space had MAP_POPULATE
   flag set

5. run qemu-kvm with -mem-path and with -mem-prealloc under strace
6. confirm that mmap() system call for vm address space had no MAP_POPULATE
   flag set


Actual results:
===============
qemu-kvm sets MAP_POPULATE flag when mapping address space for vm without -mem-prealloc option specified

qemu-kvm doesn't set MAP_POPULATE flag when mapping address space for vm with -mem-prealloc option specified


Expected results:
=================
qemu-kvm should set MAP_POPULATE flag when mapping address space for vm with -mem-prealloc option specified

qemu-kvm should not set MAP_POPULATE flag when mapping address space for vm without -mem-prealloc option specified
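
The expected behaviour can be sketched as a small C helper. This is an illustration only: `guest_ram_mmap_flags` is a hypothetical name, and MAP_PRIVATE as the base flag is an assumption, not something taken from the qemu-kvm source.

```c
#define _GNU_SOURCE            /* exposes MAP_POPULATE on glibc */
#include <assert.h>
#include <sys/mman.h>

/* Hypothetical helper (not from qemu-kvm): choose the mmap() flags for the
 * guest RAM mapping from the parsed value of -mem-prealloc. */
int guest_ram_mmap_flags(int mem_prealloc)
{
    int flags = MAP_PRIVATE;   /* assumed base flag, for illustration only */

    assert(mem_prealloc == 0 || mem_prealloc == 1);
#ifdef MAP_POPULATE
    if (mem_prealloc) {
        flags |= MAP_POPULATE; /* ask the kernel to fault pages in at mmap() time */
    }
#endif
    return flags;
}
```

Under this sketch, an strace of a run without -mem-prealloc would show no MAP_POPULATE in the mmap() call, which is the expected result listed above.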



Additional info:
================

In the RHEL 5.x kvm-*.src.rpm packages [1], this option was added by the patch called

    kvm-kvm-qemu-preallocate-mem-path-memory-by-default.patch

Its behaviour, although a bit non-intuitive, was clearly documented. The default value is ON, and the command-line option flips it:

>  +#ifdef MAP_POPULATE
>  +int mem_prealloc = 1;  /* force preallocation of physical target memory */
>  +#endif
> 
>  ...
> 
>  +#ifdef MAP_POPULATE
>  +           "-mem-prealloc   toggles preallocation of -mem-path backed physical memory\n"
>  +           "                at startup.  Default is enabled.\n"
>  +#endif
> 
>  ...
> 
>  +#ifdef MAP_POPULATE
>  +            case QEMU_OPTION_mem_prealloc:
>  +               mem_prealloc = !mem_prealloc;
>  +               break;
>  +#endif

All this perfectly describes its current (RHEL6.x) behaviour. The problem is that it's not properly documented anymore. The --help says:

    -mem-prealloc        preallocate guest memory (use with -mempath)

Upstream qemu includes this option [2] with a default value of OFF, and switches it explicitly to ON if it is given on the command line:

>  +#ifdef MAP_POPULATE
>  +int mem_prealloc = 0; /* force preallocation of physical target memory */
>  +#endif
> 
>  ...
> 
>  +#ifdef MAP_POPULATE
>  +            case QEMU_OPTION_mem_prealloc:
>  +                mem_prealloc = 1;
>  +                break;
>  +#endif
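
The difference between the two quoted variants can be shown side by side. The following is a minimal sketch: the function names and the simplified argument scan are hypothetical, and only the default values and the flip/set logic come from the patches quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* RHEL 5.x/6.x variant: default on, every occurrence of the option
 * flips the value. */
int prealloc_toggle_semantics(int argc, const char *const argv[])
{
    int mem_prealloc = 1;

    assert(argv != NULL);
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-mem-prealloc") == 0) {
            mem_prealloc = !mem_prealloc;
        }
    }
    return mem_prealloc;
}

/* Upstream variant: default off, the option switches it on. */
int prealloc_set_semantics(int argc, const char *const argv[])
{
    int mem_prealloc = 0;

    assert(argv != NULL);
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-mem-prealloc") == 0) {
            mem_prealloc = 1;
        }
    }
    return mem_prealloc;
}
```

For the same command line the two functions always return opposite values as long as the option appears at most once, which is exactly the mismatch reported here.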

Even libvirt expects preallocation to happen when this option is specified on qemu's command line [3].

It seems that 3 years ago some confused users had already been burned by this option's rather surprising description/behaviour (see bug #582282), but the problem was not traced and fixed at the time. That said, I believe RHEL 5.x should still have had the correct description in its --help output ;).

---

[1] RHEL 5 Server SRPMS repository
    http://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/

[2] QEMU upstream commitdiff
    http://git.qemu.org/?p=qemu.git;a=commitdiff;h=c902760fb25f9c490af01e8f6bccaa8dd71cc224;hp=60e4c6317b8773d987729401aeca9d8c6b61b05f

[3] Libvirt support commitdiff
    http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=edcae5a7c4ce05dacd4e475e2b94cc65bb2beff8;hp=98ea78b6ee04733ca69c67df5a196c33dd4829ae
Comment 2 Ademar Reis 2013-09-16 15:25:51 EDT
Volodymyr, thanks for taking the time to enter a bug report with us and for all the references and patches. We appreciate the feedback and look to use reports such as this to guide our efforts at improving our products.

That being said, we're not able to guarantee the timeliness or suitability of a resolution for issues entered here because this is not a mechanism for requesting support.

If this issue is critical or in any way time sensitive, please raise a ticket
through your regular Red Hat support channels to make certain it receives the
proper attention and prioritization to assure a timely resolution.

For information on how to contact the Red Hat production support team, please
visit: https://www.redhat.com/support/process/production/#howto
Comment 4 Andrea Arcangeli 2014-08-14 17:38:01 EDT
If I understand correctly, all that is required is to initialize mem_prealloc = 0.

Upstream qemu doesn't use MAP_POPULATE anymore, but it preallocates only if -mem-prealloc is passed.

vl.c:int mem_prealloc = 0; /* force preallocation of physical target memory */
exec.c:    if (mem_prealloc) {
exec.c:        os_mem_prealloc(fd, area, memory);
vl.c:            case QEMU_OPTION_mem_prealloc:
vl.c:                mem_prealloc = 1;

(they forgot to update the comment)
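
Preallocating without MAP_POPULATE amounts to touching every page of the mapping after mmap() returns. A minimal sketch of that idea, assuming a fixed page size: `touch_pages` is a hypothetical stand-in, not upstream's os_mem_prealloc, whose body is not shown in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a preallocation helper: touch one byte in
 * every page of [area, area + len) so the kernel has to back each page.
 * Returns the number of pages touched. */
size_t touch_pages(volatile unsigned char *area, size_t len, size_t page_size)
{
    size_t touched = 0;

    assert(area != NULL);
    assert(page_size > 0);
    for (size_t off = 0; off < len; off += page_size) {
        area[off] = area[off];  /* read and write back, contents unchanged */
        touched++;
    }
    return touched;
}
```

Writing each byte back unchanged keeps the sketch safe to run on memory that already holds data.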

I attached a patch that should fix the problem; can you test it? Otherwise I can provide an RPM next Monday (holiday here tomorrow).
Comment 5 Andrea Arcangeli 2014-08-14 17:40:16 EDT
Created attachment 926919
fix -mem-prealloc to match upstream semantics
Comment 10 Jeff Nelson 2014-08-28 23:28:32 EDT
Fix included in qemu-kvm-0.12.1.2-2.442.el6
Comment 12 Qunfang Zhang 2014-09-01 03:36:46 EDT
Verified the bug on qemu-kvm-0.12.1.2-2.442.el6:

0) On host:

# mount -t hugetlbfs none /mnt/kvm_hugepage 
# echo 2048 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo | grep -i hugepage 

HugePages_Total:    2048
HugePages_Free:     2048
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB



1) Boot up a guest with "-mem-prealloc"

/usr/libexec/qemu-kvm -M rhel6.6.0 -cpu SandyBridge \
    -mem-prealloc -mem-path /mnt/kvm_hugepage -m 2G \
    -smp 2,sockets=1,cores=2,threads=1,maxcpus=16 -enable-kvm \
    -name rhel6.6 -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
    -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
    -k en-us -rtc base=localtime,clock=host,driftfix=slew -nodefaults \
    -monitor stdio -qmp tcp:0:6666,server,nowait -boot menu=on \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -monitor unix:/tmp/monitor-unix,nowait,server \
    -drive file=/root/RHEL-Server-6.6-64-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
    -device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2e:28:1c,bus=pci.0,addr=0x4 \
    -vga std -vnc :10 -usb -device usb-tablet

Check on host:

# cat /proc/meminfo | grep -i hugepage 

HugePages_Total:    2048
HugePages_Free:     1016
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

2) Shut down the guest and boot it up without "-mem-prealloc" (same CLI as above, with the -mem-prealloc parameter removed):

Check on host: 

# cat /proc/meminfo | grep -i hugepage 

HugePages_Total:    2048
HugePages_Free:     2045
HugePages_Rsvd:     1029
HugePages_Surp:        0
Hugepagesize:       2048 kB

Based on the above, the bug is fixed.
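
The /proc/meminfo figures above can be cross-checked with a little arithmetic. This is a sketch under one assumption: that reserved pages (HugePages_Rsvd) are still counted in HugePages_Free until they are faulted in. The function names are hypothetical.

```c
#include <assert.h>

/* Huge pages needed to back mem_kb of guest RAM, rounded up. */
unsigned long hugepages_needed(unsigned long mem_kb, unsigned long hugepage_kb)
{
    assert(hugepage_kb > 0);
    return (mem_kb + hugepage_kb - 1) / hugepage_kb;
}

/* Pages already faulted in: HugePages_Total minus HugePages_Free. */
unsigned long hugepages_in_use(unsigned long total, unsigned long free_pages)
{
    assert(free_pages <= total);
    return total - free_pages;
}

/* Pages committed to mappings: faulted in, plus reserved but not yet
 * faulted in (HugePages_Rsvd). */
unsigned long hugepages_committed(unsigned long total, unsigned long free_pages,
                                  unsigned long rsvd)
{
    return hugepages_in_use(total, free_pages) + rsvd;
}
```

With -mem-prealloc, 2048 - 1016 = 1032 pages are in use and none are reserved. Without it, only 2048 - 2045 = 3 pages are in use while 1029 are reserved. Both runs commit 1032 pages, 8 more than the 1024 that 2G of guest RAM needs at 2048 kB per page; the report does not say what the extra pages back.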
Comment 14 errata-xmlrpc 2014-10-14 02:50:44 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1490.html
