Bug 1153590 - Improve error message on huge page preallocation
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Luiz Capitulino
QA Contact: Virtualization Bugs
Reported: 2014-10-16 05:25 EDT by Michal Privoznik
Modified: 2015-03-05 04:56 EST
CC: 5 users

Fixed In Version: qemu-kvm-rhev-2.1.2-7.el7
Doc Type: Bug Fix
Last Closed: 2015-03-05 04:56:45 EST
Type: Bug


External Trackers
Tracker: Red Hat Product Errata RHSA-2015:0624
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: qemu-kvm-rhev security, bug fix, and enhancement update
Last Updated: 2015-03-05 09:37:36 EST

Description Michal Privoznik 2014-10-16 05:25:13 EDT
Description of problem:
When there are not enough huge pages in the system and prealloc was requested on the command line, qemu fails with a not-so-helpful message (see Actual results below).


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Allocate a huge pages pool
2. Run qemu with huge-page memory backing that requires more pages than are in the pool
3. Observe the error
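
For steps 1 and 2, the state of the pool can be inspected directly through the standard sysfs and procfs counters before starting the guest. A minimal check on a host with a 2M huge page pool (values are illustrative) looks like this:

# cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages     # pages in the pool
# cat /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages   # pages still unused
# grep Huge /proc/meminfo                                        # summary of all pools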

Actual results:
> # virsh start migt10
> error: Failed to start domain migt10
> error: internal error: process exited while connecting to monitor: os_mem_prealloc: failed to preallocate pages


Expected results:
An error message along the lines of "not enough pages in the pool".

Additional info:
This can be reproduced via libvirt too. Just define a domain like this:

<domain type='kvm' id='2'>
  <name>migt10</name>
  <uuid>9dd81882-178b-4df0-8dae-ab864a079a4f</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0-3'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0-3'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
    <memnode cellid='2' mode='strict' nodeset='2'/>
    <memnode cellid='3' mode='strict' nodeset='3'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='262144'/>
      <cell id='1' cpus='1' memory='262144'/>
      <cell id='2' cpus='2' memory='262144'/>
      <cell id='3' cpus='3' memory='262144'/>
    </numa>
  </cpu>
  ...
</domain>

Then allocate, say, 128 2M huge pages:

# echo 128 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

and try to run the domain:

# virsh start migt10
error: Failed to start domain migt10
error: internal error: process exited while connecting to monitor: os_mem_prealloc: failed to preallocate pages
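
For reference, the shortfall in this reproducer is large: the domain asks for 1048576 KiB of guest RAM backed by 2048 KiB pages, which works out to 512 pages, while only 128 (or 127) were put in the pool. A quick arithmetic check (pool sized as above, output illustrative):

# echo $(( 1048576 / 2048 ))    # 2M pages the guest needs
512
# cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
128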


Interestingly, if I set the pool size to 127 I see a completely different error (which might serve as an example of a good error message):

# virsh start migt10
error: Failed to start domain migt10
error: internal error: process exited while connecting to monitor: 2014-10-16T09:23:53.652848Z qemu-kvm: -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=256M,id=ram-node0,host-nodes=0,policy=bind: unable to map backing store for hugepages: Cannot allocate memory
2014-10-16T09:23:53.653500Z qemu-kvm: -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=256M,id=ram-node1,host-nodes=1,policy=bind: unable to map backing store for hugepages: Cannot allocate memory
2014-10-16T09:23:53.653664Z qemu-kvm: -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=256M,id=ram-node2,host-nodes=2,policy=bind: unable to map backing store for hugepages: Cannot allocate memory
2014-10-16T09:23:53.653823Z qemu-kvm: -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=256M,id=ram-node3,host-nodes=3,policy=bind: unable to map backing store for hugepages: Cannot allocate memory
Comment 2 Michal Privoznik 2014-10-16 09:06:30 EDT
An even better error message could be "Insufficient free host memory pages available to allocate guest RAM".
Comment 3 Michal Privoznik 2014-10-16 09:35:43 EDT
Patch proposed upstream:

https://lists.gnu.org/archive/html/qemu-devel/2014-10/msg01778.html
Comment 4 Luiz Capitulino 2014-10-21 12:01:56 EDT
Michal, I'll review your patch upstream shortly. If you would like to backport it to RHEL yourself, just re-assign this BZ to you.
Comment 5 Miroslav Rezanina 2014-11-06 13:33:11 EST
Fix included in qemu-kvm-rhev-2.1.2-7.el7
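
Before re-testing, the installed build on the verification host can be confirmed with rpm; a quick check (output illustrative, matching the build above):

# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-2.1.2-7.el7.x86_64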
Comment 7 huiqingding 2014-11-17 01:45:59 EST
Reproduced this issue using the following versions:
kernel-3.10.0-203.el7.x86_64
qemu-kvm-rhev-2.1.2-7.el7.x86_64

Steps to Reproduce:
1. The host supports 1G huge pages:
cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-203.el7.x86_64 root=/dev/mapper/rhel_intel--brickland--0100-root ro console=ttyS0,115200n81 crashkernel=auto rd.lvm.lv=rhel_intel-brickland-0100/root rd.lvm.lv=rhel_intel-brickland-0100/swap systemd.debug LANG=en_US.UTF-8 intel_iommu=on hugepagesz=1G default_hugepagesz=1G

2. Assign one huge page to each NUMA node:
# echo 1 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages 
# echo 1 > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages 
# echo 1 > /sys/devices/system/node/node2/hugepages/hugepages-1048576kB/nr_hugepages 
# echo 1 > /sys/devices/system/node/node3/hugepages/hugepages-1048576kB/nr_hugepages 

3. Start a VM with 4 NUMA nodes and 2G of memory per node:
# virsh start vm1

PS: the XML file is as follows:
<domain type='kvm'>
  <name>vm1</name>
  <uuid>3e89cb31-73e0-4e78-8616-513127d3a14f</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB' nodeset='0-3'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0-3'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
    <memnode cellid='1' mode='strict' nodeset='1'/>
    <memnode cellid='2' mode='strict' nodeset='2'/>
    <memnode cellid='3' mode='strict' nodeset='3'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='2097152'/>
      <cell id='1' cpus='1' memory='2097152'/>
      <cell id='2' cpus='2' memory='2097152'/>
      <cell id='3' cpus='3' memory='2097152'/>
    </numa>
  </cpu>
...
</domain>

Results:
After step 3, the following error is output:
# virsh start vm1
error: Failed to start domain vm1
error: internal error: early end of file from monitor: possible problem:
os_mem_prealloc: failed to preallocate pages
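
The failure is expected: each NUMA cell requests 2097152 KiB backed by 1048576 KiB pages, i.e. two 1G pages per node, while step 2 allocated only one page per node. The arithmetic and the per-node free counters (same sysfs paths as in step 2, values illustrative) confirm it:

# echo $(( 2097152 / 1048576 ))    # 1G pages needed per NUMA cell
2
# cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/free_hugepages
1
1
1
1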
Comment 8 huiqingding 2014-11-17 01:49:13 EST
Tested this issue using the following versions:
kernel-3.10.0-203.el7.x86_64
qemu-kvm-rhev-2.1.2-7.el7.x86_64

Using the same steps as comment 7, after step 3 the following error is output:
# virsh start vm1
error: Failed to start domain vm1
error: internal error: early end of file from monitor: possible problem:
os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM


Based on the above result, I think this issue has been fixed.
Comment 11 errata-xmlrpc 2015-03-05 04:56:45 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0624.html
