Bug 1350735 - memory locking limit for regular users is too low to launch guests through libvirt
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.3
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assignee: David Gibson
QA Contact: Virtualization Bugs
Reported: 2016-06-28 09:09 UTC by Andrea Bolognani
Modified: 2016-11-07 21:20 UTC
CC: 12 users

Fixed In Version: qemu-kvm-rhev-2.6.0-12.el7
Clone Of: 1293024
Last Closed: 2016-11-07 21:20:24 UTC


Attachments
  qemu-kvm-rhev-2.6.0-9 libguestfs-test-tool log (57.40 KB, text/plain), 2016-07-12 09:51 UTC, Zhengtong
  qemu-kvm-rhev-2.6.0-12 libguestfs-test-tool log (60.05 KB, text/plain), 2016-07-12 09:51 UTC, Zhengtong
  qemu-kvm-rhev-2.6.0-9 libguestfs-test-tool log version 2 (failed) (8.38 KB, text/plain), 2016-07-13 05:34 UTC, Zhengtong
  guestfs log (2.68 KB, text/plain), 2016-07-14 05:58 UTC, Zhengtong


Links
  Red Hat Product Errata RHBA-2016:2673 (SHIPPED_LIVE): qemu-kvm-rhev bug fix and enhancement update, last updated 2016-11-08 01:06:13 UTC

Description Andrea Bolognani 2016-06-28 09:09:51 UTC
+++ This bug was initially created as a clone of Bug #1293024 +++

Description of problem:

When launching either a ppc64 or ppc64le guest (x86-64 host) I get:

ERROR    internal error: Process exited prior to exec: libvirt:  error : cannot limit locked memory to 46137344: Operation not permitted

Version-Release number of selected component (if applicable):

libvirt-1.3.0-1.fc24.x86_64
kernel 4.2.6-301.fc23.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Run this virt-install command:

virt-install --name=tmp-fed0fb92 --ram=4096 --vcpus=1 --os-type=linux --os-variant=fedora21 --arch ppc64le --machine pseries --initrd-inject=/tmp/tmp.sVjN8w5nyk '--extra-args=ks=file:/tmp.sVjN8w5nyk console=tty0 console=hvc0 proxy=http://cache.home.annexia.org:3128' --disk fedora-23-ppc64le,size=6,format=raw --serial pty --location=https://download.fedoraproject.org/pub/fedora-secondary/releases/21/Server/ppc64le/os/ --nographics --noreboot

(The same failure happens with ppc64).

--- Additional comment from Richard W.M. Jones on 2015-12-19 04:56:29 EST ---

It's OK with an x86-64 guest.

--- Additional comment from Richard W.M. Jones on 2015-12-19 05:00:33 EST ---

I worked around it by increasing my user account's locked memory
limit (ulimit -l) to unlimited.  I wonder if the error message comes
from qemu?
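
A minimal sketch of that workaround, run as the unprivileged user (the "64" shown is the usual default; raising the hard limit above its current value may itself require root or a limits.conf entry):

  $ ulimit -l            # current locked-memory limit, in KiB
  64
  $ ulimit -l unlimited  # raise it for this shell and its children
  $ killall libvirtd     # so the session daemon picks up the new limit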

--- Additional comment from Richard W.M. Jones on 2015-12-19 05:04:44 EST ---

Smallest reproducer is this command (NB: as NON-root):

$ virt-install --name=tmp-bz1293024 --ram=4096 --vcpus=1 --os-type=linux --os-variant=fedora22 --disk /var/tmp/fedora-23.img,size=6,format=raw --serial pty --location=https://download.fedoraproject.org/pub/fedora-secondary/releases/23/Server/ppc64le/os/ --nographics --noreboot --arch ppc64le

Note: If you are playing with ulimit, you have to kill libvirtd
since it could use the previous ulimit from another session.

--- Additional comment from Jan Kurik on 2016-02-24 09:09:40 EST ---

This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

--- Additional comment from Cole Robinson on 2016-03-16 19:43:19 EDT ---

Rich, do you still see this with latest rawhide?

(the mem locking error comes from libvirt... apparently ppc64 needs some explicit mem locking? that's what the code says, but I didn't dig deeper than that)

--- Additional comment from Richard W.M. Jones on 2016-03-17 12:11:30 EDT ---

There doesn't appear to be a Rawhide repo for ppc64le yet.

Unless something has changed in libvirt or virt-install to fix
this, I doubt very much that it is fixed.

--- Additional comment from Cole Robinson on 2016-03-17 12:13:51 EDT ---

Andrea, any thoughts on this? Have you seen this issue?

--- Additional comment from Richard W.M. Jones on 2016-03-24 14:46:38 EDT ---

Still happening on libvirt-1.3.2-3.fc24.x86_64 (x86-64 host, running
Ubuntu/ppc64le guest).

--- Additional comment from Andrea Bolognani on 2016-03-29 09:11:01 EDT ---

(In reply to Cole Robinson from comment #7)
> Andrea, any thoughts on this? Have you seen this issue?

I hadn't, thanks for bringing it up.

The issue Rich is seeing is caused by

  https://bugzilla.redhat.com/show_bug.cgi?id=1273480

having been fixed.

Short version is that ppc64 guests always need some amount
of memory to be locked, and that amount is guaranteed to be
more than the default 64 KiB allowance.

libvirt tries to raise the limit to prevent the allocation
from failing, but it can only do that successfully when
running as root.

--- Additional comment from Richard W.M. Jones on 2016-04-07 15:51:02 EDT ---

I set the architecture to ppc64le, but in fact it affects
ppc64 also.  In answer to comment 5, it affects Fedora 24 too.

--- Additional comment from Andrea Bolognani on 2016-04-08 04:55:42 EDT ---

(In reply to Richard W.M. Jones from comment #10)
> I set the architecture to ppc64le, but in fact it affects
> ppc64 also.  In answer to comment 5, it affects Fedora 24 too.

Yeah, this will affect both ppc64 variants and any version of
libvirt from 1.3.0 on.

Unfortunately I don't really see a way to fix this: the memory
locking limit really needs to be quite high on ppc64, definitely
higher than the default. The fact that this was not enforced
before was a bug and could lead to more trouble later on.

When libvirtd is running as root we can adjust the limit
ourselves quite easily; when it's running as a regular user,
we're of course unable to do that.

At least the error message is IMHO quite clear and hints at
the solution.

--- Additional comment from Cole Robinson on 2016-04-26 17:42:04 EDT ---

bug 1273480 seems to be all about hostdev assignment, which Rich isn't doing. I see this commit:

commit 16562bbc587add5a03a01c8eb8607c9e05819607
Author: Andrea Bolognani <abologna>
Date:   Fri Nov 13 10:58:07 2015 +0100

    qemu: Always set locked memory limit for ppc64 domains
    
    Unlike other architectures, ppc64 domains need to lock memory
    even when VFIO is not used.


But I don't see where the need for unconditional locked memory is explained... Can you point me to that discussion?

--- Additional comment from Andrea Bolognani on 2016-04-28 08:08:52 EDT ---

(In reply to Cole Robinson from comment #12)
> bug 1273480 seems to be all about hostdev assignment, which rich isn't
> doing. I see this commit:
> 
> commit 16562bbc587add5a03a01c8eb8607c9e05819607
> Author: Andrea Bolognani <abologna>
> Date:   Fri Nov 13 10:58:07 2015 +0100
> 
>     qemu: Always set locked memory limit for ppc64 domains
>     
>     Unlike other architectures, ppc64 domains need to lock memory
>     even when VFIO is not used.
> 
> 
> But I don't see where the need for unconditional locked memory is
> explained... Can you point me to that discussion?

See David's detailed explanation[1] from back when the patch
series was posted on libvir-list.

On a related note, there's been some progress recently toward
getting some of that memory actually accounted for.


[1] https://www.redhat.com/archives/libvir-list/2015-November/msg00769.html

--- Additional comment from Cole Robinson on 2016-04-29 08:00:32 EDT ---

Thanks for the pointer.  So if ppc64 doesn't do this memlocking, do things fail 100% of the time? Or is this a heuristic that maybe is triggering a false positive? Rich, maybe you can edit libvirt and figure it out.

If this has the potential to be wrong in the non-VFIO case, I suggest at least making it a non-fatal error if the daemon is unprivileged, and logging a VIR_WARN instead.

An additional bit we could do is have qemu-system-ppc64 ship a /etc/security/limits.d file to up the memlock limit on ppc64 hosts.

--- Additional comment from Andrea Bolognani on 2016-05-05 04:51:40 EDT ---

(In reply to Cole Robinson from comment #14)
> Thanks for the pointer.  So if ppc64 doesn't do this memlocking, do things
> fail 100% of the time? Or is this a heuristic that maybe is triggering a
> false positive? Rich maybe you can edit libvirt and figure it out.
> 
> If this has the potential to be wrong in the non-VFIO case, I suggest at
> least making it a non-fatal error if the daemon is unprivileged, and logging
> a VIR_WARN instead.
> 
> An additional bit we could do is have qemu-system-ppc64 ship a
> /etc/security/limits.d file to up the memlock limit on ppc64 hosts.

My understanding is that the consequences of not raising the
memory locking limit appropriately can be pretty severe.

David, can you give us more details please? What could happen
if users ran QEMU with the default memory locking limit of
64 KiB?

--- Additional comment from David Gibson on 2016-05-26 02:08:22 EDT ---

Cole,

The key thing here is that on ppc64, unlike x86, the hardware page tables are encoded as a big hash table, rather than a set of radix trees.  Each guest needs its own hashed page table (HPT).  These can get quite large - it can vary depending on a number of things, but the usual rule of thumb is that the HPT is 1/128th to 1/64th of RAM size, with a minimum size of 16MiB.
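
Applying that rule of thumb to concrete sizes (using the 1/128 ratio):

  4 GiB guest: 4096 MiB / 128 = 32 MiB HPT
  1 GiB guest: 1024 MiB / 128 =  8 MiB, bumped to the 16 MiB floor

Either figure dwarfs the default 64 KiB allowance, and is in the same ballpark as the 46137344 bytes (exactly 44 MiB) libvirt tries to lock in the error message at the top of this bug.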

For PAPR paravirtualized guests this HPT is accessed entirely via hypercall and does not exist within the guest's RAM - it needs to be allocated on the host above and beyond the guest's RAM image.  When using the "HV" KVM implementation (the only one we're targeting) the HPT has to be _host_ physically contiguous, unswappable memory (because it's read directly by hardware).

At the moment, the host kernel doesn't actually enforce the locked memory limit - it allows unprivileged users (with permission to create VMs) to allocate HPTs anyway, but this is really a bug.  As it stands a non-privileged user could create a whole pile of tiny VMs (it doesn't even need to actually execute any instructions in the VMs) and consume an unbounded amount of host memory with those 16MiB HPTs.

So we plan to fix that in the kernel.  In the meantime libvirt treats things as if the kernel enforced that limit even though it doesn't yet, to avoid having yet more ugly kernel version dependencies.


Andrea, would it make any sense to have failure of the setrlimit in libvirt cause only a warning, not a fatal error?  In that case it wouldn't prevent things working in situations where it can for other reasons (old kernel which doesn't enforce limits, PR KVM which doesn't require it..).

--- Additional comment from Peter Krempa on 2016-05-26 03:28:32 EDT ---

(In reply to David Gibson from comment #16)

[...]

> Andrea, would it make any sense to have failure of the setrlimit in libvirt
> cause only a warning, not a fatal error?  In that case it wouldn't prevent
> things working in situations where it can for other reasons (old kernel
> which doesn't enforce limits, PR KVM which doesn't require it..).

Not really. Warnings are not presented to the user, just logged to the log file, so it's very likely to get ignored.

--- Additional comment from Andrea Bolognani on 2016-05-26 04:20:07 EDT ---

(In reply to David Gibson from comment #16)
> Cole,
> 
> The key thing here is that on ppc64, unlike x86, the hardware page tables
> are encoded as a big hash table, rather than a set of radix trees.  Each
> guest needs its own hashed page table (HPT).  These can get quite large - it
> can vary depending on a number of things, but the usual rule of thumb is
> that the HPT is 1/128th to 1/64th of RAM size, with a minimum size of 16MiB.
> 
> For PAPR paravirtualized guests this HPT is accessed entirely via hypercall
> and does not exist within the guest's RAM - it needs to be allocated on the
> host above and beyond the guest's RAM image.  When using the "HV" KVM
> implementation (the only one we're targeting) the HPT has to be _host_
> physically contiguous, unswappable memory (because it's read directly by
> hardware).
> 
> At the moment, the host kernel doesn't actually enforce the locked memory
> limit - it allows unprivileged users (with permission to create VMs) to
> allocate HPTs anyway, but this is really a bug.

So IIUC the bug is that, by not accounting for that memory
properly, the kernel is allowing it to be allocated as
potentially non-contiguous and swappable, which will result
in failure right away (non-contiguous) or as soon as it has
been swapped out (swappable). Is that right?

> As it stands a non-privileged user
> could create a whole pile of tiny VMs (it doesn't even need to actually
> execute any instructions in the VMs) and consume an unbounded amount of host
> memory with those 16MiB HPTs.

That's not really something QEMU specific, though, is it?
The same user could just as easily start a bunch of random
processes, each one allocating 16MiB+ and get the same result.

> So we plan to fix that in the kernel.  In the meantime libvirt treats things
> as if the kernel enforced that limit even though it doesn't yet, to avoid
> having yet more ugly kernel version dependencies.
> 
> 
> Andrea, would it make any sense to have failure of the setrlimit in libvirt
> cause only a warning, not a fatal error?  In that case it wouldn't prevent
> things working in situations where it can for other reasons (old kernel
> which doesn't enforce limits, PR KVM which doesn't require it..).

I don't think that's a good idea.

First of all, we'd have to be able to tell whether raising
the limit is actually needed or not, which would probably be
tricky - especially considering that libvirt currently doesn't
know anything about the difference between HV and PR KVM.

Most importantly, we'd be allowing users to start guests that
we know full well may run into trouble later. I'd rather error
out early than have the guest behave erratically down the line
for no apparent reason.

Peter's point about warnings having very little visibility is
also a good one.

--- Additional comment from David Gibson on 2016-05-26 18:11:08 EDT ---

> > At the moment, the host kernel doesn't actually enforce the locked memory limit
> > - it allows unprivileged users (with permission to create VMs) to allocate
> > HPTs anyway, but this is really a bug.

> So IIUC the bug is that, by not accounting for that memory
> properly, the kernel is allowing it to be allocated as
> potentially non-contiguous and swappable, which will result
> in failure right away (non-contiguous) or as soon as it has
> been swapped out (swappable). Is that right?

No.  The HPT *will* be allocated contiguous and non-swappable (it's allocated with CMA) - it's just not accounted against the process / user's locked memory limit.  That's why this is a security bug.

> > As it stands a non-privileged user
> > could create a whole pile of tiny VMs (it doesn't even need to actually
> > execute any instructions in the VMs) and consume an unbounded amount of host
> > memory with those 16MiB HPTs.

> That's not really something QEMU specific, though, is it?
> The same user could just as easily start a bunch of random
> processes, each one allocating 16MiB+ and get the same result.

No, because in that case the memory would be non-contiguous and swappable.

--- Additional comment from Andrea Bolognani on 2016-06-09 10:26:56 EDT ---

Got it.

So I guess our options are:

  a) Raise locked memory limit for users to something like
     64 MiB, so they can run guests of reasonable size (4 GiB)
     without running into errors. Appliances created by
     libguestfs are going to be even smaller than that, I
     assume, so they would work

  b) Teach libvirt about the difference between kvm_hv and
     kvm_pr, only try to tweak the locked memory limit when
     using HV, and have libguestfs always use PR

  c) Force libguestfs to use the direct backend on ppc64

  d) Leave things as they are, basically restricting
     libguestfs usage to the root user

a) and c) are definitely hacks, but could be implemented
fairly quickly and removed once a better solution is in
place.

b) looks like it would be the proper solution but, as with
all things libvirt, rushing an implementation without thinking
hard about the design has the potential to paint us into a corner.

d) is probably not acceptable.

--- Additional comment from David Gibson on 2016-06-14 02:01:23 EDT ---

In the short term, I think we need to go with option (a).  That's the only really feasible way we can handle this in the next RHEL release, I think.

(b).. I really dislike.  We try to avoid explicitly exposing the PR/HV distinction even to qemu as much as possible - instead using explicit capabilities for various features.  Exposing and using that distinction a layer beyond qemu is going to open several new cans of worms.  For one thing, whether the kernel picks HV or PR can depend on a number of details of both host and guest configuration, so you can't really reliably know which one it's going to be before starting it.

(c) I'm not quite sure what "direct mode" entails.

(d) is.. yeah, certainly suboptimal.


Other things we could try:

(e) Change KVM so that if it's unable to allocate the HPT due to locked memory limit, it will fall back to PR-KVM.  In a sense that's the most pedantically correct, but I dislike it, because I suspect the result will be lots of people's VMs going slow for non-obvious reasons.

(f) Put something distinctive in the error qemu reports when it hits the HPT allocation problem, and only have libvirt try to alter the limit and retry if qemu dies with that error.  Involves an extra qemu invocation, which sucks.

(g) Introduce some new kind of "VM limits" stuff into RHEL startup scripts that will adjust users' locked memory limits based on some sort of number-of-VMs and max-VM-size values configured by the admin.  This is basically a sophisticated version of (a).


Ugh.. none of these are great :/.

--- Additional comment from Andrea Bolognani on 2016-06-14 06:33:32 EDT ---

(In reply to David Gibson from comment #21)
> In the short term, I think we need to go with option (a).  That's the only
> really feasible way we can handle this in the next RHEL release, I think.

I guess we would have to make qemu-kvm-rhev ship a
/etc/security/limits.d/qemu-kvm-rhev-memlock.conf file that
sets the new limit. It wouldn't make sense to raise the
limit for hosts that are not going to act as hypervisors.

> (b).. I really dislike.  We try to avoid explicitly exposing the PR/HV
> distinction even to qemu as much as possible - instead using explicit
> capabilities for various features.  Exposing and using that distinction a
> layer beyond qemu is going to open several new cans of worms.  For one
> thing, whether the kernel picks HV or PR can depend on a number of details
> of both host and guest configuration, so you can't really reliably know
> which one it's going to be before starting it.

Okay then.

> (c) I'm not quite sure what "direct mode" entails.

Basically libguestfs will call QEMU itself instead of going
through libvirt. guestfish will give you this hint:

  libguestfs: error: could not create appliance through libvirt.

  Try running qemu directly without libvirt using this environment variable:
  export LIBGUESTFS_BACKEND=direct

and if you do that you'll of course be able to avoid the error
raised by libvirt.

I don't know what other implications there are to using the
direct backend, though. Rich?

> (d) is.. yeah, certainly suboptimal.
> 
> 
> Other things we could try:
> 
> (e) Change KVM so that if it's unable to allocate the HPT due to locked
> memory limit, it will fall back to PR-KVM.  In a sense that's the most
> pedantically correct, but I dislike it, because I suspect the result will be
> lots of people's VMs going slow for non-obvious reasons.

Yeah, doing this kind of stuff outside of the user's control is
never going to end well. Better to fail with a clear error
message than to try to patch things up behind the scenes.

> (f) Put something distinctive in the error qemu reports when it hits the HPT
> allocation problem, and only have libvirt try to alter the limit and retry
> if qemu dies with that error.  Involves an extra qemu invocation, which
> sucks.

libvirt is not really designed in a way that allows you to
just try calling QEMU with some arguments and, if that fails,
call it again with different arguments. So QEMU would have to
expose the information through QMP somehow, for libvirt to
probe beforehand. I'm not sure whether this approach would
even be feasible.

> (g) Introduce some new kind of "VM limits" stuff into RHEL startup scripts
> that will adjust users' locked memory limits based on some sort of
> number-of-VMs and max-VM-size values configured by the admin. This is
> basically a sophisticated version of (a).

The limits are per-process, though. So the only thing
that really matters is how much memory you want to allow
for an unprivileged guest. PCI passthrough is not going
to be a factor unless you're root, and in that case you
can set the limit as you please.

> Ugh.. none of these are great :/.

--- Additional comment from Daniel Berrange on 2016-06-14 06:40:48 EDT ---

(In reply to Andrea Bolognani from comment #22)
> (In reply to David Gibson from comment #21)
> > In the short term, I think we need to go with option (a).  That's the only
> > really feasible way we can handle this in the next RHEL release, I think.
> 
> I guess we would have to make qemu-kvm-rhev ship a
> /etc/security/limits.d/qemu-kvm-rhev-memlock.conf file that
> sets the new limit. It wouldn't make sense to raise the
> limit for hosts that are not going to act as hypervisors.

Such files will have no effect. The limits.conf files are processed by PAM, and when libvirt launches QEMU and sets its UID, PAM is not involved in any way.

IOW, if we need to set limits for QEMU, libvirt has to set them explicitly. The same would apply for other apps launching QEMU, unless they actually use 'su' to run QEMU as a different account, which I don't believe any do.
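
One way to see this from a shell (a sketch using util-linux prlimit; assumes a session libvirtd is running and that pgrep finds exactly one):

  $ prlimit --memlock --pid "$(pgrep -u "$USER" -x libvirtd)"
  RESOURCE DESCRIPTION                        SOFT  HARD UNITS
  MEMLOCK  max locked-in-memory address space 65536 65536 bytes

A limits.d change applied via PAM won't show up here until the daemon is restarted from a fresh login session, which is exactly the effect described above.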

--- Additional comment from Andrea Bolognani on 2016-06-14 07:14:00 EDT ---

(In reply to Daniel Berrange from comment #23)
> > I guess we would have to make qemu-kvm-rhev ship a
> > /etc/security/limits.d/qemu-kvm-rhev-memlock.conf file that
> > sets the new limit. It wouldn't make sense to raise the
> > limit for hosts that are not going to act as hypervisors.
> 
> Such files will have no effect. The limits.conf files are processed by PAM,
> and when libvirt launches QEMU and sets its UID, PAM is not involved in any
> way.
> 
> IOW, if we need to set limits for QEMU, libvirt has to set them explicitly.
> The same would apply for other apps launching QEMU, unless they actually use
> 'su' to run QEMU as a different account, which I don't believe any do.

For user sessions, the libvirt daemon is autostarted and
will inherit the user's limits.

I tried dropping

  *       hard    memlock         64000
  *       soft    memlock         64000

in /etc/security/limits.d/qemu-kvm-rhev-memlock.conf and,
after logging out and in again, I was able to install a
guest and use guestfish from my unprivileged account.
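
A quick way to confirm the new limit took effect after re-login (output assumed; limits.conf memlock values and ulimit -l are both in KiB):

  $ ulimit -l
  64000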

--- Additional comment from Richard W.M. Jones on 2016-06-14 07:28:43 EDT ---

(In reply to Andrea Bolognani from comment #22)
> > (c) I'm not quite sure what "direct mode" entails.
> 
> Basically libguestfs will call QEMU itself instead of going
> through libvirt. guestfish will give you this hint:
> 
>   libguestfs: error: could not create appliance through libvirt.
> 
>   Try running qemu directly without libvirt using this environment variable:
>   export LIBGUESTFS_BACKEND=direct
> 
> and if you do that you'll of course be able to avoid the error
> raised by libvirt.
> 
> I don't know what other implications there are to using the
> direct backend, though. Rich?

It's not supported, nor encouraged in RHEL.  In this case it's a DIY
workaround, but it ought to be fixed in libvirt (or qemu, or wherever,
but in any case not by end users).

--- Additional comment from Andrea Bolognani on 2016-06-28 05:01:55 EDT ---

Moving this to qemu, as the only short-term (and possibly
long-term) solution seems to be the one outlined in
Comment 20 (proposal A) and POC-ed in Comment 24, ie. ship
a /etc/security/limits.d/qemu-memlock.conf file that raises
the memory locking limit to something like 64 MiB, thus
allowing regular users to run smallish guests.

Comment 1 Miroslav Rezanina 2016-07-01 08:24:34 UTC
Fix included in qemu-kvm-rhev-2.6.0-11.el7

Comment 3 David Gibson 2016-07-04 03:29:31 UTC
Andrea Bolognani and Richard Jones pointed out an error in my patch - it only allowed members of group kvm to have the increased locked memory limit.  I was working from the mistaken assumption that only members of this group could create KVM VMs anyway.

Moving back to ASSIGNED, until a fix for that problem is posted.
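
In limits.conf terms the difference is just the first field; a sketch with illustrative values:

  # flawed first attempt: applies only to members of group kvm
  @kvm    hard    memlock    65536
  @kvm    soft    memlock    65536

  # what the follow-up fix needs: applies to all users
  *       hard    memlock    65536
  *       soft    memlock    65536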

Comment 4 Miroslav Rezanina 2016-07-08 08:40:32 UTC
Fix included in qemu-kvm-rhev-2.6.0-12.el7

Comment 5 Zhengtong 2016-07-12 05:30:29 UTC
I reproduced the problem with a RHEL 7 guest on the ppc64le platform.

Packages:
Host kernel: 3.10.0-459.el7.ppc64le
qemu: qemu-kvm-rhev-2.6.0-9.el7

User is "testuser" (non-root)

bootcmd: 
virt-install --name=rhelguest --ram=4096 --vcpus=1 --os-type=linux --os-variant=rhel7 --arch ppc64le --machine pseries '--extra-args=console=tty0 console=hvc0' --disk /tmp/rhel7.2-ppc64le,size=6,format=raw --serial pty --location=http://download.eng.bos.redhat.com/rel-eng/RHEL-7.2-20151030.0/compose/Server/ppc64le/os/ --nographics --noreboot

result:
Starting install...
Retrieving file vmlinuz...                                                   |  18 MB  00:00:00     
Retrieving file initrd.img...                                                |  35 MB  00:00:00     
Allocating 'rhel7.2-ppc64le'                                                 | 6.0 GB  00:00:00     
ERROR    internal error: Process exited prior to exec: el0,id=channel0,name=org.qemu.guest_agent.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
libvirt:  error : cannot limit locked memory to 46137344: Operation not permitted

===============================================================================


Then I upgraded qemu to qemu-kvm-rhev-2.6.0-12.el7 and ran the command again:

[testuser@ibm-p8-rhevm-13 ~]$ sh virt_boot.sh 
WARNING  No --console device added, you likely will not see text install output from the guest.

Starting install...
Retrieving file vmlinuz...                                                   |  18 MB  00:00:00     
Retrieving file initrd.img...                                                |  35 MB  00:00:00     
Allocating 'rhel7.2-ppc64le'                                                 | 6.0 GB  00:00:00     
Creating domain...                                                           |    0 B  00:00:00     
Connected to domain rhelguest
Escape character is ^]
Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
Populating /pci@800000020000000
                     00 2800 (D) : 1af4 1002    unknown-legacy-device*
                     00 2000 (D) : 1af4 1001    virtio [ block ]
                     00 1800 (D) : 1af4 1003    communication-controller*
                     00 1000 (D) : 106b 003f    serial bus [ usb-ohci ]
                     00 0800 (D) : 1af4 1000    virtio [ net ]
No NVRAM common partition, re-initializing...
Scanning USB 
  OHCI: initializing
Using default console: /vdevice/vty@30000000
Detected RAM kernel at 400000 (15448c4 bytes)      
  Welcome to Open Firmware

  Copyright (c) 2004, 2011 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php

Booting from memory...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 3.10.0-327.el7.ppc64le (mockbuild.eng.bos.redhat.com) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Thu Oct 29 17:31:13 EDT 2015
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: console=tty0 console=hvc0 inst.repo=http://download.eng.bos.redhat.com/rel-eng/RHEL-7.2-20151030.0/compose/Server/ppc64le/os/
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000001960000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000100000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000100000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000003c70000 -> 0x0000000003c70a1d
Device tree struct  0x0000000003c80000 -> 0x0000000003c90000
Calling quiesce...
returning from prom_init
[    0.000000] Using pSeries machine description
[    0.000000] Page sizes from device-tree:
[    0.000000] base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
[    0.000000] base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
[    0.000000] Using 1TB segments
[    0.000000] Found initrd at 0xc000000001960000:0xc000000003c3f90c
[    0.000000] bootconsole [udbg0] enabled
[    0.000000] Partition configured for 1 cpus.
[    0.000000] CPU maps initialized for 1 thread per core
[    0.000000] Starting Linux PPC64 #1 SMP Thu Oct 29 17:31:13 EDT 2015
[    0.000000] -----------------------------------------------------
[    0.000000] ppc64_pft_size                = 0x19
[    0.000000] physicalMemorySize            = 0x100000000
[    0.000000] htab_hash_mask                = 0x3ffff
[    0.000000] -----------------------------------------------------
....

The guest boots up fine.

Comment 6 Richard W.M. Jones 2016-07-12 07:38:25 UTC
(In reply to Zhengtong from comment #5)
> I reproduced the problem under RHEL7 guest at ppc64le platform
> 
> packages:
> Host kernel:3.10.0-459.el7.ppc64le
> qemu: qemu-kvm-rhev-2.6.0-9.el7
> 
> User is "testuser"(non-root)

If you have time can you test it using the command

  libguestfs-test-tool

(as testuser / non-root)?  It should fail with the old qemu and
succeed with the new qemu.
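
For reference, a successful run ends like this (abridged; assuming the banner libguestfs-test-tool prints on success):

  $ libguestfs-test-tool
  ...
  ===== TEST FINISHED OK =====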

Comment 7 Zhengtong 2016-07-12 09:49:47 UTC
(In reply to Richard W.M. Jones from comment #6)
> (In reply to Zhengtong from comment #5)
> > I reproduced the problem under RHEL7 guest at ppc64le platform
> > 
> > packages:
> > Host kernel:3.10.0-459.el7.ppc64le
> > qemu: qemu-kvm-rhev-2.6.0-9.el7
> > 
> > User is "testuser"(non-root)
> 
> If you have time can you test it using the command
> 
>   libguestfs-test-tool
> 
> (as testuser / non-root)?  It should fail with the old qemu and
> succeed with the new qemu.

I tested libguestfs-test-tool with qemu-kvm-rhev-2.6.0-9 and qemu-kvm-rhev-2.6.0-12 and attached the logs. I can't say which result is good, because I am not familiar with the libguestfs tools, but from what I can see in the logs, they both run well.

Comment 8 Zhengtong 2016-07-12 09:51:16 UTC
Created attachment 1178824 [details]
qemu-kvm-rhev-2.6.0-9 libguestfs-test-tool log

Comment 9 Zhengtong 2016-07-12 09:51:44 UTC
Created attachment 1178825 [details]
qemu-kvm-rhev-2.6.0-12 libguestfs-test-tool log

Comment 10 Richard W.M. Jones 2016-07-12 10:17:22 UTC
I would have expected the test in comment 8 to fail with the
error "cannot limit locked memory to ...".  The test in comment 9
shows the expected output ("TEST FINISHED OK").

There are several reasons why it might have worked when it would
be expected to fail:

 - You didn't log out & log in after changing qemu version.  Therefore
   the ulimit was not changed.

 - Session libvirtd has a 30 second timeout, and was still running
   and still had the ulimit settings from the previous qemu installation.
   You can do (as non-root): 'killall libvirtd' to kill any previous
   session libvirtd that is hanging around.

Comment 12 Zhengtong 2016-07-13 05:34:54 UTC
Created attachment 1179057 [details]
qemu-kvm-rhev-2.6.0-9 libguestfs-test-tool log version 2 (failed)

Comment 13 Zhengtong 2016-07-13 05:35:54 UTC
Yes, after restarting the libvirtd service and reconnecting the session, libguestfs-test-tool failed; the log is attached.

Comment 14 Richard W.M. Jones 2016-07-13 09:11:12 UTC
Libvirt screws up the error message completely.

I guess you'll find the real error message in
~/.cache/libvirt/qemu/log/guestfs-nagohys0uvhu508w.log

Comment 15 Andrea Bolognani 2016-07-13 12:23:49 UTC
(In reply to Richard W.M. Jones from comment #14)
> Libvirt screws up the error message completely.
> 
> I guess you'll find the real error message in
> ~/.cache/libvirt/qemu/log/guestfs-nagohys0uvhu508w.log

I noticed this a while ago, but got sidetracked and forgot
to investigate further. I just filed Bug 1356108 to track
the issue.

Comment 16 Zhengtong 2016-07-14 05:57:35 UTC
Yes, there is an error in the log file:
libvirt:  error : cannot limit locked memory to 18874368: Operation not permitted

Comment 17 Zhengtong 2016-07-14 05:58:08 UTC
Created attachment 1179598 [details]
guestfs log

Comment 18 Qunfang Zhang 2016-07-14 06:16:58 UTC
Setting to VERIFIED according to comment 5.

Comment 20 errata-xmlrpc 2016-11-07 21:20:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html

