Bug 917708

Summary: Re-enable CONFIG_USER_NS
Product: [Fedora] Fedora
Reporter: Daniel Berrangé <berrange>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: batrick, bertrand.noel, cscsordas, drjohnson1, fullung, fweimer, gansalmon, itamar, ja, jonathan, jpeeler, kernel-maint, luto, maci, madhu.chinakonda, mitr, oskari, pmatouse
Target Milestone: ---
Keywords: FutureFeature, Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: kernel-3.15.10-201.fc20
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 918577
Environment:
Last Closed: 2014-08-22 17:24:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Daniel Berrangé 2013-03-04 15:35:05 UTC
Description of problem:
A while back we disabled CONFIG_USER_NS. This reflected the fact that the existing user namespace support was essentially fubar:

commit 1f83f3a2352f83f2718e6b0a97e1f0737721135a
Author: Josh Boyer <jwboyer>
Date:   Fri May 25 16:14:15 2012 -0400

    Drop CONFIG_USER_NS
    
    Upstream e1c972b681bf118fcedb9fe2ed7a73de983aa5ef makes it depend on
    UIDGID_CONVERTED which is only set when all of the subsystems have been
    converted to be user namespace safe.  That defaults to Y whenever it happens,
    so we'll set this after that point.


As of 3.8, the foundation of the new user namespace support has been merged, and the 3.9 kernel improves on it further. We want to support user namespaces in libvirt for the LXC driver, so we need CONFIG_USER_NS to be re-enabled in F19 kernels.

We did recently enable CONFIG_NAMESPACES

commit f39ad019437fa300f5d2e05ee89154825cae737c
Author: Josh Boyer <jwboyer>
Date:   Wed Feb 6 11:44:09 2013 -0500

    Enable CONFIG_NAMESPACES everywhere (rhbz 907576)


but even though CONFIG_NAMESPACES is documented as enabling all namespaces, it does not in fact enable CONFIG_USER_NS.

Version-Release number of selected component (if applicable):
kernel-3.9.0-0.rc0.git14.1.fc19.x86_64

How reproducible:
Always

Steps to Reproduce:
1. ls /proc/self/ns
  
Actual results:
'user' is not listed

Expected results:
'user' is listed
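
The check in the steps above can also be scripted. This is only a minimal sketch: the helper name has_user_namespace is made up for illustration, and it assumes procfs is mounted at /proc as on a normal Fedora install.

```python
import os


def has_user_namespace(ns_dir="/proc/self/ns"):
    """Return True if the kernel exposes a 'user' entry for this process.

    Hypothetical helper; ns_dir is a parameter only so the check can be
    pointed at another directory when testing.
    """
    try:
        return "user" in os.listdir(ns_dir)
    except OSError:
        # No procfs, or a kernel that does not provide the ns directory.
        return False


if __name__ == "__main__":
    print("user namespace entry present:", has_user_namespace())
```

On a kernel built without CONFIG_USER_NS the function should return False, which matches the "Actual results" above.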

Additional info:

Comment 1 Josh Boyer 2013-03-04 15:52:38 UTC
3.8 required a bunch of filesystems to still be disabled before USER_NS could be turned on.  3.9 improves on that as you said, but USER_NS still depends on UIDGID_CONVERTED, which is defined as:

config UIDGID_CONVERTED
        # True if all of the selected software conmponents are known
        # to have uid_t and gid_t converted to kuid_t and kgid_t
        # where appropriate and are otherwise safe to use with
        # the user namespace.
        bool
        default y

        # Filesystems
        depends on XFS_FS = n


So we would have to turn off XFS in order to enable USER_NS.  We're not going to do that.
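
To see whether a given kernel config is affected, something like the following could be used. It is a rough sketch that models only the single "depends on XFS_FS = n" line from the fragment quoted above, not the full Kconfig dependency graph. The function names are made up for illustration, and the config file location (for example /boot/config-<version> on Fedora) is an assumption.

```python
def parse_kernel_config(text):
    """Map option names (without the CONFIG_ prefix) to their values.

    Options recorded as '# CONFIG_FOO is not set' are stored as 'n'.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            name = line[len("# CONFIG_"):-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line[len("CONFIG_"):].split("=", 1)
            options[name] = value
    return options


def user_ns_blocked_by_xfs(options):
    """True if XFS is built in or modular, which forces UIDGID_CONVERTED off."""
    return options.get("XFS_FS", "n") != "n"


if __name__ == "__main__":
    sample = "CONFIG_XFS_FS=m\n# CONFIG_USER_NS is not set\n"
    print(user_ns_blocked_by_xfs(parse_kernel_config(sample)))
```

With XFS built as a module, as in the sample text, the check reports that USER_NS is blocked.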

Comment 2 Daniel Berrangé 2013-03-04 15:58:52 UTC
Looks like XFS is the only thing blocking this though, so assuming we got a solution to XFS, would you be ok to enable it?

Comment 3 Josh Boyer 2013-03-04 16:03:13 UTC
(In reply to comment #2)
> Looks like XFS is the only thing blocking this though, so assuming we got a
> solution to XFS, would you be ok to enable it?

As far as I know, yes.  Eric Biederman said:

"XFS is the only filesystem that remains.  I was hoping I could get that
in this release so that user namespace support would be enabled with an
allyesconfig or an allmodconfig but it looks like the xfs changes need
another couple of days before it they are ready."

in his upstream 3.9 pull request.  Whether "couple of days" means "I'll submit this after the merge window closes" or "I'll submit this for 3.10" remains to be seen.

Comment 4 Daniel Berrangé 2013-03-06 11:41:00 UTC
3.9-rc1 is out and the XFS stuff is not included, so looks like we're heading for 3.10 territory now.

Comment 5 Petr Matousek 2013-03-13 16:42:28 UTC
Unprivileged user namespaces are not ready. Here are a couple of very recent examples why:

  * http://stealth.openwall.net/xSports/clown-newuser.c

  * https://lkml.org/lkml/2013/3/1/603

  * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=db04dc679bcc780ad6907943afe24a30de974a1b

  * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c61a2810a2161986353705b44d9503e6bb079f4f

Enabling CONFIG_USER_NS on a kernel with upstream commit 5eaf563e53294d6696e651466697eb9d491f3946 (userns: Allow unprivileged users to create user namespaces) that removes the privilege checks when requesting CLONE_NEWUSER would currently put our users at big risk.

--
Petr Matousek / Red Hat Security Response Team

Comment 6 Josh Boyer 2013-03-13 17:20:21 UTC
(In reply to comment #5)
> Unprivileged user namespaces are not ready. Here are a couple of very recent
> examples why:
> 
>   * http://stealth.openwall.net/xSports/clown-newuser.c
> 
>   * https://lkml.org/lkml/2013/3/1/603
> 
>   * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=db04dc679bcc780ad6907943afe24a30de974a1b
> 
>   * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c61a2810a2161986353705b44d9503e6bb079f4f
> 
> Enabling CONFIG_USER_NS on a kernel with upstream commit
> 5eaf563e53294d6696e651466697eb9d491f3946 (userns: Allow unprivileged users
> to create user namespaces) that removes the privilege checks when requesting
> CLONE_NEWUSER would currently put our users at big risk.

Yeah.  Hopefully all that gets cleared up for 3.10.  I'm putting the FutureFeature keyword here to make it clear we aren't doing this soon.

Comment 7 Daniel Berrangé 2013-09-19 16:26:03 UTC
FYI upstream has now converted XFS to work with user ns:

  commit d6970d4b726cea6d7a9bc4120814f95c09571fc3
  Author: Dwight Engen <dwight.engen>
  Date:   Thu Aug 15 14:08:04 2013 -0400

    enable building user namespace with xfs
    
    Reviewed-by: Dave Chinner <dchinner>
    Reviewed-by: Gao feng <gaofeng.com>
    Signed-off-by: Dwight Engen <dwight.engen>
    Signed-off-by: Ben Myers <bpm>

The Fedora 21 rawhide kernel looks like it includes this change, so can we get USER_NS enabled in Fedora rawhide now?

If there are still security concerns, then I would be ok with having change 5eaf563e53294d6696e651466697eb9d491f3946 reverted in Fedora kernels initially. This would let privileged users make use of user namespaces, without increasing attack surface exposed to non-privileged users of the host.

Libvirt only needs to be able to use CLONE_NEWUSER when running as root, so this would be sufficient to let libvirt work with user namespaces in Fedora initially.

Comment 8 Petr Matousek 2013-09-19 16:51:25 UTC
(In reply to Daniel Berrange from comment #7)
> If there are still security concerns, then I would be ok with having change
> 5eaf563e53294d6696e651466697eb9d491f3946 reverted in Fedora kernels
> initially. This would let privileged users make use of user namespaces,
> without increasing attack surface exposed to non-privileged users of the
> host.
> 
> Libvirt only needs to be able to use CLONE_NEWUSER when running as root, so
> this would be sufficient to let libvirt work with user namespaces in Fedora
> initially.

I'm fine with this approach.

Thanks,
Petr

Comment 9 Josh Boyer 2013-11-13 19:15:06 UTC
My apologies for the delay here.

I've taken the suggestion Daniel made and pushed it to today's rawhide git tree.  The first kernel to contain the support should be:

kernel-3.13.0-0.rc0.git3.2.fc21

and will hopefully be in rawhide tomorrow.

I'm going to leave this bug as modified and I would really appreciate it if people could test the kernel and tell me if it's suitable for their needs.

Comment 10 Daniel Berrangé 2013-11-14 09:00:44 UTC
Thanks Josh, I'll do some tests from libvirt's POV and report back.

Comment 11 Josh Boyer 2013-12-04 21:28:51 UTC
(In reply to Daniel Berrange from comment #10)
> Thanks Josh, I'll do some tests from libvirt's POV and report back.

Did that happen?

Comment 12 Daniel Berrangé 2013-12-05 09:52:18 UTC
Oops, yes. This is basically working from libvirt's POV. We did find one regression in the 3.12 kernel, but it isn't a show stopper, so for now we're fine to wait for the upstream fix to trickle down.

Comment 13 Daniel Berrangé 2013-12-05 09:53:37 UTC
For reference, the bug I was referring to above is this one: https://lists.linuxfoundation.org/pipermail/containers/2013-November/033635.html

Comment 14 Josh Boyer 2013-12-16 15:19:04 UTC
OK, close this out then.

Comment 15 Andy Lutomirski 2014-08-22 00:10:52 UTC
It looks like unprivileged clone(CLONE_NEWUSER) is disabled, but unprivileged unshare(CLONE_NEWUSER) is enabled.  This makes no sense.

This appears to be caused by:

ApplyPatch Revert-userns-Allow-unprivileged-users-to-create-use.patch

Can we just drop this?  This creates a weird incompatibility between Fedora and everything else.
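
For anyone who wants to check the unshare side on their own kernel, here is a rough probe. It is only a sketch: the function name probe_unshare_newuser is made up, it assumes Linux with a libc that exports unshare(), and it does not cover the clone(CLONE_NEWUSER) path. Run it as an unprivileged user, since root is expected to succeed either way.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>


def probe_unshare_newuser():
    """Try unshare(CLONE_NEWUSER) in a forked child.

    Returns 0 on success, the child's errno on failure, or -1 if the
    child did not exit normally. Hypothetical helper for illustration.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = libc.unshare  # resolve before fork so lookup errors stay in the parent
    pid = os.fork()
    if pid == 0:
        # Child: any new namespace disappears when this process exits.
        try:
            rc = unshare(CLONE_NEWUSER)
            code = 0 if rc == 0 else (ctypes.get_errno() & 0xFF)
        except BaseException:
            code = 255
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    result = probe_unshare_newuser()
    if result == 0:
        print("unshare(CLONE_NEWUSER) succeeded")
    elif result > 0:
        print("unshare(CLONE_NEWUSER) failed:", os.strerror(result))
    else:
        print("child did not exit normally")
```

The call runs in a forked child so the calling process keeps its original namespace whatever the outcome.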

Comment 16 Josh Boyer 2014-08-22 00:58:53 UTC
(In reply to Andy Lutomirski from comment #15)
> It looks like unprivileged clone(CLONE_NEWUSER) is disabled, but
> unprivileged unshare(CLONE_NEWUSER) is enabled.  This makes no sense.

Yeah, Petr noticed this last week and I forgot about it over the weekend.

> This appears to be caused by:
> 
> ApplyPatch Revert-userns-Allow-unprivileged-users-to-create-use.patch
> 
> Can we just drop this?  This creates a weird incompatibility between Fedora
> and everything else.

Maybe.  It's either drop it or disallow unshare(CLONE_NEWUSER).

Originally we went with the revert because userns had a number of exploitable CVEs when it came out.  I believe there were more than a few even after we started carrying that patch.  We fixed them all anyway, but the idea was to mitigate things until userns matured.  It's not immediately obvious to me that it's matured, but I would allow that it's gotten somewhat better in the meantime.

Comment 17 Josh Boyer 2014-08-22 01:04:02 UTC
FWIW, Petr found that Ubuntu has a somewhat more complete approach to disallow both here:

http://kernel.ubuntu.com/git?p=serge/ubuntu-saucy.git;a=commitdiff;h=5c847404dcb2e

so it isn't just Fedora that is changing the behavior here.

Comment 18 Andy Lutomirski 2014-08-22 01:09:46 UTC
Serge and I were discussing an LSM hook to control userns creation at KS/LSS this week.

In any case, I think his patch should be checking ns_capable relative to the would-be parent of the new namespace, not relative to the root userns.  After all, none of the userns holes so far have allowed one to break out of the parent namespace in a way that couldn't be done just as easily by someone with CAP_SYS_ADMIN in the parent.

Comment 19 Josh Boyer 2014-08-22 16:34:49 UTC
We talked about this a bit today.  I think we're going to just drop the patch entirely and go with upstream here.  Once more unto the breach and all that...

Comment 20 Josh Boyer 2014-08-22 17:24:40 UTC
Done in Fedora git.  It will be in the next build for each respective branch.

Comment 21 Fedora Update System 2014-08-28 12:17:51 UTC
kernel-3.15.10-201.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.15.10-201.fc20

Comment 22 Fedora Update System 2014-08-30 03:58:09 UTC
kernel-3.15.10-201.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 23 Patrick Donnelly 2014-11-10 21:33:01 UTC
Is there any chance of this, unprivileged user namespaces, getting backported to 3.10 for RHEL7?

Comment 24 Marcel Wysocki 2015-03-06 12:34:46 UTC
+1 for support in RHEL7

Comment 25 Josh Boyer 2015-03-06 13:00:53 UTC
Feature requests for RHEL need to be done either via the Customer Portal or through a bug reported against the RHEL product.  Comments in this bug will not be seen by the appropriate people.  Thank you.