Bug 1765417

Summary: Podman container with CAP_SYS_ADMIN can not perform administration operation(mount) successfully
Product: Red Hat Enterprise Linux 8 Reporter: Zhi Li <yieli>
Component: container-selinux Assignee: Daniel Walsh <dwalsh>
Status: CLOSED NOTABUG QA Contact: atomic-bugs <atomic-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.1 CC: bfields, dwalsh, jiyin, jligon, jnovy, lsm5, mheon, steved, tsweeney, xzhou
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-20 15:39:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Strace output of mount procedure none
avc.log none

Description Zhi Li 2019-10-25 03:14:54 UTC
Description of problem:
When running a podman container with --cap-add CAP_SYS_ADMIN (a capability that permits a range of system administration operations [1]), an NFS mount performed inside the container fails. This differs from docker run with "--cap-add SYS_ADMIN", where the same mount succeeds. (Is this the expected behavior?)

Version-Release number of selected component (if applicable):
kernel-4.18.0-147.el8.x86_64

Steps to Reproduce:
[root@xxx ~]# uname -r
4.18.0-147.el8.x86_64
[root@SERVER ~]# mkdir /export
mkdir: cannot create directory ‘/export’: File exists
[root@SERVER ~]# echo "/export *(rw,no_root_squash)" > /etc/exports
[root@SERVER ~]# systemctl restart nfs-server
[root@SERVER ~]#  exportfs -v
/export       <world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
###############
Test in podman:
###############
[root@SERVER ~]# podman run --cap-add CAP_SYS_ADMIN -it centos:8
                                     ^^^^^^^^^^^^^^^   This capability permits a range of system administration operations
[root@a179291fad10 /]# yum install -y nfs-utils &> /dev/null
[root@a179291fad10 /]# showmount -e $SERVER           
Export list for $SERVER:
/export *
[root@a179291fad10 /]# mount $SERVER:/export /mnt -vvv
mount.nfs: timeout set for Thu Oct 24 02:59:30 2019
mount.nfs: trying text-based options 'vers=4.2,addr=10.73.4.209,clientaddr=10.88.0.5'
mount.nfs: mount(2): No such device      <<<<<<<<<<<<<<<<<  fail 
mount.nfs: No such device                                                                                                   
(If this is the expected behavior, could a clearer error message be returned so that customers can easily understand the cause of the failure?)


###############
Test in docker:
###############
[root@SERVER ~]# docker run  --cap-add SYS_ADMIN -it centos:8
                                       ^^^^^^^^^
[root@0d85c14ee55c /]# showmount -e $SERVER
Export list for $SERVER:
/export *
[root@0d85c14ee55c /]# mount $SERVER:/export /mnt -vvv                <<<<<<<<<<< pass
mount.nfs: timeout set for Thu Oct 24 03:18:43 2019
mount.nfs: trying text-based options 'vers=4.2,addr=10.73.4.169,clientaddr=172.17.0.2'
[root@0d85c14ee55c /]# df -h | grep /mnt
Filesystem                                     1K-blocks    Used Available Use% Mounted on
$SERVER:/export  52399104 4132864  48266240   8% /mnt


Additional info:

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
>   CAP_SYS_ADMIN
              Note: this capability is overloaded; see Notes to kernel
              developers, below.

              * Perform a range of system administration operations
                including: quotactl(2), mount(2), umount(2), pivot_root(2),
                setdomainname(2);
...
Don't choose CAP_SYS_ADMIN if you can possibly avoid it!  A vast
proportion of existing capability checks are associated with this
capability (see the partial list above).  It can plausibly be
called "the new root", since on the one hand, it confers a wide
range of powers, and on the other hand, its broad scope means that
this is the capability that is required by many privileged
programs.

Comment 1 Matthew Heon 2019-10-25 13:41:13 UTC
If you run the container as privileged (`--privileged` instead of `--cap-add CAP_SYS_ADMIN`), does it work? Alternatively, try disabling SELinux (`sudo setenforce 0`) before running the container.

Comment 2 Daniel Walsh 2019-10-25 14:47:31 UTC
Most likely SELinux is blocking this access.  Adding SYS_ADMIN should fixup Capability checks and SECCOMP.

Comment 3 Zhi Li 2019-10-27 06:58:06 UTC
(In reply to Matthew Heon from comment #1)
> If you run the container as privileged (`--privileged` instead of `--cap-add
> CAP_SYS_ADMIN`), does it work? Alternatively, try disabling SELinux (`sudo
> setenforce 0`) before running the container.

Yes, it works. But is this a bug in the podman runtime?

Comment 4 Daniel Walsh 2019-10-27 08:39:05 UTC
Well it would be good to know what device nfs mount is complaining about.

Any chance you could strace the mount command to see if it gives you more information?

Comment 5 Zhi Li 2019-10-28 06:46:12 UTC
Created attachment 1629708 [details]
Strace output of mount procedure

Comment 6 Daniel Walsh 2019-10-28 13:24:55 UTC
Any chance this is just an issue of the network not being set up correctly in Podman?  Can you ping the NFS Server?

Comment 7 Matthew Heon 2019-10-28 13:27:44 UTC
Which of the two things I suggested fixed this - `--privileged` or `setenforce 0`?

Comment 8 Zhi Li 2019-10-29 02:36:31 UTC
(In reply to Daniel Walsh from comment #6)
> Any chance this is just an issue of the network not being setup correctly in
> Podman.  Can you ping the NFS Server?

Yes, I can

Comment 9 Zhi Li 2019-10-29 02:45:44 UTC
(In reply to Matthew Heon from comment #7)
> Which of the two things I suggested fixed this - `--privileged` or
> `setenforce 0`?

That is up to you; we are only focusing on the NFS-related operations.

Comment 10 Daniel Walsh 2019-10-29 12:58:28 UTC
Zhi, we are trying to figure out if this is an SELinux issue or some other issue.

With setenforce 0 we can see whether SELinux is blocking the mount. If SELinux is disabled,
or setenforce 0 does not fix the issue, then we have to look elsewhere.
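
As a rough sketch of that triage (not from the original thread; `selinux_triage` is a hypothetical helper name), the function below runs a command under the current SELinux mode, retries it once in permissive mode if it fails, and then restores enforcing mode. It assumes it is run as root on the host and that the command passed in is whatever reproduces the failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper, run as root on the host. Arguments: the command to test,
# for example a podman invocation that performs the mount.
selinux_triage() {
    local mode
    mode=$(getenforce) || return 2
    if [ "$mode" != "Enforcing" ]; then
        # Permissive or Disabled: SELinux cannot be what denies the operation.
        echo "SELinux is $mode, so it is not blocking the command: look elsewhere"
        return 0
    fi
    if "$@"; then
        echo "command succeeds with SELinux enforcing"
        return 0
    fi
    setenforce 0    # switch to permissive mode for one retry
    if "$@"; then
        echo "command succeeds only in permissive mode: SELinux is blocking it"
    else
        echo "command fails in permissive mode too: look elsewhere"
    fi
    setenforce 1    # restore enforcing mode
}
```

It could be called as `selinux_triage podman run --rm --cap-add CAP_SYS_ADMIN ...` with the rest of the reproducer filled in.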

Comment 13 Daniel Walsh 2019-10-30 17:52:25 UTC
@zhi Please gather the AVCs

ausearch -m avc -ts recent

This might not be something that we want to fix in container selinux, but I need the AVCs to be sure.

 podman run --security-opt label=disabled --cap-add CAP_SYS_ADMIN -it centos:8

Should turn off SELinux separation for the container.
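
To narrow the ausearch output down to the container, a small filter along these lines may help. This is a sketch, not part of the original comment: `container_avcs` is a hypothetical name, and it assumes the container process runs in the `container_t` domain, which is the default type shipped by container-selinux.

```shell
#!/usr/bin/env bash
# Keep only AVC denials whose source context (scontext) is the container domain.
# Reads audit records on stdin, e.g. the output of `ausearch -m avc -ts recent`.
# Assumption: the container runs as container_t.
container_avcs() {
    grep -E 'avc: +denied.*scontext=[^ ]*:container_t:'
}
```

Usage: `ausearch -m avc -ts recent | container_avcs`. The function exits non-zero when no matching denial is found.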

Comment 14 Daniel Walsh 2019-10-30 17:53:21 UTC
BTW if this works on Docker, I would say most likely the Docker daemon you are running is not running with the --selinux-enabled flag.

Comment 16 Zhi Li 2019-10-31 09:46:44 UTC
Created attachment 1630972 [details]
avc.log

Comment 17 Daniel Walsh 2019-10-31 17:23:48 UTC
There is a boolean that allows this:


setsebool -P domain_kernel_load_modules 1

But I would prefer that you did not do this.

A better solution would be to have the system just load the kernel module.

Can you get the system to load the kernel module on boot?

Something like:

echo nfs > /etc/modules-load.d/nfs.conf
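
For context, mount(2) returns ENODEV ("No such device") when the requested filesystem type is not configured in the running kernel, which fits the idea that the nfs module was not loaded on the host. A minimal sketch for checking this on the host follows; `fs_registered` is a hypothetical helper, and the optional second argument exists only so the check can be pointed at a file other than /proc/filesystems.

```shell
#!/usr/bin/env bash
# Succeeds if a filesystem type is registered with the running kernel.
# /proc/filesystems lists one type per line, optionally preceded by "nodev".
fs_registered() {
    local fstype="$1" list="${2:-/proc/filesystems}"
    awk -v t="$fstype" '$NF == t { found = 1 } END { exit !found }' "$list"
}

# Example use on the host (as root), left commented out:
#   fs_registered nfs || modprobe nfs
```

To make the module load on every boot, note that modules-load.d(5) only reads files whose names end in `.conf`, for example /etc/modules-load.d/nfs.conf.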

Comment 19 Daniel Walsh 2019-11-20 15:39:10 UTC
Yes this should have been closed notabug.