Bug 1380813 - Linux capabilities on container have no effect
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: docker
Version: 24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/docker/docker/pull...
Reported: 2016-09-30 16:10 UTC by Jean-Christophe Berthon
Modified: 2016-10-04 12:06 UTC

Fixed In Version: 1.12.0
Last Closed: 2016-10-04 12:06:02 UTC
Type: Bug

Description Jean-Christophe Berthon 2016-09-30 16:10:53 UTC
Description of problem:

I have created a Docker container to run an ntpd server. The container is based on Ubuntu 16.04 and installs the ntp package from www.ntp.org.
The container runs successfully on Ubuntu 16.04 (with AppArmor) and on CentOS 7 (with SELinux enabled), but fails on Fedora 24 (irrespective of the status of SELinux).
To run an NTP daemon in an unprivileged container, one has to add the SYS_TIME Linux capability. This is easily done with Docker's "--cap-add SYS_TIME" option. It works well on Ubuntu and CentOS (so adjtime works) but fails with permission denied on Fedora 24 (tested with SELinux both enforcing and disabled).

Strangely the ntpd process running inside the container has the correct Linux capabilities set. Seen from the host, I can see this:
Capabilities: "= cap_setgid,cap_setuid,cap_net_bind_service,cap_sys_chroot,cap_sys_time+ep"
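For readers who want to repeat this check from the host, here is a minimal Python sketch that decodes the hexadecimal CapEff mask found in /proc/<pid>/status. The function names are hypothetical, the table covers only the five capabilities listed above (bit numbers as defined in linux/capability.h), and the example mask is computed from that list rather than taken from this report.

```python
# Bit positions from linux/capability.h for the capabilities named in the report.
CAP_BITS = {
    "cap_setgid": 6,
    "cap_setuid": 7,
    "cap_net_bind_service": 10,
    "cap_sys_chroot": 18,
    "cap_sys_time": 25,
}


def decode_caps(hex_mask):
    """Return the known capability names set in a hex mask, in bit order."""
    mask = int(hex_mask, 16)
    ordered = sorted(CAP_BITS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if (mask >> bit) & 1]


def read_cap_eff(pid):
    """Read the effective capability mask of a process (Linux only)."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise ValueError("no CapEff line found for pid %d" % pid)


# Illustrative mask: exactly the five bits above, not a value from the report.
EXAMPLE_MASK = "00000000020404c0"
```

On a Linux host, `decode_caps(read_cap_eff(pid))` with the PID of ntpd would show whether cap_sys_time is in the effective set.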

It seems as if the capability was simply ignored by the Fedora kernel. I'm using kernel 4.7.2 and 4.7.4.

I've tried with the official Docker from Fedora repos (1.10.3) and I've even installed the latest Docker from Docker website (1.12.1), but it fails with the same error.

Version-Release number of selected component (if applicable):

Kernel 4.7.2 and 4.7.4
Docker 1.10.3 and 1.12.1

How reproducible:

Every time. I've reproduced it on two different Fedora 24 instances (one physical computer and one VM).

Steps to Reproduce:
1. You can use my Dockerfile: https://github.com/jcberthon/containers/tree/master/ntp
2. Simply put the Dockerfile in an empty directory, 'cd' to that directory
3. Build the image: `docker build -t jcberthon/ntpd .`
4. Run the image a first time and stop it after a few seconds: `docker run --cap-add SYS_TIME --cap-add SYS_RESOURCE jcberthon/ntpd -g -n`. This triggers an SELinux denial because ntpd cannot do 'module_request' or something similar (note: on CentOS 7 with SELinux, I do not have this issue). Check setroubleshoot, which reports the problem. I created an SELinux policy for it and applied it.
5. Rerun the image with the same syntax. This time SELinux is quiet, but ntpd still shows errors such as "ntp_adjtime: Operation not permitted".
6. (optional) You can even run `sudo date -s "5 seconds ago"` to set your PC clock 5 seconds back. You will see that ntpd detects the offset but is unable to correct it, whereas it works on CentOS 7 or Ubuntu 16.04.

Note: doing the above with SELinux disabled has the same outcome.

Actual results:
A permission-denied error is printed ("ntp_adjtime: Operation not permitted").
The clock is not synchronised if step 6 is performed.

Expected results:
No permission-denied error is printed, and when step 6 is performed, the clock is synchronised again shortly afterwards.

Additional info:

Comment 1 Daniel Walsh 2016-09-30 16:47:25 UTC
This must be a missing seccomp field.

Try running it with --security-opt seccomp:unconfined

Comment 2 Jean-Christophe Berthon 2016-10-03 10:00:12 UTC
Hi Daniel,

Thank you for the quick reply.

I've added the option '--security-opt seccomp:unconfined' as you suggested. Running the container no longer yields the permission-denied message, and after a few minutes my clock is synchronised correctly.

I have no knowledge of seccomp; I understand that the option basically deactivates it. However, if I want to keep seccomp enabled, how should I proceed? Is there a problem with the seccomp profile on Fedora (as opposed to the one on CentOS/Ubuntu)?

How can I further investigate?
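One possible way to investigate, sketched in Python under stated assumptions: a read-only adjtimex() call (modes set to 0) does not require SYS_TIME, so if even that call fails with EPERM inside the container, the syscall is probably being filtered before the kernel's capability check, and seccomp is a likely suspect. The function names are hypothetical. The probe assumes a Linux libc that exports adjtimex and uses a zero-filled 256-byte buffer, assumed to be at least as large as struct timex on common architectures.

```python
import ctypes
import errno


def classify_adjtimex(ret, err):
    """Interpret the result of a read-only adjtimex() call (modes == 0)."""
    if ret >= 0:
        return "syscall reachable"
    if err == errno.EPERM:
        return "blocked before the capability check (seccomp is a likely suspect)"
    return "failed: " + errno.errorcode.get(err, str(err))


def probe_adjtimex():
    """Issue a read-only adjtimex() through libc and classify the outcome."""
    libc = ctypes.CDLL(None, use_errno=True)
    # Zero-filled buffer standing in for struct timex; modes == 0 means
    # "read the current values only", which changes nothing on the system.
    buf = ctypes.create_string_buffer(256)
    ret = libc.adjtimex(buf)
    return classify_adjtimex(ret, ctypes.get_errno())
```

Running `print(probe_adjtimex())` inside the container, with and without '--security-opt seccomp:unconfined', would show whether the result changes with the seccomp setting.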

Comment 3 Jean-Christophe Berthon 2016-10-03 13:32:13 UTC
Hi again,

Going through the documentation once more and checking the default Docker seccomp profile (https://github.com/docker/docker/blob/master/profiles/seccomp/default.json), I can see that adjtimex (and other time-related system calls) are blocked by the seccomp profile, or at least not whitelisted, but it seems that they should be allowed when SYS_TIME is granted.

So, as far as I understand, adding the SYS_TIME capability with the --cap-add option should have overridden the Docker engine seccomp profile's blocking of the time-related system calls.

I can't seem to find this default seccomp profile on my computer to verify that it matches the one in git (linked above). I checked the files contained in the Docker RPM but could not find it; perhaps it is compiled statically into the Docker executable?

Today I was using the docker package from the Fedora repository (Docker 1.10). I'm going to try again with Docker 1.12 from the Docker repos, but as I reported earlier, I had no more luck with it last time.

Comment 4 Jean-Christophe Berthon 2016-10-03 13:55:14 UTC
OK, this time I did a very thorough uninstallation of Docker 1.10.3: I removed the LVM volumes, deleted the /var/lib/docker folder, and even rebooted, just to be sure. After that I installed Docker 1.12.1 from the Docker repos, and when I ran my container (after rebuilding it), it worked without the '--security-opt seccomp:unconfined' option.

So this is either a Docker 1.10.3 bug, or the Fedora installation of Docker has an incorrect seccomp policy regarding the SYS_TIME system calls.

Note: As mentioned in my bug report, both CentOS 7 and Ubuntu 16.04 were running 1.12.1 from the Docker repository and not the official packages from the respective distributions.

Comment 5 Daniel Walsh 2016-10-03 14:26:45 UTC
I believe docker-1.12 added support for adjusting seccomp via cap-add. This does not exist in docker-1.10.

Check whether the distribution's docker-latest-1.12 package works properly. If it does not, then this is a bug in the distribution's docker; if it works, I will close this as not a bug.

When kubernetes/openshift supports docker-1.12 we will upgrade the default package to include it.

Comment 6 Jean-Christophe Berthon 2016-10-04 08:32:31 UTC
Hi Daniel,

You are correct, this is a "bug" in Docker prior to 1.12. The 1.12.0 changelog says "Align default seccomp profile with selected capabilities #22554". See pull request #22554: https://github.com/docker/docker/pull/22554

As I stated in comment #4, after a careful uninstallation of Fedora's Docker 1.10.3 and installation of Docker 1.12.1 from Docker, the container has the correct permissions when running and works as expected.
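The behaviour observed in this thread can be summarised with a small Python model. It is a deliberately simplified sketch, not Docker's actual profile format or code. The mode names ("static", "cap_aligned", "unconfined"), the tiny base whitelist, and the syscalls tied to CAP_SYS_TIME are assumptions chosen for illustration.

```python
# Hypothetical, heavily reduced whitelist; the real default profile is much larger.
BASE_WHITELIST = {"read", "write", "clock_gettime"}

# Assumed mapping from an added capability to the syscalls it unlocks.
CAP_GATED_SYSCALLS = {
    "CAP_SYS_TIME": {"adjtimex", "settimeofday", "stime"},
}


def allowed_syscalls(added_caps, mode):
    """Return the allowed syscall set, or None when seccomp is unconfined."""
    if mode == "unconfined":
        return None
    allowed = set(BASE_WHITELIST)
    if mode == "cap_aligned":  # profile follows --cap-add, as reported for 1.12
        for cap in added_caps:
            allowed |= CAP_GATED_SYSCALLS.get(cap, set())
    return allowed  # "static": the profile ignores added capabilities


def adjtimex_outcome(added_caps, mode):
    """Model the two checks a clock adjustment has to pass, in order."""
    allowed = allowed_syscalls(added_caps, mode)
    if allowed is not None and "adjtimex" not in allowed:
        return "denied by seccomp"
    if "CAP_SYS_TIME" not in added_caps:
        return "denied by capability check"
    return "permitted"
```

In this model, `adjtimex_outcome({"CAP_SYS_TIME"}, "static")` is denied by seccomp even though the capability is present, which corresponds to the symptom reported with Docker 1.10.3. The "cap_aligned" and "unconfined" modes both permit the call, matching the results with 1.12.1 and with '--security-opt seccomp:unconfined'.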

You can close this bug. I've tried to add the reference to the docker issue, hopefully I used the correct field.

Thank you for your support.
