Hide Forgot
Created attachment 1150553 [details] Reproducer: short script, just 2 docker commands, annotated Versions: docker-latest-1.10.3-10, -11, -12 Start a container running "sleep X"; interrupt it (SIGTERM) before sleep finishes: sleep: cannot read realtime clock: Operation not permitted Exit status is 124. 100% reproducible; see attachment.
Any idea what is causing this? Is the realtime clock available in the container? I guess we need an strace of this command failing.
[root@rhel01 ~]# rpm -qa docker docker-1.10.3-31.el7.x86_64 [root@rhel01 ~]# cid=$(docker run -d --sig-proxy=true $img sleep 100) [root@rhel01 ~]# echo $cid 222cc0649424ca68e95cdd4857c2ef0e2b486784d393ae67982cecb5b4ddf87a [root@rhel01 ~]# timeout --kill-after=5 20 docker start -a -i $cid Killed [root@rhel01 ~]# cat /etc/os-release NAME="Red Hat Enterprise Linux Server" VERSION="7.2 (Maipo)" ID="rhel" ID_LIKE="fedora" VERSION_ID="7.2" PRETTY_NAME="Red Hat Enterprise Linux" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:7.2:GA:server" HOME_URL="https://www.redhat.com/" BUG_REPORT_URL="https://bugzilla.redhat.com/" REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7" REDHAT_BUGZILLA_PRODUCT_VERSION=7.2 REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="7.2" I cannot reproduce - Ed, could you try out the latest build?
Same thing with docker-latest-1.10.3-29.el7 : + timeout --kill-after=5 20 /usr/bin/docker-latest start -a -i c977b25b6eae6355f83f526841b138be7006d5ebefad01759e1ce30be82b7513 sleep: cannot read realtime clock: Operation not permitted + echo exit status: 124 exit status: 124 Sorry, I can't figure out how to switch back from docker-latest to docker (probably because this system used to have docker-1.9, and docker is now 1.10, and I need to migrate, but I'm on PTO and can't really take that time now). /etc/os-release looks identical to yours. This is a virt running under F22 libvirt, if that makes any difference.
Could you check to see if you got any AVCs? ausearch -m avc | grep svirt
>ausearch -m avc | grep svirt Lots and lots. However: # ausearch -m avc | grep svirt >/tmp/svirt-1 # ./tests/docker/sleepkill.sh ..... # ausearch -m avc | grep svirt >/tmp/svirt-2 # diff -uw /tmp/svirt-[12] -> nothing FWIW, tail -f /var/log/audit/audit.log emits this at the moment the sleep times out: type=VIRT_CONTROL msg=audit(1465214625.747:182868): pid=5853 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:docker_t:s0 msg='user=root auid=0 exe=sleep hostname=18d110cfc268 reason=api op=kill vm=registry.access.redhat.com/rhel7/rhel:latest vm-pid=1590 exe="/usr/bin/docker-latest" hostname=? addr=? terminal=? res=success' type=ANOM_PROMISCUOUS msg=audit(1465214626.002:182869): dev=veth89d0913 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295 type=SYSCALL msg=audit(1465214626.002:182869): arch=c000003e syscall=44 success=yes exit=32 a0=13 a1=c208a5ca80 a2=20 a3=0 items=0 ppid=5851 pid=5860 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="docker-latest" exe="/usr/bin/docker-latest" subj=system_u:system_r:docker_t:s0 key=(null) type=VIRT_CONTROL msg=audit(1465214626.193:182870): pid=5853 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:docker_t:s0 msg='vm-pid=0 user=root auid=0 exe=sleep hostname=18d110cfc268 reason=api op=remove vm=registry.access.redhat.com/rhel7/rhel:latest exe="/usr/bin/docker-latest" hostname=? addr=? terminal=? res=success' Out now for real; will be back on June 20. Hope this helps.
Those a normal audit messages. nothing to do with SELinux.
Followup: reproduced on RHEL 7.2 virt (OSP7) with docker-1.10.3-44.el7.x86_64 and with setenforce 0.
Could this be seccomp causing the issue? If you disable seccomp does the command succeed? --security-opt seccomp=unconfined
Good call! With --security-opt seccomp:unconfined (colon, not equals): + timeout --kill-after=5 20 /usr/bin/docker start -a -i aa9b9b1d2d171206b66f399c9187ce53aa3a34d277d7887a4986cb010baaac75 /root/sleepkill.sh: line 17: 17447 Killed timeout --kill-after=5 20 $docker start -a -i $cid + echo exit status: 137 exit status: 137 No sign of the 'cannot read realtime clock' message.
Now you just need to figure out which syscall was being blocked. :^)
To find any seccomp denials reported via audit: # ausearch -m SECCOMP To convert syscall numbers to names (and the other way around): # ausearch [-a <arch>] <num>|<name>
To find any seccomp denials reported via audit: # ausearch -m SECCOMP To convert syscall numbers to names (and the other way around): # scmp_sys_resolver [-a <arch>] <num>|<name>
Sorry, you can disregard comment #12 (type), but rest assured comment #13 is bugzilla gold :)
This is very confusing. The 'nanosleep' syscall is the only thing sleep does that accesses the realtime clock, and it is certainly not blocked by the default Docker profile. Furthermore, strace finds no difference in which syscalls are made when sleep finishes normally and is killed by SIGTERM, so I can't tell why you'd be seeing denials only when SIGTERM is involved. Will keep digging into this to try and see what's going on.
(In reply to Paul Moore from comment #13) > To find any seccomp denials reported via audit: > > # ausearch -m SECCOMP I'm not sure what this is in reference to. Running this on the host produces <no matches>, as expected. I can find no way to run that inside the container itself. But I did discover that this change: - docker run ... sleep 100 + docker run ... bash -c '(/usr/bin/sleep 100)' ...results in: /root/sleepkill.sh: line 19: 21633 Killed timeout --kill-after=5 20 $docker start -a -i $cid ...with no 'cannot read realtime clock' warning. The container isn't really killed, though: the 'sleep 100' terminates, as shown by 'docker ps' for ~1m after the above message. I have no idea what to make of this.
The problem is the Audit messages are only generated for one type of ACTION, and docker does not default to the correct one, to generate the Message, I believe. Although I always forget which is which. Perhaps we should change the default. In another chain, I am discussing with people about potential debug tools that could help diagnose these types of issues.
Quick followup: this seems to be fixed in docker-1.12.0-6.el7.x86_64. On docker-1.10 the problem is 100% reproducible; on 1.12 I have never seen it in dozens of runs of the test suite.
Ed could you verify this is solved with the docker-latest available in brew?
Lokesh, can you give me a brew build ID or at least NVR? And are we talking docker-latest-1.10 or 1.12? If the latter, I can't even get docker to start (bug 1365207)
Problem still present in docker-1.10.3-46.el7.10.x86_64 on RHEL-7.2
Moving back to docker since we might need this fixed in 1.10.3, my bad about the back and forth.
Adding magic strings to make CI skip the failing tests b/c this bug.
Marking this as fixed in docker-latest-1.12.
Confirm fixed in docker-latest-1.12.0-16.el7
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.