Bug 1579814
Summary: | Incorrect exit code of command executed inside container via python API | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Daniele <dconsoli> |
Component: | python-docker-py | Assignee: | Tomas Tomecek <ttomecek> |
Status: | CLOSED ERRATA | QA Contact: | atomic-bugs <atomic-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.5 | CC: | amurdaca, dornelas, lsu |
Target Milestone: | rc | Keywords: | Extras |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-06-26 18:36:32 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1186913, 1561841 |
Description
Daniele
2018-05-18 11:53:47 UTC
Now I understand the confusion. The problem Motorola is seeing is actually a race condition: the moment they invoke `c.exec_start(id); pprint.pprint(c.exec_inspect(id)`, the command haven't finished yet and hence dockerd reports `ExitCode: None`. I played with this a bit and came up with this solution: import docker c = docker.Client() c.create_container(image="rhel7", name="python_test", command="sleep infinity", tty=True, stdin_open=True) c.start("python_test") id = c.exec_create("python_test", "ls -lha /", tty=True) stream = c.exec_start(id, detach=False, stream=True) for i in stream: print(i) # will print output of the command to stdout ret = c.exec_inspect(id) print("Exit code: %s" % ret["ExitCode"]) And running it: [root@vm ~]# python t.py total 0 drwxr-xr-x. 1 root root 6 May 18 14:40 . drwxr-xr-x. 1 root root 6 May 18 14:40 .. -rwxr-xr-x. 1 root root 0 May 18 14:40 .dockerenv lrwxrwxrwx. 1 root root 7 Apr 4 17:06 bin -> usr/bin dr-xr-xr-x. 2 root root 6 Dec 14 17:23 boot drwxr-xr-x. 5 root root 360 May 18 14:40 dev drwxr-xr-x. 1 root root 66 May 18 14:40 etc drwxr-xr-x. 2 root root 6 Apr 4 17:08 home lrwxrwxrwx. 1 root root 7 Apr 4 17:06 lib -> usr/lib lrwxrwxrwx. 1 root root 9 Apr 4 17:06 lib64 -> usr/lib64 drwxr-xr-x. 2 root root 6 Dec 14 17:23 media drwxr-xr-x. 2 root root 6 Dec 14 17:23 mnt drwxr-xr-x. 2 root root 6 Dec 14 17:23 opt dr-xr-xr-x. 108 root root 0 May 18 14:40 proc dr-xr-x---. 1 root root 23 Apr 4 17:12 root drwxr-xr-x. 1 root root 21 Apr 4 17:12 run lrwxrwxrwx. 1 root root 8 Apr 4 17:06 sbin -> usr/sbin drwxr-xr-x. 2 root root 6 Dec 14 17:23 srv dr-xr-xr-x. 13 root root 0 May 18 14:14 sys drwxrwxrwt. 7 root root 132 Apr 4 17:08 tmp drwxr-xr-x. 13 root root 155 Apr 4 17:06 usr drwxr-xr-x. 18 root root 238 Apr 4 17:06 var Exit code: 0 When I used the touch command, I was still hitting the race condition, probably because 'touch' doesn't output anything to stdout (is this a bug in dockerd)? I also recommend reading docker's source code on how it handles exec: https://github.com/projectatomic/docker/blob/docker-1.13.1/cli/command/container/exec.go https://github.com/projectatomic/docker/blob/docker-1.13.1/client/container_exec.go Please let me know whether the proposed solution works for Motorola. If this is a regression, it is possible that this has changed in docker daemon. Antonio, would you know more about this? From my PoV, it's not a problem to add something like this into docker-py's method exec_start: if not detach: while True: ret = c.exec_inspect(id) if ret["ExitCode"] is not None: break sleep(0.01) To me, it really sounds like there is a problem in detach mechanic. I finally figured this out! This is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1501782 I was able to reproduce. As soon as I used the update from the bug above, the problem was gone. Attaching test case for QE: import docker import pprint import time d = docker.Client() runs_count = 10 for tty_value in (True, False): for i in range(runs_count): co = d.create_container(image="rhel7", command="sleep infinity", tty=tty_value) d.start(co) ex = d.exec_create(co, "touch /test", tty=tty_value) d.exec_start(ex, detach=False) # sl = float(i) / float(1000) sl = 0.001 * (2 ** i) time.sleep(sl) ins = d.exec_inspect(ex) ec = ins["ExitCode"] pprint.pprint("[%d] Exit code: %s, tty: %s, sleep: %s" % (i, ec, tty_value, sl)) if ec is None: print "FAILURE" Tried with: python-docker-py-1.10.6-4.el7.noarch docker-1.13.1-68.gitdded712.el7.x86_64 From the #Comment 0 # docker exec -ti python_test ls / bin dev home lib64 media opt root sbin sys tmp var boot etc lib lost+found mnt proc run srv test usr # echo $? 0 # docker exec -ti cli_test ls / bin dev home lib64 media opt root sbin sys tmp var boot etc lib lost+found mnt proc run srv test usr # echo $? 0 From the #Comment 6 # python 1.py '[0] Exit code: 0, tty: True, sleep: 0.001' '[1] Exit code: 0, tty: True, sleep: 0.002' '[2] Exit code: 0, tty: True, sleep: 0.004' '[3] Exit code: 0, tty: True, sleep: 0.008' '[4] Exit code: 0, tty: True, sleep: 0.016' '[5] Exit code: 0, tty: True, sleep: 0.032' '[6] Exit code: 0, tty: True, sleep: 0.064' '[7] Exit code: 0, tty: True, sleep: 0.128' '[8] Exit code: 0, tty: True, sleep: 0.256' '[9] Exit code: 0, tty: True, sleep: 0.512' '[0] Exit code: 0, tty: False, sleep: 0.001' '[1] Exit code: 0, tty: False, sleep: 0.002' '[2] Exit code: 0, tty: False, sleep: 0.004' '[3] Exit code: 0, tty: False, sleep: 0.008' '[4] Exit code: 0, tty: False, sleep: 0.016' '[5] Exit code: 0, tty: False, sleep: 0.032' '[6] Exit code: 0, tty: False, sleep: 0.064' '[7] Exit code: 0, tty: False, sleep: 0.128' '[8] Exit code: 0, tty: False, sleep: 0.256' '[9] Exit code: 0, tty: False, sleep: 0.512' Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2025 |