Bug 1374265 - [extras-rhel-7.3.0] docker-py 1.7.2 hangs when trying to access ctr stats against docker 1.10
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.2
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Antonio Murdaca
QA Contact: atomic-bugs@redhat.com
Reported: 2016-09-08 07:25 EDT by Thomas Wiest
Modified: 2016-11-04 05:09 EDT

Last Closed: 2016-11-04 05:09:23 EDT
Type: Bug


External Trackers:
Red Hat Product Errata RHSA-2016:2634 (normal, SHIPPED_LIVE): Moderate: docker security and bug fix update (last updated 2016-11-03 16:51:48 EDT)

Description Thomas Wiest 2016-09-08 07:25:12 EDT
Description of problem:
When trying to access a container's stats using docker-py 1.7.2 against a docker 1.10 daemon, the call hangs for 2 minutes and then exits with the following error:

requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=120)


I'm able to do other operations, like listing the containers; only the stats call seems to hang and then error out.

The same call works fine against docker 1.9.




Version-Release number of selected component (if applicable):
python-docker-py-1.7.2-1.el7.noarch
docker-1.10.3-46.el7.10.x86_64




How reproducible:
Very reproducible using this reproducer:


#!/usr/bin/env python

from docker import AutoVersionClient

cli = AutoVersionClient(base_url='unix://var/run/docker.sock', timeout=120)

first_ctr = cli.containers()[0]
name = first_ctr['Names'][0]
print name

stats = cli.stats(first_ctr['Id'], stream=False)
print stats





Steps to Reproduce:
1. Make sure you're running against docker-1.10 with docker-py 1.7.2 (see the optional version check after this list).
2. Run the reproducer script above.
3. Notice that it hangs after printing the name of the container, then errors out after 2 minutes.
4. Run the same reproducer against docker-1.9 and notice that it works there.
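As a quick sanity check for step 1, this short script (a sketch, assuming docker-py's AutoVersionClient and version() behave as documented) prints the API version the client actually negotiates; docker 1.10 speaks API 1.22, while docker 1.9 speaks API 1.21.

#!/usr/bin/env python

from docker import AutoVersionClient

# Connect over the local Docker socket and let docker-py negotiate the API version.
cli = AutoVersionClient(base_url='unix://var/run/docker.sock')

# Expect '1.22' against docker 1.10 and '1.21' against docker 1.9.
print cli.version()['ApiVersion']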





Actual results:
$ time ./reproducer.py 
/k8s_fluentd-elasticsearch.c2b7b021_logging-fluentd-0fyea_logging_6d005e90-743d-11e6-a23f-0a3cb09b029f_132818d6
Traceback (most recent call last):
  File "./reproducer.py", line 11, in <module>
    stats = cli.stats(first_ctr['Id'], stream=False)
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 35, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 21, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/api/container.py", line 373, in stats
    return self._result(self._get(url, params={'stream': False}),
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 47, in inner
    return f(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/client.py", line 112, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 476, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 464, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 433, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=120)

real	2m0.233s
user	0m0.085s
sys	0m0.027s






Expected results:
This is what happens when I run the same reproducer against docker-1.9.1-40.el7.x86_64:

$ time ./reproducer.py 
/k8s_router.a7369d28_router-10-b2nrs_default_5f03cf62-7544-11e6-a051-124ba03dd4fb_6b159c64
{u'read': u'2016-09-08T11:15:32.691956665Z', u'memory_stats': {u'usage': 87830528, u'limit': 2000000000, u'failcnt': 0, u'stats': {u'unevictable': 0, u'total_inactive_file': 33382400 [... snip ...]

real	0m2.132s
user	0m0.139s
sys	0m0.116s
Comment 1 Thomas Wiest 2016-09-08 07:37:16 EDT
I tried upgrading to the latest docker-py using pip, and I got the same results.

So it doesn't look like this is fixed in newer versions of docker-py.

$ pip list | grep docker-py
You are using pip version 7.1.0, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
docker-py (1.9.0)

$ time ./reproducer.py 
/k8s_fluentd-elasticsearch.c2b7b021_logging-fluentd-0fyea_logging_6d005e90-743d-11e6-a23f-0a3cb09b029f_132818d6
Traceback (most recent call last):
  File "./reproducer.py", line 11, in <module>
    stats = cli.stats(first_ctr['Id'], stream=False)
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 35, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 21, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/api/container.py", line 378, in stats
    return self._result(self._get(url, params={'stream': False}),
  File "/usr/lib/python2.7/site-packages/docker/utils/decorators.py", line 47, in inner
    return f(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/docker/client.py", line 140, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 488, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 596, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 499, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=120)

real	2m0.171s
user	0m0.083s
sys	0m0.023s
Comment 3 Tomas Tomecek 2016-09-09 06:00:22 EDT
I'm still not able to reproduce this because my local libvirt isn't working, so I can't even access RHEL 7 (https://bugzilla.redhat.com/show_bug.cgi?id=1374325, https://bugzilla.redhat.com/show_bug.cgi?id=1374486), and our internal OpenStack won't even start a VM.

So in the meantime I checked the docker engine sources, and this is what I came up with:

Code that forms the HTTP request:

https://github.com/projectatomic/docker/blob/rhel7-1.10.3/vendor/src/github.com/docker/engine-api/client/container_stats.go#L10

Code that processes the response:

https://github.com/projectatomic/docker/blob/rhel7-1.10.3/api/client/stats.go#L40


The interesting thing is that no matter whether you specify `stream` or not, the server always responds with `Transfer-Encoding: chunked`, which is odd. If the client is interested only in a single stats sample, the server should just send the JSON in a single response and close, not return a stream. You can easily observe this:

$ ncat -U /var/run/docker.sock
GET /v1.22/containers/580dc1eaae69/stats?stream=0 HTTP/1.1  # press <enter> twice and wait for response

HTTP/1.1 200 OK
Content-Type: application/json
Server: Docker/1.10.3 (linux)
Date: Fri, 09 Sep 2016 09:57:20 GMT
Transfer-Encoding: chunked

1127
{"read":"2016-09-09T09:57:20.896712812Z","precpu_stats":{"cpu_usage":{"total_usage":981229
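
If ncat isn't available, roughly the same observation can be made from Python by speaking HTTP over the Unix socket directly (a sketch; the container ID below is a placeholder, substitute one from `docker ps`):

#!/usr/bin/env python

import socket

# Placeholder container ID; substitute a real one.
CONTAINER_ID = '580dc1eaae69'

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect('/var/run/docker.sock')

# Same request as the ncat session above: a single, non-streaming stats query.
sock.sendall('GET /v1.22/containers/%s/stats?stream=0 HTTP/1.1\r\n'
             'Host: localhost\r\n\r\n' % CONTAINER_ID)

# Dump whatever arrives first; the headers already show Transfer-Encoding: chunked.
print sock.recv(65536)
sock.close()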


The docker client handles this weird behavior by doing only a single iteration while processing the response and then breaking out of the loop [1]; that is also how it prints only a single stats sample with --no-stream. The funny thing is that the client sets a "timeout" of two seconds [2], and if it doesn't get a sample within that timeframe it just prints all zeros. That's what you're seeing: instead of saying "hey, I received no stats", the docker client just prints those zeros. To me, docker-py does this correctly: it keeps waiting and eventually times out.

Hence it's pretty clear that the issue is in the docker engine: it doesn't respond with the stats in time, which is why the client just keeps waiting and gets nothing.

Therefore I'm reassigning this to the docker component.


[1] https://github.com/projectatomic/docker/blob/rhel7-1.10.3/api/client/stats.go#L86
[2] https://github.com/projectatomic/docker/blob/rhel7-1.10.3/api/client/stats.go#L93
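
As a side note, a minimal client-side sketch of a possible workaround (an assumption, not the actual fix, which belongs in the engine): request the streaming stats feed from docker-py and keep only the first sample, since the daemon streams chunks either way.

#!/usr/bin/env python

import json

from docker import AutoVersionClient

cli = AutoVersionClient(base_url='unix://var/run/docker.sock', timeout=120)
first_ctr = cli.containers()[0]

# stream=True yields one JSON-encoded stats sample per chunk; take the first
# sample and stop, instead of waiting for the single non-streamed response
# that docker 1.10 never delivers in time.
stats_stream = cli.stats(first_ctr['Id'], stream=True)
print json.loads(next(stats_stream))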
Comment 4 Antonio Murdaca 2016-09-09 08:15:03 EDT
Is this happening with upstream docker and docker-py as well? I doubt this went unnoticed upstream; we probably need to backport a patch for this.
Comment 5 Thomas Wiest 2016-09-09 09:02:13 EDT
@amurdaca  Looks like upstream is experiencing something similar (found by ttomecek):

https://github.com/docker/docker-py/issues/1195
Comment 6 Tomas Tomecek 2016-09-09 10:51:52 EDT
I have found and fixed the issue in 1.10: https://github.com/projectatomic/docker/pull/195. Right now I'm working on the upstream fix.
Comment 7 Tomas Tomecek 2016-09-09 11:15:50 EDT
Upstream PR: https://github.com/docker/docker/pull/25905; my PR is just a backport.
Comment 8 Antonio Murdaca 2016-09-09 11:22:56 EDT
Backported to docker-1.12 and docker-1.10.3 with https://github.com/projectatomic/docker/pull/195.
Comment 11 errata-xmlrpc 2016-11-04 05:09:23 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2634.html
