Bug 1210364

Summary: Embedded kubelet cadvisor fails to enumerate docker containers
Product: [Fedora] Fedora
Component: kubernetes
Version: 22
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Stef Walter <stefw>
Assignee: Jan Chaloupka <jchaloup>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agoldste, eparis, golang-updates, jchaloup, jvance, lsm5, nhorman, smodeel, stefw, vbatts
Fixed In Version: kubernetes-0.15.0-8.fc22
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-06-29 10:14:58 UTC

Description Stef Walter 2015-04-09 14:09:09 UTC
Description of problem:

The cAdvisor embedded in kubelet fails to enumerate docker containers. It fails with the message:

failed to get all Docker containers with error: unable to find data for container /system.slice/docker-....scope

Version-Release number of selected component (if applicable):

kubernetes-0.13.2-0.5.git8d94c43.fc22.x86_64

How reproducible:

Every time. Tried restarting Docker.

Steps to Reproduce:
1. Request http://localhost:4194/api/v1.2/docker
2. Or just go to http://localhost:4194 and click on 'Docker Containers' to see the same thing.
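
The API request above can also be scripted. This is a minimal sketch, assuming the endpoint answers with a JSON object on success and with the plain-text error otherwise, as the actual and expected results in this report suggest. The helper names `classify_body` and `fetch_docker_listing` are hypothetical.

```python
import json
import urllib.error
import urllib.request

# URL taken from the reproduction steps in this report.
DOCKER_URL = "http://localhost:4194/api/v1.2/docker"


def classify_body(body):
    """Return "ok" if the body is a JSON object, "error" for anything else."""
    try:
        parsed = json.loads(body)
    except ValueError:
        # The plain-text "failed to get ..." message is not valid JSON.
        return "error"
    return "ok" if isinstance(parsed, dict) else "error"


def fetch_docker_listing(url=DOCKER_URL, timeout=5):
    """Fetch the container listing and return (classification, body text)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Assumption: the error text may arrive with a non-2xx status code.
        body = err.read().decode("utf-8", "replace")
    return classify_body(body), body
```

Calling `fetch_docker_listing()` on an affected host should then report `"error"` together with the message text, and `"ok"` once the listing works.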

Actual results:

failed to get all Docker containers with error: unable to find data for container /system.slice/docker-9dade0a01bf0368b6c0dad722f7e85af98593dde00873e6abb2cfd86d2d2b7a9.scope
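
To check whether the container named in such a message still exists, the ID can be pulled out of the scope path. A small sketch, assuming the `/system.slice/docker-<id>.scope` layout seen in this report; `container_id_from_error` is a hypothetical helper name.

```python
import re

# Matches the systemd scope path that appears in the error message, e.g.
# /system.slice/docker-<64 hex characters>.scope
SCOPE_RE = re.compile(r"/system\.slice/docker-([0-9a-f]{64})\.scope")


def container_id_from_error(message):
    """Return the 64-character container ID named in the message, or None."""
    match = SCOPE_RE.search(message)
    return match.group(1) if match else None
```

The returned ID can be passed to `docker inspect` to see whether Docker itself still knows the container.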


Expected results:

Output like this:

{"/system.slice/docker-1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1.scope":{"name":"/system.slice/docker-1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1.scope","aliases":["k8s_ruby-helloworld.8ec853e6_frontend-58wav_default_19192267-d9c8-11e4-b2d6-10c37bdb8410_025ed744","1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1"],"namespace":"docker","spec":{"creation_time":"0001-01-01T00:00:00Z","has_c...


Additional info:

When running standalone cadvisor I get the above expected output.

cadvisor-0.10.1-0.1.gitef7dddf.fc22.x86_64

Comment 1 Stef Walter 2015-04-09 14:10:12 UTC
Obviously when switching between kubelet and cadvisor you have to:

$ sudo systemctl stop kubelet
$ sudo systemctl start cadvisor

Otherwise they both try to listen on the same port.

Comment 2 Stef Walter 2015-04-20 11:01:56 UTC
Confirming this is an upstream bug in cadvisor 0.10.1 ... if I build master it goes away.

Comment 3 Stef Walter 2015-04-20 11:11:44 UTC
git bisect is hard since various cAdvisor commits either OOM the entire machine, or don't build at all ... but:

 * 0.11.0 fails with the above error message 
 * 0.12.0 succeeds in this situation

Comment 4 Jan Chaloupka 2015-04-20 11:14:02 UTC
Since kubernetes now integrates cadvisor, I have obsoleted the standalone cadvisor package.

Rawhide already obsoletes cadvisor; f20 through f22 will obsolete it at the end of this week.

Comment 5 Stef Walter 2015-04-20 11:33:07 UTC
The integrated cadvisor in kubernetes has this same problem. The component above is 'kubernetes'.

Comment 6 Stef Walter 2015-04-23 09:50:32 UTC
The workaround is to run cadvisor standalone and put this in /etc/kubernetes/kubelet:

KUBELET_ARGS="--cadvisor_port=4192"
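
The setting presumably moves the embedded cAdvisor to port 4192 so that the standalone cadvisor can keep 4194. A rough sketch for confirming which port a given kubelet environment file configures follows. It assumes the file uses simple `KEY="value"` lines and treats 4194 (the port in the URLs above) as the default. `cadvisor_port` is a hypothetical helper name.

```python
import shlex

# Port seen in the URLs earlier in this report; treated as the default (assumption).
DEFAULT_CADVISOR_PORT = 4194


def cadvisor_port(env_text, default=DEFAULT_CADVISOR_PORT):
    """Return the --cadvisor_port value from KUBELET_ARGS, or the default."""
    port = default
    for line in env_text.splitlines():
        line = line.strip()
        if not line.startswith("KUBELET_ARGS="):
            continue
        value = line.split("=", 1)[1]
        # shlex strips the shell-style quotes; split again to get single flags.
        for arg in " ".join(shlex.split(value)).split():
            if arg.startswith("--cadvisor_port="):
                port = int(arg.split("=", 1)[1])
    return port
```

Feeding it the contents of /etc/kubernetes/kubelet after applying the workaround should yield 4192.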

Comment 7 Stef Walter 2015-05-04 08:09:39 UTC
Seems to be fixed by: kubernetes-0.15.0-8.fc22

Comment 8 Jan Chaloupka 2015-05-04 08:20:57 UTC
Thanks Stef for the message.

Comment 9 Stef Walter 2015-05-07 12:49:19 UTC
Unfortunately I spoke too soon. I'm seeing this error again with an updated Fedora Atomic 22 that includes kubernetes-0.15.0-8.fc22:

failed to get container "" with error: unable to find data for container /system.slice/docker-3ca350475bcc4b2fdb7dbb49b3dab55f7b47505dfcb93b4952d13a397dc4f07a.scope

Comment 10 Jan Chaloupka 2015-05-13 10:59:35 UTC
kubernetes-0.17.0 was released a day ago.

Stef, can you check whether this issue is still valid?

The latest build in koji is kubernetes-0.17.0-3.fc23.

Comment 11 Stef Walter 2015-05-13 14:52:40 UTC
Running 0.17.0 now. Will let you know if I run into the issue again.

Comment 12 Stef Walter 2015-05-13 19:06:36 UTC
0.17.0 doesn't start containers for me. Since you asked, I'll post here, but happy to break this out into another bug if you want.

Mai 13 21:03:50 falcon.thewalter.lan docker[10292]: time="2015-05-13T21:03:50+02:00" level=error msg="Handler for POST /containers/{name:.*}/start returned error: Cannot start container 750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b: [8] System error: Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid."
Mai 13 21:03:50 falcon.thewalter.lan kubelet[10510]: E0513 21:03:50.861443   10510 manager.go:1436] Failed to create pod infra container: API error (500): Cannot start container 750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b: [8] System error: Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid.

docker-1.6.0-3.git9d26a07.fc22.x86_64
kubernetes-0.17.0-3.fc23.x86_64
etcd-2.0.8-0.1.fc22.x86_64

State: Image: mysql is ready, container is creating

Comment 13 Jan Chaloupka 2015-05-14 08:32:07 UTC
Great :) Thanks.

What does "Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid." mean?

Looks like another bug, correct? Could docker be missing support for this unit name?
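
One plausible reading of the message, assuming systemd's documented unit-name rules apply here: the rejected name starts with "/", which is not among the characters allowed in a unit name. The sketch below restates those rules as given in the systemd.unit man page (letters, digits, ":", "-", "_", ".", "\" and "@" for templates, plus a known type suffix). It omits the length limit and is not systemd's own validation code. `is_valid_unit_name` is a hypothetical helper name.

```python
import re

# Unit type suffixes listed in the systemd.unit man page.
UNIT_SUFFIXES = (
    ".service", ".socket", ".device", ".mount", ".automount", ".swap",
    ".target", ".path", ".timer", ".slice", ".scope",
)

# Characters allowed before the suffix; "@" is included for template units.
PREFIX_RE = re.compile(r"^[A-Za-z0-9:_.\\@-]+$")


def is_valid_unit_name(name):
    """Approximate check of a unit name (length limit not enforced)."""
    for suffix in UNIT_SUFFIXES:
        if name.endswith(suffix):
            prefix = name[: -len(suffix)]
            return bool(prefix) and PREFIX_RE.match(prefix) is not None
    return False
```

Under these rules the name from the log fails because of the leading "/", while a name of the form docker-<id>.scope passes.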

> happy to break this out into another bug if you want

What component?

Comment 14 Stef Walter 2015-05-15 06:41:20 UTC
It's likely related to https://github.com/docker/docker/issues/7015

But cAdvisor (and/or kubelet) should account for that.

Comment 15 Jan Chaloupka 2016-02-02 12:11:15 UTC
Does it still occur?

Comment 16 Andy Goldstein 2016-06-24 13:38:19 UTC
Stef, could you please let us know if this is still an issue?

Comment 17 Jan Chaloupka 2016-06-29 10:14:58 UTC
Currently, the f23 stable repository provides the kubernetes-1.2.0-0.18.git4a3f9c5.fc23 build, which is far from 0.17. Docker has also been updated many times since then.

If the issue still persists, please reopen the bug. Closing for now.