Bug 1210364 - Embedded kubelet cadvisor fails to enumerate docker containers
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kubernetes
Version: 22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jan Chaloupka
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-04-09 14:09 UTC by Stef Walter
Modified: 2018-06-04 11:25 UTC (History)

Fixed In Version: kubernetes-0.15.0-8.fc22
Last Closed: 2016-06-29 10:14:58 UTC
Type: Bug

Description Stef Walter 2015-04-09 14:09:09 UTC
Description of problem:

The cAdvisor embedded in kubelet fails to enumerate docker containers. It fails with the message:

failed to get all Docker containers with error: unable to find data for container /system.slice/docker-....scope

Version-Release number of selected component (if applicable):

kubernetes-0.13.2-0.5.git8d94c43.fc22.x86_64

How reproducible:

Every time. Tried restarting Docker.

Steps to Reproduce:
1. Open http://localhost:4194/api/v1.2/docker
2. Alternatively, go to http://localhost:4194 and click on 'Docker Containers' to see the same thing.
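
The check in the steps above can also be scripted. The following is a minimal Python sketch and not part of the original report: the helper names classify_response and fetch_docker_listing are hypothetical, and it assumes the failure text appears verbatim in the response body, as in the message quoted in this report.

```python
import json
import urllib.error
import urllib.request

# URL taken from the reproduction steps in this report.
CADVISOR_DOCKER_URL = "http://localhost:4194/api/v1.2/docker"

# Substring of the error message quoted in this report.
ERROR_MARKER = "unable to find data for container"


def classify_response(body):
    """Return "error" if the body carries the reported failure, "ok" if it
    parses as a JSON object, and "unknown" otherwise."""
    if ERROR_MARKER in body:
        return "error"
    try:
        parsed = json.loads(body)
    except ValueError:
        return "unknown"
    return "ok" if isinstance(parsed, dict) else "unknown"


def fetch_docker_listing(url=CADVISOR_DOCKER_URL, timeout=5.0):
    """Fetch the endpoint and return the body as text, including error bodies."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Assumption: the failure may arrive with a non-2xx status; the body
        # still holds the message we want to inspect.
        return err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(classify_response(fetch_docker_listing()))
```

Running the script against the embedded and the standalone cAdvisor in turn gives a quick way to compare the two without opening a browser.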

Actual results:

failed to get all Docker containers with error: unable to find data for container /system.slice/docker-9dade0a01bf0368b6c0dad722f7e85af98593dde00873e6abb2cfd86d2d2b7a9.scope


Expected results:

Output like this:

{"/system.slice/docker-1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1.scope":{"name":"/system.slice/docker-1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1.scope","aliases":["k8s_ruby-helloworld.8ec853e6_frontend-58wav_default_19192267-d9c8-11e4-b2d6-10c37bdb8410_025ed744","1a0bda816225dc1fcae4e9f0583f982664675f66d523095513ff01694c3311e1"],"namespace":"docker","spec":{"creation_time":"0001-01-01T00:00:00Z","has_c...


Additional info:

When running standalone cadvisor I get the above expected output.

cadvisor-0.10.1-0.1.gitef7dddf.fc22.x86_64

Comment 1 Stef Walter 2015-04-09 14:10:12 UTC
Obviously when switching between kubelet and cadvisor you have to:

$ sudo systemctl stop kubelet
$ sudo systemctl start cadvisor

Otherwise they both try to listen on the same port.
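
Before starting either service, it can help to confirm that nothing is still listening on the shared port. This is a small Python sketch under stated assumptions: the function name port_in_use is hypothetical, and it only probes the IPv4 loopback address, so a listener bound elsewhere would be missed.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 4194 is the cAdvisor port used in the URLs in this report.
    print("port 4194 in use:", port_in_use(4194))
```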

Comment 2 Stef Walter 2015-04-20 11:01:56 UTC
Confirming this is an upstream bug in cadvisor 0.10.1 ... if I build master it goes away.

Comment 3 Stef Walter 2015-04-20 11:11:44 UTC
git bisect is hard since various cAdvisor commits either OOM the entire machine, or don't build at all ... but:

 * 0.11.0 fails with the above error message 
 * 0.12.0 succeeds in this situation

Comment 4 Jan Chaloupka 2015-04-20 11:14:02 UTC
Since kubernetes now integrates cadvisor, I have obsoleted the standalone cadvisor package.

Rawhide already obsoletes cadvisor; f20 through f22 will obsolete it at the end of this week.

Comment 5 Stef Walter 2015-04-20 11:33:07 UTC
The integrated cadvisor in kubernetes has this same problem. The component above is 'kubernetes'.

Comment 6 Stef Walter 2015-04-23 09:50:32 UTC
The workaround is to run cadvisor standalone and put this in /etc/kubernetes/kubelet:

KUBELET_ARGS="--cadvisor_port=4192"

Comment 7 Stef Walter 2015-05-04 08:09:39 UTC
Seems to be fixed by: kubernetes-0.15.0-8.fc22

Comment 8 Jan Chaloupka 2015-05-04 08:20:57 UTC
Thanks Stef for the message.

Comment 9 Stef Walter 2015-05-07 12:49:19 UTC
Unfortunately I spoke too soon. I'm seeing this error again with an updated Fedora Atomic 22 that includes kubernetes-0.15.0-8.fc22:

failed to get container "" with error: unable to find data for container /system.slice/docker-3ca350475bcc4b2fdb7dbb49b3dab55f7b47505dfcb93b4952d13a397dc4f07a.scope

Comment 10 Jan Chaloupka 2015-05-13 10:59:35 UTC
kubernetes-0.17.0 was released a day ago.

Stef, can you check whether this issue is still valid?

The latest build in koji is kubernetes-0.17.0-3.fc23.

Comment 11 Stef Walter 2015-05-13 14:52:40 UTC
Running 0.17.0 now. Will let you know if I run into the issue again.

Comment 12 Stef Walter 2015-05-13 19:06:36 UTC
0.17.0 doesn't start containers for me. Since you asked, I'll post here, but happy to break this out into another bug if you want.

Mai 13 21:03:50 falcon.thewalter.lan docker[10292]: time="2015-05-13T21:03:50+02:00" level=error msg="Handler for POST /containers/{name:.*}/start returned error: Cannot start container 750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b: [8] System error: Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid."
Mai 13 21:03:50 falcon.thewalter.lan kubelet[10510]: E0513 21:03:50.861443   10510 manager.go:1436] Failed to create pod infra container: API error (500): Cannot start container 750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b: [8] System error: Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid.

docker-1.6.0-3.git9d26a07.fc22.x86_64
kubernetes-0.17.0-3.fc23.x86_64
etcd-2.0.8-0.1.fc22.x86_64

State: Image: mysql is ready, container is creating
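
One detail visible in the log lines above is that the rejected unit name begins with a slash, and systemd does not accept "/" in unit names. The Python sketch below illustrates this with a simplified approximation of the naming rules in systemd.unit(5); it is not systemd's own validator, and the function name looks_like_valid_unit_name is hypothetical.

```python
import re

# Simplified approximation of the naming rules in systemd.unit(5):
# ASCII letters, digits, ":", "-", "_", ".", "\" and (for templates) "@",
# followed by a known unit-type suffix. Not a full reimplementation.
_VALID_CHARS = re.compile(r"[A-Za-z0-9:\-_.\\@]+")
_SUFFIXES = (".service", ".socket", ".device", ".mount", ".automount",
             ".swap", ".target", ".path", ".timer", ".slice", ".scope")
_MAX_LEN = 255


def looks_like_valid_unit_name(name):
    """Return True if name passes the simplified checks described above."""
    if not name or len(name) > _MAX_LEN:
        return False
    if not _VALID_CHARS.fullmatch(name):
        return False
    # Require a known suffix with a non-empty prefix in front of it.
    return any(name.endswith(s) and len(name) > len(s) for s in _SUFFIXES)


if __name__ == "__main__":
    container_id = "750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b"
    # Name as it appears in the log lines above.
    print(looks_like_valid_unit_name("/-" + container_id + ".scope"))
    # Name in the docker-<id>.scope form seen earlier in this report.
    print(looks_like_valid_unit_name("docker-" + container_id + ".scope"))
```

The sketch only shows why the name is rejected; it does not explain why the "docker" prefix was replaced by "/" in this setup.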

Comment 13 Jan Chaloupka 2015-05-14 08:32:07 UTC
Great :) Thanks.

What does "Unit name /-750db29d128126ac8f32ed6c9a89abc7c30616280c6b8e32fbf3e5ad6fbda04b.scope is not valid." mean?

Looks like another bug, correct? Could docker be missing support for this unit name?

> happy to break this out into another bug if you want

What component?

Comment 14 Stef Walter 2015-05-15 06:41:20 UTC
It's likely related to https://github.com/docker/docker/issues/7015

But cAdvisor (and/or kubelet) should account for that.

Comment 15 Jan Chaloupka 2016-02-02 12:11:15 UTC
Does it still occur?

Comment 16 Andy Goldstein 2016-06-24 13:38:19 UTC
Stef, could you please let us know if this is still an issue?

Comment 17 Jan Chaloupka 2016-06-29 10:14:58 UTC
Currently, the f23 stable repository provides the kubernetes-1.2.0-0.18.git4a3f9c5.fc23 build, which is far from 0.17. Docker has also been updated many times since then.

If the issue still persists, please reopen the bug. Closing for now.

