Description of problem:
Node logs contain the following error:
--------------
atomic-openshift-node[2601]: E0828 09:13:38.702385 2601 summary.go:102] Failed to get system container stats for "/system.slice/atomic-openshift-node.service": failed to get cgroup stats for "/system.slice/atomic-openshift-node.service": failed to get container info for "/system.slice/atomic-openshift-node.service": unknown container "/system.slice/atomic-openshift-node.service"
--------------

How reproducible:
Seems consistent

Steps to Reproduce:
1. Deploy an OpenShift 3.10 cluster

Actual results:
The error is logged continually. There don't seem to be any severe side effects, but the error adds noise to the log output.

Expected results:
The error should not occur under normal circumstances.
Seth, could you take a look? This is probably coming from cadvisor.
Same here:
# rpm -qa |grep openshift
atomic-openshift-clients-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-docker-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
atomic-openshift-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
atomic-openshift-hyperkube-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-node-3.10.14-1.git.0.ba8ae6d.el7.x86_64
3.10.34 showing a similar error for the docker service:
-----------------
Sep 5 09:31:22 vmlxopencd01 atomic-openshift-node: E0905 09:31:22.086411 18485 summary.go:102] Failed to get system container stats for "/system.slice/atomic-openshift-node.service": failed to get cgroup stats for "/system.slice/atomic-openshift-node.service": failed to get container info for "/system.slice/atomic-openshift-node.service": unknown container "/system.slice/atomic-openshift-node.service"
Sep 5 09:31:22 vmlxopencd01 atomic-openshift-node: E0905 09:31:22.086841 18485 summary.go:102] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
-----------------
Is there a workaround to silence these error messages? They cause a lot of noise in the logs.
Getting the same issue with my deployment of 3.10.34:
Sep 21 13:28:00 tmor-master.x.x.x atomic-openshift-node[83809]: E0921 13:28:00.786348 83809 summary.go:102] Failed to get system container stats for "/system.slice/atomic-openshift-node.service": failed to get cgroup stats for "/system.slice/atomic-openshift-node.service": failed to get container info for "/system.slice/atomic-openshift-node.service": unknown container "/system.slice/atomic-openshift-node.service"
Sep 21 13:28:10 tmor-master.x.x.x atomic-openshift-node[83809]: E0921 13:28:10.826562 83809 summary.go:102] Failed to get system container stats for "/system.slice/atomic-openshift-node.service": failed to get cgroup stats for "/system.slice/atomic-openshift-node.service": failed to get container info for "/system.slice/atomic-openshift-node.service": unknown container "/system.slice/atomic-openshift-node.service"
I have the same issue. Is there any workaround or fix?
Origin PR: https://github.com/openshift/origin/pull/21138
This issue is also present in 3.11. Are there any plans to backport the fix?
Cloned https://bugzilla.redhat.com/show_bug.cgi?id=1643142 to track 3.11 backport
*** Bug 1643142 has been marked as a duplicate of this bug. ***
Verified to be fixed

[root@ip-172-18-6-238 ~]# cat /etc/systemd/system.conf.d/origin-accounting.conf
[Manager]
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes

[root@ip-172-18-6-238 ~]# oc version
oc v3.11.98
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://ip-172-18-6-238.ec2.internal:8443
openshift v3.11.69
kubernetes v1.11.0+d4cacc0

[root@ip-172-18-6-238 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0636
Folks, I just found a solution to this problem in my environment. I edited /etc/systemd/system.conf, uncommented the DefaultBlockIOAccounting line, and set it to yes. After rebooting the system, the problem was solved.

Environment:

user@local:# oc version
oc v3.11.153
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

user@local:~# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)

Hope this information helps you guys.
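For anyone who wants to apply the same settings without editing /etc/systemd/system.conf by hand, here is a rough sketch that writes the drop-in file shown in the verification comment above. The file name and its contents are copied from that comment; the function name write_accounting_dropin and its optional directory argument are made up for illustration, so the function can be pointed at a scratch directory first. This is not the installer's own code, just one possible way to do it.

```shell
#!/bin/sh
# Sketch only: write a systemd manager drop-in that enables default
# CPU, memory and block IO accounting, matching the file shown in the
# verification comment.
#
# write_accounting_dropin is a hypothetical helper name. Its optional
# first argument is the target directory; it defaults to
# /etc/systemd/system.conf.d (writing there requires root).
write_accounting_dropin() {
    conf_dir="${1:-/etc/systemd/system.conf.d}"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/origin-accounting.conf" <<'EOF'
[Manager]
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes
DefaultBlockIOAccounting=yes
EOF
}
```

Run write_accounting_dropin with no argument as root to write to the real location, then reboot, which is what worked in the comment above. Running `systemctl daemon-reexec` followed by a restart of the affected services might be enough instead of a full reboot, but that is an untested assumption here.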