Bug 1741608 - logs are flooded with "failed to get pod stats: failed to get imageFs info: non-existent label "crio-images"" messages
Summary: logs are flooded with "failed to get pod stats: failed to get imageFs info: non-existent label "crio-images"" messages
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.1.z
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.1.z
Assignee: Seth Jennings
QA Contact: MinLi
URL:
Whiteboard:
Depends On: 1710626
Blocks:
 
Reported: 2019-08-15 14:44 UTC by Suresh
Modified: 2020-08-04 17:38 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Fixes an issue where the kubelet would start before cri-o, resulting in misleading "non-existent label crio-images" messages in the kubelet logs.
Clone Of:
Environment:
Last Closed: 2020-07-27 19:45:21 UTC
Target Upstream Version:




Links
Github openshift machine-config-operator pull 1118 (closed): Bug 1741608: logs are flooded with "failed to get pod stats: failed to get imageFs info: non-existent label "crio-images... (last updated 2021-01-21 02:33:46 UTC)
Github openshift machine-config-operator pull 1119 (closed): Bug 1741608: logs are flooded with "failed to get pod stats: failed to get imageFs info: non-existent label "crio-images... (last updated 2021-01-21 02:33:48 UTC)

Description Suresh 2019-08-15 14:44:25 UTC
Description of problem:
The cluster was recently upgraded to 4.1.9.
Since the upgrade, the following message recurs constantly in the UI console:

Generated from kubelet on the worker node.

228 times in the last 19 hours
failed to get imageFs info: non-existent label "crio-images"

Looking at journalctl I see the following messages:
Aug 13 13:53:06 worker.example.com hyperkube[901]: ,StartedAt:2019-08-12 18:38:06 +0000 UTC,FinishedAt:2019-08-12 18:55:30 +0000 UTC,ContainerID:cri-o://fc9f90e123ca14e4d000c9d4c854bb54d43ce96b1624d51bb4b97afd1c728229,}} Ready:true RestartCount:1 Image:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:472dd90bc413a9bcb99be23f7296763468ebbeb985c10b26d1c44c4b04f57a77 ImageID:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:472dd90bc413a9bcb99be23f7296763468ebbeb985c10b26d1c44c4b04f57a77 ContainerID:cri-o://d00c7f3b7c086a5f1930b1006d8f2814f0fae5ee230fd75c37e6605a57e52dee}] QOSClass:Burstable}
Aug 13 13:53:06 worker.example.com hyperkube[901]: I0813 13:53:06.672782     901 volume_manager.go:350] Waiting for volumes to attach and mount for pod "ovs-tk8cx_openshift-sdn(52acd2f6-bd30-11e9-89f9-00505689cbdc)"
Aug 13 13:53:06 worker.example.com hyperkube[901]: I0813 13:53:06.672886     901 volume_manager.go:383] All volumes are attached and mounted for pod "ovs-tk8cx_openshift-sdn(52acd2f6-bd30-11e9-89f9-00505689cbdc)"
Aug 13 13:53:06 worker.example.com hyperkube[901]: I0813 13:53:06.677264     901 kuberuntime_manager.go:585] computePodActions got {KillPod:false CreateSandbox:false SandboxID:3376e3feae37010d139893a1b1beef861eac5acfe6887ce79fde6ffc879b4154 Attempt:1 NextInitContainerToStart:nil ContainersToStart:[] ContainersToKill:map[]} for pod "ovs-tk8cx_openshift-sdn(52acd2f6-bd30-11e9-89f9-00505689cbdc)"
Aug 13 13:53:06 worker.example.com hyperkube[901]: I0813 13:53:06.764355     901 secret.go:182] Setting up volume sdn-token-djmfl for pod 52acd2f6-bd30-11e9-89f9-00505689cbdc at /var/lib/kubelet/pods/52acd2f6-bd30-11e9-89f9-00505689cbdc/volumes/kubernetes.io~secret/sdn-token-djmfl
Aug 13 13:53:06 worker.example.com hyperkube[901]: I0813 13:53:06.768333     901 secret.go:206] Received secret openshift-sdn/sdn-token-djmfl containing (4) pieces of data, 13733 total bytes
Aug 13 13:53:07 worker.example.com hyperkube[901]: I0813 13:53:07.228179     901 prober.go:118] Liveness probe for "ovs-tk8cx_openshift-sdn(52acd2f6-bd30-11e9-89f9-00505689cbdc):openvswitch" succeeded
Aug 13 13:53:08 worker.example.com hyperkube[901]: E0813 13:53:08.019578     901 log_metrics.go:59] failed to get pod stats: failed to get imageFs info: non-existent label "crio-images"
Aug 13 13:53:08 worker.example.com hyperkube[901]: E0813 13:53:08.021206     901 log_metrics.go:59] failed to get pod stats: failed to get imageFs info: non-existent label "crio-images"
Aug 13 13:53:08 worker.example.com hyperkube[901]: I0813 13:53:08.028948     901 server.go:818] GET /metrics: (10.192922ms) 200 [Prometheus/2.7.2 192.168.1.1:58362]
Aug 13 13:53:08 worker.example.com hyperkube[901]: I0813 13:53:08.029593     901 server.go:818] GET /metrics: (8.847761ms) 200 [Prometheus/2.7.2 192.168.1.1:46540]

There doesn't appear to be any visible impact, other than the spam it causes in the logs.

This is possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1700779 where these messages were discussed.

Version-Release number of selected component (if applicable):
Client Version: version.Info{Major:"4", Minor:"1+", GitVersion:"v4.1.9-201907311355+45479a5-dirty", GitCommit:"45479a5", GitTreeState:"dirty", BuildDate:"2019-07-31T21:33:07Z", GoVersion:"go1.11.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.4+8560dd6", GitCommit:"8560dd6", GitTreeState:"clean", BuildDate:"2019-07-31T21:30:54Z", GoVersion:"go1.11.6", Compiler:"gc", Platform:"linux/amd64"}

Comment 15 MinLi 2020-05-28 09:54:51 UTC
verified!

$ oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2020-05-27-042422   True        False         7h4m    Cluster version is 4.1.0-0.nightly-2020-05-27-042422

sh-4.4# cat /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Wants=rpc-statd.service network-online.target crio.service
After=network-online.target crio.service
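
For completeness, the ordering can also be confirmed with a generic systemd query (not specific to this fix), assuming the same node shell as above:

sh-4.4# systemctl show kubelet.service -p Wants -p After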

