Bug 2080253 - deleted pods generate log spam on nodes (cadvisor_stats_provider.go:146] "Unable to fetch pod log stats")
Summary: deleted pods generate log spam on nodes (cadvisor_stats_provider.go:146] "Una...
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.10
Hardware: Unspecified
OS: Unspecified
low
low
Target Milestone: ---
: 4.13.0
Assignee: Peter Hunt
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-04-29 11:14 UTC by Andrei Neagoe
Modified: 2023-01-05 20:17 UTC (History)
4 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-05 20:17:06 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github kubernetes kubernetes pull 108855 0 None open kubelet/stats: deduplicate makePodStorageStats 2022-05-16 17:05:15 UTC

Description Andrei Neagoe 2022-04-29 11:14:31 UTC
Description of problem:
It "seems" that deleted pods are not properly reaped on nodes. The observed behavior is that after deletion, we see the pods added to a list of recurring events that keeps spamming the logs, which looks like this:

Apr 29 08:46:43 worker-01.okdtest.example.com hyperkube[1823]: E0429 08:46:43.005055    1823 cadvisor_stats_provider.go:146] "Unable to fetch pod log stats" err="open /var/log/pods/openshift-multus_ip-reconciler-27519450-mwwrd_f25de9a9-c6a7-423d-93e9-cdd3c7910868: no such file or directory" pod="openshift-multus/ip-reconciler-27519450-mwwrd"
Apr 29 08:46:43 worker-01.okdtest.example.com hyperkube[1823]: E0429 08:46:43.005072    1823 cadvisor_stats_provider.go:146] "Unable to fetch pod log stats" err="open /var/log/pods/openshift-operator-lifecycle-manager_collect-profiles-27519390-lqz2x_73db1aae-8cd8-47f9-b1e0-84d190298903: no such file or directory" pod="openshift-operator-lifecycle-manager/collect-profiles-27519390-lqz2x"
Apr 29 08:46:43 worker-01.okdtest.example.com hyperkube[1823]: E0429 08:46:43.005088    1823 cadvisor_stats_provider.go:146] "Unable to fetch pod log stats" err="open /var/log/pods/openshift-multus_ip-reconciler-27520170-2cn2p_1aff54f0-4ae4-4de4-960d-2e65bcd6f755: no such file or directory" pod="openshift-multus/ip-reconciler-27520170-2cn2p"
Apr 29 08:46:43 worker-01.okdtest.example.com hyperkube[1823]: E0429 08:46:43.005102    1823 cadvisor_stats_provider.go:146] "Unable to fetch pod log stats" err="open /var/log/pods/openshift-multus_ip-reconciler-27519855-q7hpx_ffb55c6b-39da-4b0a-be81-dc7980b98265: no such file or directory" pod="openshift-multus/ip-reconciler-27519855-q7hpx"

As pods come and go, the list continues to grow. These events are reported about 14-15 times per minute, causing excessive log spam. 


Version-Release number of selected component (if applicable):
---
4.10.0-0.okd-2022-04-23-131357
---
crio version 1.23.2
Version:          1.23.2
GitCommit:        c0b2474b80fd0844b883729bda88961bed7b472b
GitTreeState:     clean
BuildDate:        2022-04-14T15:21:44Z
GoVersion:        go1.17.5
Compiler:         gc
Platform:         linux/amd64
Linkmode:         dynamic
BuildTags:        seccomp, selinux
SeccompEnabled:   true
AppArmorEnabled:  false
---

How reproducible:
Always

Steps to Reproduce:
1. Delete a running pod
2. Observe logs on node that was running pod

Actual results:


Expected results:


Additional info:
must-gather available at https://drive.google.com/file/d/1_uzVOyTJp50VZXzRVa2c-413771QJfN0/view?usp=sharing

Comment 1 Andrei Neagoe 2022-04-29 11:17:51 UTC
A node reboot clears the current list of events but as pods get created/destroyed on a node, the list builds up again.

Comment 2 Andrei Neagoe 2022-04-29 11:27:38 UTC
I've also found an upstream ticket about what we're seeing on this version: https://github.com/kubernetes/kubernetes/issues/106957


Note You need to log in before you can comment on or make changes to this bug.