Created attachment 1777314 [details]
Grafana showing gaps in data

Description of problem:
While reviewing pod metrics for issues with some testing we are running, we found gaps in the collected metrics, mainly from worker nodes; however, master nodes also seem to have some gaps.

Version-Release number of selected component (if applicable):
4.8.0.fc.1

How reproducible:
If we rebuild the cluster, I will recheck whether the issue shows up again.

Steps to Reproduce:
1. Deploy a bare metal cluster with Assisted Installer (unclear whether Assisted Installer has anything to do with it; just mentioning how we built this cluster) and review metrics (e.g. container_memory_working_set_bytes in Prometheus)
2.
3.

Actual results:
Gaps in data (the included screenshots show the gap artifacts in Prometheus and Grafana)

Expected results:

Additional info:
An initial review of the logs found only a few errors from the kubelets, which seem to suggest problems with cadvisor getting metrics. Examples:

Apr 29 17:20:33 f19-h03-000-r640 hyperkube[3689]: E0429 17:20:33.611504 3689 cadvisor_stats_provider.go:415] "Partial failure issuing cadvisor.ContainerInfoV2" err="partial failures: [\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podab6d0823_1b04_4f1e_86b9_eefc839041a0.slice/crio-9b80aa737caad2147e903311140991cbec9034ed47b2d825d1efb6bb6498a021.scope\": RecentStats: unable to find data in memory cache], [\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podab6d0823_1b04_4f1e_86b9_eefc839041a0.slice/crio-eab74d53bad7a5ae49d87844c8139c7f0d871f14be024eff9b39cc99ae47ffe9.scope\": RecentStats: unable to find data in memory cache]"

Apr 29 17:20:33 f19-h03-000-r640 hyperkube[3689]: E0429 17:20:33.611599 3689 cadvisor_stats_provider.go:415] "Partial failure issuing cadvisor.ContainerInfoV2" err="partial failures: [\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podab6d0823_1b04_4f1e_86b9_eefc839041a0.slice/crio-eab74d53bad7a5ae49d87844c8139c7f0d871f14be024eff9b39cc99ae47ffe9.scope\": RecentStats: unable to find data in memory cache], [\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podab6d0823_1b04_4f1e_86b9_eefc839041a0.slice/crio-9b80aa737caad2147e903311140991cbec9034ed47b2d825d1efb6bb6498a021.scope\": RecentStats: unable to find data in memory cache]"

Apr 29 18:28:04 f19-h03-000-r640 hyperkube[3689]: E0429 18:28:04.807172 3689 cadvisor_stats_provider.go:151] "Unable to fetch pod etc hosts stats" err="failed to get stats failed command 'du' ($ nice -n 19 du -x -s -B 1) on path /var/lib/kubelet/pods/6c49e7df-ffa3-462c-8bb5-45cda6c3d0c0/etc-hosts with error exit status 1" pod="openshift-dns/node-resolver-4z578"

Apr 29 18:28:14 f19-h03-000-r640 hyperkube[3689]: E0429 18:28:14.981119 3689 cadvisor_stats_provider.go:151] "Unable to fetch pod etc hosts stats" err="failed to get stats failed command 'du' ($ nice -n 19 du -x -s -B 1) on path /var/lib/kubelet/pods/6c49e7df-ffa3-462c-8bb5-45cda6c3d0c0/etc-hosts with error exit status 1" pod="openshift-dns/node-resolver-4z578"
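For anyone who wants to confirm the gaps numerically rather than by eye, the following is a minimal sketch, not something taken from this cluster. It assumes the sample timestamps for a series such as container_memory_working_set_bytes have already been exported as Unix-style seconds, and it assumes a 30 second scrape interval. The function name `find_gaps`, the `tolerance` factor, and the sample values are all hypothetical.

```python
from typing import List, Tuple


def find_gaps(
    timestamps: List[float],
    scrape_interval: float = 30.0,  # assumed scrape interval, in seconds
    tolerance: float = 1.5,         # hypothetical slack before calling it a gap
) -> List[Tuple[float, float]]:
    """Return (last_sample_before_gap, first_sample_after_gap) pairs.

    Two consecutive samples count as a gap when they are further apart
    than scrape_interval * tolerance.
    """
    ordered = sorted(timestamps)
    threshold = scrape_interval * tolerance
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps


# Made-up sample timestamps (seconds), purely for illustration.
samples = [0.0, 30.0, 60.0, 180.0, 210.0, 240.0, 390.0]
print(find_gaps(samples))  # [(60.0, 180.0), (240.0, 390.0)]
```

Running this per node would show whether the missing intervals line up with the timestamps of the kubelet errors above.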
Created attachment 1777315 [details] Prometheus showing gaps in data
This looks very similar to bug 1950993 (see comment [1]). Closing as a DUPLICATE, feel free to reopen if you disagree.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1950993#c4

*** This bug has been marked as a duplicate of bug 1950993 ***