Description of problem:

While running a load of 64 workload pods with guaranteed resources (50m CPU requests and limits, 100 MiB memory requests and limits) on Single Node OpenShift with the DU profile, several pods went into CreateContainerError state:

[root@e32-h15-000-r750 logs]# oc get pods -A | grep boatload | grep -i createContainerError | wc -l
48
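For reference, requests equal to limits for both CPU and memory is what places these pods in the Guaranteed QoS class. A minimal sketch of an equivalent pod spec (the pod and namespace names here are illustrative, not the actual workload definitions; the image and resource values match what appears in the kubelet logs below):

cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: boatload-example        # illustrative name
  namespace: boatload-test      # illustrative namespace
spec:
  containers:
  - name: boatload-1
    image: quay.io/redhat-performance/test-gohttp-probe:v0.0.2
    resources:
      requests:                 # requests == limits => Guaranteed QoS
        cpu: 50m
        memory: 100Mi
      limits:
        cpu: 50m
        memory: 100Mi
EOF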
Events from pod describe:

Events:
  Type     Reason          Age                   From               Message
  ----     ------          ----                  ----               -------
  Normal   Scheduled       83m                   default-scheduler  Successfully assigned boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr to e32-h17-000-r750.stage.rdu2.scalelab.redhat.com
  Normal   AddedInterface  79m                   multus             Add eth0 [10.128.1.163/23] from ovn-kubernetes
  Warning  Failed          58m                   kubelet            Error: Kubelet may be retrying requests that are timing out in CRI-O due to system load: context deadline exceeded: error reserving ctr name k8s_boatload-1_boatload-25-1-boatload-68dc44c494-jdqtr_boatload-25_d11b5ed8-6378-4e67-8452-f6d388b55f7f_6 for id 05864ebfd972fd00ec5a1fe8f0212fcc61f2168f00a14e32f9a21ff60ee67188: name is reserved
  Normal   Pulled          46m (x16 over 79m)    kubelet            Container image "quay.io/redhat-performance/test-gohttp-probe:v0.0.2" already present on machine
  Warning  Failed          4m17s (x32 over 77m)  kubelet            Error: context deadline exceeded

These pods appear to be stuck in the same container-creation loop on the (single) control-plane node:

Apr 12 16:20:36 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8587]: time="2022-04-12 16:20:36.942980216Z" level=info msg="Got pod network &{Name:boatload-25-1-boatload-68dc44c494-jdqtr Namespace:boatload-25 ID:356d5d406da1ef416e8bc61a516862dcb187904abe8960a703955d4ba2782761 UID:d11b5ed8-6378-4e67-8452-f6d388b55f7f NetNS:/var/run/netns/f47a544c-f1e0-4c6a-9c96-0ffd2b5c9607 Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Apr 12 16:20:36 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8587]: time="2022-04-12 16:20:36.943570182Z" level=info msg="Checking pod boatload-25_boatload-25-1-boatload-68dc44c494-jdqtr for CNI network multus-cni-network (type=multus)"
Apr 12 16:20:42 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8587]: time="2022-04-12 16:20:42.661692430Z" level=info msg="Ran pod sandbox 356d5d406da1ef416e8bc61a516862dcb187904abe8960a703955d4ba2782761 with infra container: boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr/POD" id=f42ed5cc-f99b-43f6-b161-c0efdd096c10 name=/runtime.v1.RuntimeService/RunPodSandbox
Apr 12 16:20:42 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: I0412 16:20:42.890978    8696 memory_manager.go:230] "Memory affinity" pod="boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr" containerName="boatload-1" numaNodes=map[0:{}]
Apr 12 16:20:42 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8587]: time="2022-04-12 16:20:42.899581201Z" level=info msg="Creating container: boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr/boatload-1" id=f8556b99-8a6f-4cdb-8$c4-846e02769f65 name=/runtime.v1.RuntimeService/CreateContainer
Apr 12 16:20:46 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: I0412 16:20:46.712492    8696 kubelet.go:2124] "SyncLoop (PLEG): event for pod" pod="boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr" event=&{ID:d11b5ed8-6378-4e67-8452-f6d388b55f7f Type:ContainerStarted Data:356d5d406da1ef416e8bc61a516862dcb187904abe8960a703955d4ba2782761}
Apr 12 16:22:07 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: I0412 16:22:07.482879    8696 kubelet.go:2093] "SyncLoop UPDATE" source="api" pods=[boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr]
Apr 12 16:22:42 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:22:42.916960    8696 kuberuntime_manager.go:919] container &Container{Name:boatload-1,Image:quay.io/redhat-performance/test-gohttp-probe:v0.0.2,Command:[],Args:[],WorkingDir:,Ports:[]ContainerPort{ContainerPort{Name:,HostPort:0,ContainerPort:8000,Protocol:TCP,HostIP:,},},Env:[]EnvVar{EnvVar{Name:PORT,Value:8000,ValueFrom:nil,},EnvVar{Name:LISTEN_DELAY_SECONDS,Value:0,ValueFrom:nil,},EnvVar{Name:LIVENESS_DELAY_SECONDS,Value:0,ValueFrom:nil,},EnvVar{Name:READINESS_DELAY_SECONDS,Value:0,ValueFrom:nil,},EnvVar{Name:RESPONSE_DELAY_MILLISECONDS,Value:0,ValueFrom:nil,},EnvVar{Name:LIVENESS_SUCCESS_MAX,Value:0,ValueFrom:nil,},EnvVar{Name:READINESS_SUCCESS_MAX,Value:0,ValueFrom:nil,},},Resources:ResourceRequirements{Limits:ResourceList{cpu: {{50 -3} {<nil>} 50m DecimalSI},memory: {{104857600 0} {<nil>} 100Mi BinarySI},},Requests:ResourceList{cpu: {{50 -3} {<nil>} 50m DecimalSI},memory: {{104857600 0} {<nil>} 100Mi BinarySI},},},VolumeMounts:[]VolumeMount{VolumeMount{Name:kube-api-access-s4mdb,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[],Drop:[KILL MKNOD SETGID SETUID],},Privileged:nil,SELinuxOptions:nil,RunAsUser:*1002360000,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,ProcMount:nil,WindowsOptions:nil,SeccompProfile:nil,},Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod boatload-25-1-boatload-68dc44c494-jdqtr_boatload-25(d11b5ed8-6378-4e67-8452-f6d388b55f7f): CreateContainerError: context deadline exceeded
Apr 12 16:22:42 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:22:42.917641    8696 pod_workers.go:949] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"boatload-1\" with CreateContainerError: \"context deadline exceeded\"" pod="boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr" podUID=d11b5ed8-6378-4e67-8452-f6d388b55f7f
Apr 12 16:22:43 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: I0412 16:22:43.367147    8696 memory_manager.go:230] "Memory affinity" pod="boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr" containerName="boatload-1" numaNodes=map[0:{}]
Apr 12 16:22:43 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8587]: time="2022-04-12 16:22:43.384482697Z" level=info msg="Creating container: boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr/boatload-1" id=d4f53103-fa96-47d0-8d21-ca44a0661534 name=/runtime.v1.RuntimeService/CreateContainer
Apr 12 16:24:43 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:24:43.373770    8696 kuberuntime_manager.go:919] container &Container{Name:boatload-1,Image:quay.io/redhat-performance/test-gohttp-probe:v0.0.2, [... same Container spec as above ...] StartupProbe:nil,} start failed in pod boatload-25-1-boatload-68dc44c494-jdqtr_boatload-25(d11b5ed8-6378-4e67-8452-f6d388b55f7f): CreateContainerError: context deadline exceeded
Apr 12 16:24:43 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:24:43.373999    8696 pod_workers.go:949] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"boatload-1\" with CreateContainerError: \"context deadline exceeded\"" pod="boatload-25/boatload-25-1-boatload-68dc44c494-jdqtr" podUID=d11b5ed8-6378-4e67-8452-f6d388b55f7f
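Note the pattern: each "Creating container" request is answered exactly two minutes later with CreateContainerError: context deadline exceeded (16:20:42 -> 16:22:42, 16:22:43 -> 16:24:43), which matches the kubelet's default --runtime-request-timeout of 2m. The kubelet gives up on the CRI call before CRI-O finishes, then retries, which is also what produces the "name is reserved" conflicts seen in the events. A rough way to gauge how widespread this is from the node (a sketch, assuming the relevant messages land in the system journal as in the excerpts above):

# timed-out container creations in the last hour
journalctl --since "1 hour ago" | grep -c "CreateContainerError: context deadline exceeded"
# name reservations left behind by timed-out CreateContainer requests
journalctl --since "1 hour ago" | grep -c "name is reserved"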
Additionally, the following is noted. A few entries from "oc get events -A | grep FailedMount":

... "kube-api-access-fndn8" : failed to fetch token: pod "boatload-64-1-boatload-855dd674ff-wkm59" not found
boatload-64   82m    Warning   FailedMount   pod/boatload-64-1-boatload-855dd674ff-wkm59   MountVolume.SetUp failed for volume "kube-api-access-fndn8" : [failed to fetch token: pod "boatload-64-1-boatload-855dd674ff-wkm59" not found, object "boatload-64"/"kube-root-ca.crt" not registered, object "boatload-64"/"openshift-service-ca.crt" not registered]
boatload-7    3m4s   Warning   FailedMount   pod/boatload-7-1-boatload-69fb8c444-qxnrm     MountVolume.SetUp failed for volume "kube-api-access-pqsqq" : failed to fetch token: pod "boatload-7-1-boatload-69fb8c444-qxnrm" not found
boatload-7    84m    Warning   FailedMount   pod/boatload-7-1-boatload-69fb8c444-qxnrm     MountVolume.SetUp failed for volume "kube-api-access-pqsqq" : [failed to fetch token: pod "boatload-7-1-boatload-69fb8c444-qxnrm" not found, object "boatload-7"/"kube-root-ca.crt" not registered, object "boatload-7"/"openshift-service-ca.crt" not registered]
boatload-8    3m4s   Warning   FailedMount   pod/boatload-8-1-boatload-74494575dc-shgv9    MountVolume.SetUp failed for volume "kube-api-access-zg4k9" : failed to fetch token: pod "boatload-8-1-boatload-74494575dc-shgv9" not found
boatload-9    63s    Warning   FailedMount   pod/boatload-9-1-boatload-55948f8bf8-trkfx    MountVolume.SetUp failed for volume "kube-api-access-bxlnk" : failed to fetch token: pod "boatload-9-1-boatload-55948f8bf8-trkfx" not found
boatload-9    84m    Warning   FailedMount   pod/boatload-9-1-boatload-55948f8bf8-trkfx    MountVolume.SetUp failed for volume "kube-api-access-bxlnk" : [failed to fetch token: pod "boatload-9-1-boatload-55948f8bf8-trkfx" not found, object "boatload-9"/"kube-root-ca.crt" not registered, object "boatload-9"/"openshift-service-ca.crt" not registered]

None of these pods are part of the current workload run even though the errors are current.
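A quick way to confirm that is to take every pod named in a FailedMount event and ask the API server whether it still exists (a sketch; the awk field positions assume the default "oc get events -A" column layout of NAMESPACE / LAST SEEN / TYPE / REASON / OBJECT / MESSAGE):

oc get events -A | awk '$4 == "FailedMount" {print $1, $5}' | sort -u |
while read ns obj; do
  # obj has the form pod/<name>; report events whose pod no longer exists
  oc -n "$ns" get "$obj" >/dev/null 2>&1 || echo "stale: $ns/$obj"
done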
Filtering with one of the pods from the above results:

[core@e32-h17-000-r750 ~]$ journalctl --since "1 hour ago" | grep boatload-9-1-boatload-55948f8bf8-trkfx | grep "MountVolume.SetUp failed"
Apr 12 17:06:16 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 17:06:16.018516    8696 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/projected/0f90464c-085b-4d17-90c3-f637b5854e48-kube-api-access-bxlnk podName:0f90464c-085b-4d17-90c3-f637b5854e48 nodeName:}" failed. No retries permitted until 2022-04-12 17:08:18.01849196 +0000 UTC m=+680136.354143944 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "kube-api-access-bxlnk" (UniqueName: "kubernetes.io/projected/0f90464c-085b-4d17-90c3-f637b5854e48-kube-api-access-bxlnk") pod "boatload-9-1-boatload-55948f8bf8-trkfx" (UID: "0f90464c-085b-4d17-90c3-f637b5854e48") : failed to fetch token: pod "boatload-9-1-boatload-55948f8bf8-trkfx" not found

Filtering with one of the pods stuck in CreateContainerError (namespace boatload-25, from the current workload run):

[core@e32-h17-000-r750 ~]$ journalctl --since "1 hour ago" | grep boatload-25-1-boatload-68dc44c494-jdqtr | grep "MountVolume.SetUp failed"
No results
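Because namespaces are recreated with the same names between runs, pod names alone are ambiguous here; filtering the journal by pod UID instead distinguishes the current instance from earlier ones with the same name (a sketch, using the current jdqtr pod's UID from the events above):

journalctl --since "1 hour ago" | grep d11b5ed8-6378-4e67-8452-f6d388b55f7f | grep "MountVolume.SetUp failed"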
The following shows pods from a previously deployed namespace with the same name "boatload-25":

[core@e32-h17-000-r750 ~]$ journalctl --since "2022-04-12 16:20:42" --until "2022-04-12 16:22:42" | grep boatload-25
[core@e32-h17-000-r750 ~]$ journalctl --since "1 hour ago" | grep boatload-25
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.706281    8696 projected.go:199] Error preparing data for projected volume kube-api-access-d9rkn for pod boatload-25/boatload-25-1-boatload-5b9d59cdd4-75w8l: failed to fetch token: pod "boatload-25-1-boatload-5b9d59cdd4-75w8l" not found
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.706335    8696 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/projected/01e3df66-55c0-4ceb-a78d-235b8fea8c9c-kube-api-access-d9rkn podName:01e3df66-55c0-4ceb-a78d-235b8fea8c9c nodeName:}" failed. No retries permitted until 2022-04-12 16:49:58.706319733 +0000 UTC m=+679037.041971704 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "kube-api-access-d9rkn" (UniqueName: "kubernetes.io/projected/01e3df66-55c0-4ceb-a78d-235b8fea8c9c-kube-api-access-d9rkn") pod "boatload-25-1-boatload-5b9d59cdd4-75w8l" (UID: "01e3df66-55c0-4ceb-a78d-235b8fea8c9c") : failed to fetch token: pod "boatload-25-1-boatload-5b9d59cdd4-75w8l" not found
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.729608    8696 projected.go:199] Error preparing data for projected volume kube-api-access-x4977 for pod boatload-25/boatload-25-1-boatload-68dc44c494-g47qj: failed to fetch token: pod "boatload-25-1-boatload-68dc44c494-g47qj" not found
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.732127    8696 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/projected/ab2dcb4f-59b0-4438-be49-38b8876777c8-kube-api-access-x4977 podName:ab2dcb4f-59b0-4438-be49-38b8876777c8 nodeName:}" failed. No retries permitted until 2022-04-12 16:49:58.732111976 +0000 UTC m=+679037.067763951 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "kube-api-access-x4977" (UniqueName: "kubernetes.io/projected/ab2dcb4f-59b0-4438-be49-38b8876777c8-kube-api-access-x4977") pod "boatload-25-1-boatload-68dc44c494-g47qj" (UID: "ab2dcb4f-59b0-4438-be49-38b8876777c8") : failed to fetch token: pod "boatload-25-1-boatload-68dc44c494-g47qj" not found
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.933145    8696 projected.go:199] Error preparing data for projected volume kube-api-access-5t97v for pod boatload-25/boatload-25-1-boatload-68dc44c494-qg74h: failed to fetch token: pod "boatload-25-1-boatload-68dc44c494-qg74h" not found
Apr 12 16:47:56 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:47:56.954693    8696 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/projected/04867431-3220-4d9e-aa41-63a80063cc8f-kube-api-access-5t97v podName:04867431-3220-4d9e-aa41-63a80063cc8f nodeName:}" failed. No retries permitted until 2022-04-12 16:49:58.954655992 +0000 UTC m=+679037.290307962 (durationBeforeRetry 2m2s). Error: MountVolume.SetUp failed for volume "kube-api-access-5t97v" (UniqueName: "kubernetes.io/projected/04867431-3220-4d9e-aa41-63a80063cc8f-kube-api-access-5t97v") pod "boatload-25-1-boatload-68dc44c494-qg74h" (UID: "04867431-3220-4d9e-aa41-63a80063cc8f") : failed to fetch token: pod "boatload-25-1-boatload-68dc44c494-qg74h" not found
Apr 12 16:48:39 e32-h17-000-r750.stage.rdu2.scalelab.redhat.com bash[8696]: E0412 16:48:39.826724    8696 kuberuntime_manager.go:919] container &Container{Name:boatload-1,Image:quay.io/redhat-performance/test-gohttp-probe:v0.0.2, [... same Container spec as in the earlier excerpt ...] StartupProbe:nil,} start failed in pod boatload-25-1-boatload-68dc44c494-jdqtr_boatload-25(d11b5ed8-6378-4e67-8452-f6d388b55f7f): CreateContainerError: context deadline exceeded
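The pod UIDs embedded in these errors (01e3df66-55c0-..., ab2dcb4f-59b0-..., 04867431-3220-...) can be checked against the pods that currently exist in the namespace; if the kubelet volume manager is still reconciling deleted instances, none of those UIDs should appear in the output (a sketch):

# UIDs of the boatload-25 pods that exist right now
oc -n boatload-25 get pods -o custom-columns=UID:.metadata.uid,NAME:.metadata.name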
It appears two separate issues are going on at the same time: one is container creation failing with a "context deadline exceeded" error, probably due to CRI-O being overloaded; the other is FailedMount errors being recorded for containers from previous runs.

Must-gather logs could not be collected because the collect-profiles pod itself ended up in CreateContainerError state.

Version-Release number of selected component (if applicable):
OCP:    4.10.6
Kernel: 4.18.0-305.40.2.rt7.113.el8_4.x86_64
CRI-O:  1.23.2-2.rhaos4.10.git071ae78.el8

How reproducible:
Tried several iterations of 64/74 guaranteed pods on this SNO; almost every run resulted in the same CreateContainerError state. The FailedMount issue was only noticed in the last few runs.

Steps to Reproduce:
1. Deploy Single Node OpenShift with the DU profile.
2. Run 64 workload pods with guaranteed resources (50m CPU, 100 MiB memory, requests == limits).
3. Check for pods stuck in CreateContainerError (oc get pods -A | grep -i createContainerError).

Actual results:
Pods in CreateContainerError state. FailedMount errors from previous pod executions.

Expected results:
Pods run successfully. No MountVolume.SetUp failed errors.
Additional info:

On re-logging into the node, the following was discovered:

[systemd]
Failed Units: 8
  crio-0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802.scope
  crio-9b80393d0ddf73243fa80e4da21dce538cc09f5c0a4de8f642f9ce1c557b1f8e.scope
  crio-b0e9b67d254b87abc6cf6160a18182d3baa535cd8864521d108c847a55d17dea.scope
  crio-b1f5ca9331f7965aea6567442c2458c28543b540e8fff5e558dd11732494d75d.scope
  crio-ef78cfb23f13cb249bd8b373370e4d44b6d93291cc2bc9e806a7c669b3f389cb.scope
  crio-ff9c0cd949d3108cfd62a9133f291b9e392342bd1c537ec44c5cafba8e01e6fb.scope
  NetworkManager-dispatcher.service
  NetworkManager-wait-online.service

[core@e32-h17-000-r750 ~]$ systemctl status crio-0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802.scope
● crio-0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802.scope - libcontainer container 0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802
     Loaded: loaded (/run/systemd/transient/crio-0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802.scope; transient)
  Transient: yes
     Active: failed (Result: timeout) since Mon 2022-04-11 18:26:01 UTC; 24h ago
      Tasks: 5 (limit: 1646556)
     Memory: 8.1M
        CPU: 149ms
     CGroup: /kubepods.slice/kubepods-podb73fed63_ed81_4f9e_bb0f_14e99c345105.slice/crio-0967d803ff70ec27e4d4b815fc3729b7fac8230e45c2d7e47466902258ae6802.scope
             └─541960 /bin/runc init
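All of the failed crio-*.scope units are transient scopes that timed out and are stuck holding a /bin/runc init process. The CGroup path encodes the owning pod's UID (underscores in the kubepods-pod*.slice name map to dashes in metadata.uid), so each failed scope can be tied back to a pod (a sketch):

# b73fed63_ed81_4f9e_bb0f_14e99c345105 in the slice name -> pod UID below
oc get pods -A -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,UID:.metadata.uid \
  | grep b73fed63-ed81-4f9e-bb0f-14e99c345105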
Load average is pretty high on the node with the workload still running:

[core@e32-h17-000-r750 ~]$ uptime
 19:04:15 up 7 days, 22:54,  1 user,  load average: 160.81, 188.23, 176.55
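Since each timed-out scope holds a stuck /bin/runc init task, counting leftover runc init processes gives a rough measure of how many container creations are wedged on the node (a sketch):

# the [/] keeps grep from matching its own command line
ps -eo pid,etime,args | grep '[/]bin/runc init' | wc -l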