Bug 1977100
| Summary: | Pod failed to start with message "set CPU load balancing: readdirent /proc/sys/kernel/sched_domain/cpu66/domain0: no such file or directory" | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ian Miller <imiller> | ||||
| Component: | Node | Assignee: | Peter Hunt <pehunt> | ||||
| Node sub component: | CRI-O | QA Contact: | Sunil Choudhary <schoudha> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | low | ||||||
| Priority: | medium | CC: | achernet, akamra, akurikal, aos-bugs, cgaynor, dmoessne, dornelas, dosman, eparis, fromani, harpatil, imiller, keyoung, kir, minmli, mzheng, nagrawal, pehunt, shaising, shajmakh, tsweeney | ||||
| Version: | 4.8 | Keywords: | Reopened | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 4.11.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 2091524 2102803 (view as bug list) | Environment: | |||||
| Last Closed: | 2022-08-10 10:36:52 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 2091524, 2102803 | ||||||
| Attachments: |
Description
Ian Miller
2021-06-28 21:56:08 UTC
Artyom, can you PTAL? Bugzilla is hard.
Hi Ian, is this issue persistent in your environment, or did it happen only once?
Hi Artyom. It occurred only once and has not been seen since.
Not completed this sprint.
Ian, please feel free to reopen this if you encounter it again.
The issue was seen again during a pod create/delete loop:
Name: dpdk-testpmd-1
Namespace: default
Priority: 0
Node: cnfocto2.ptp.lab.eng.bos.redhat.com/10.16.231.12
Start Time: Fri, 01 Apr 2022 17:45:51 -0400
Labels: <none>
Annotations: cpu-load-balancing.crio.io: disable
cpu-quota.crio.io: disable
irq-load-balancing.crio.io: disable
....
Status: Failed
....
Message: failed to run pre-start hook for container "a0118c046214e70fdaa2c6216a429941e28737658423a225b5961611f415e5b4": set CPU load balancing: lstat /proc/sys/kernel/sched_domain/cpu22/domain1/flags: no such file or directory
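The "lstat ... no such file or directory" message above is the kind of error a directory walk reports when an entry disappears between being listed and being opened. The following is a minimal Python sketch of a walk that tolerates that situation by restarting; it is not CRI-O's code and not the change made in any CRI-O pull request. The function name write_flags, the retries parameter, and the value written are hypothetical, and the demo uses a temporary directory rather than the real /proc/sys/kernel/sched_domain tree.

```python
import os
import tempfile


def _reraise(err):
    # os.walk swallows listing errors by default; surface them instead.
    raise err


def write_flags(root, value, retries=3):
    """Write `value` to every file named `flags` under `root`.

    If an entry vanishes mid-walk (FileNotFoundError), restart the whole
    walk, up to `retries` attempts. Returns the number of files written
    on the attempt that completed.
    """
    for attempt in range(1, retries + 1):
        try:
            written = 0
            for dirpath, _dirs, files in os.walk(root, onerror=_reraise):
                if "flags" not in files:
                    continue
                # No O_CREAT: a vanished file must raise, not be recreated.
                fd = os.open(os.path.join(dirpath, "flags"),
                             os.O_WRONLY | os.O_TRUNC)
                try:
                    os.write(fd, value.encode())
                finally:
                    os.close(fd)
                written += 1
            return written
        except FileNotFoundError:
            if attempt == retries:
                raise  # still racing after the last attempt; give up


if __name__ == "__main__":
    # Stand-in tree with a sched_domain-like layout (hypothetical contents).
    demo_root = tempfile.mkdtemp()
    for cpu in ("cpu0", "cpu1"):
        domain = os.path.join(demo_root, cpu, "domain0")
        os.makedirs(domain)
        with open(os.path.join(domain, "flags"), "w") as fh:
            fh.write("4143")
    print(write_flags(demo_root, "0"))
```

Restarting the walk keeps the sketch simple; a caller that cannot afford repeated writes might instead skip only the entry that vanished.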
It looks like the kernel recreates the sched_domain directories each time a process needs to be rebalanced (this is speculation and I may be wrong), but I can see the directory timestamp change whenever I create a new pod:
1. Create a debug pod for the node and, inside it, run:
sh-4.4# stat -c '%y' /host/proc/sys/kernel/sched_domain/cpu2/
2022-04-05 08:17:44.067865761 +0000
2. Exit.
3. Create a new debug pod on the same node and run the same command again:
stat -c '%y' /host/proc/sys/kernel/sched_domain/cpu2/
2022-04-05 08:22:41.960812800 +0000
So in general there is a race in CRI-O between creating the pod and setting the sched_domain values. Occurrences are rare, but we should still think about how to fix them.
A very recent attempt to fix this was in https://github.com/cri-o/cri-o/pull/5786, which went into cri-o v1.24.0. From what I see, it should indeed fix (or reduce the probability of) this issue.
Since this bug is reported against OpenShift 4.8, I guess we need to backport the fix to cri-o v1.21. Will do.
1.21 backport: https://github.com/cri-o/cri-o/pull/5919
1.22 backport: https://github.com/cri-o/cri-o/pull/5920
1.23 backport: https://github.com/cri-o/cri-o/pull/5921
% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-06-15-222801 True False 47m Cluster version is 4.11.0-0.nightly-2022-06-15-222801
% oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-75-50.us-east-2.compute.internal Ready master,worker 64m v1.24.0+cb71478
% oc debug node/ip-10-0-75-50.us-east-2.compute.internal
Warning: would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/ip-10-0-75-50us-east-2computeinternal-debug ...
…
sh-4.4# cat /etc/kubernetes/openshift-workload-pinning
{
  "management": {
    "cpuset": "0,1"
  }
}
sh-4.4# cat /etc/crio/crio.conf.d/01-workload-partitioning
[crio.runtime.workloads.management]
activation_annotation = "target.workload.openshift.io/management"
annotation_prefix = "resources.workload.openshift.io"
resources = { "cpushares" = 0, "cpuset" = "0-1,10-12" }
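The cpuset values in the two files above ("0,1" and "0-1,10-12") use the comma-and-range list notation for CPU IDs. As an aid to reading them, here is a minimal Python sketch that expands such a string into explicit CPU numbers. The helper name parse_cpuset is hypothetical and is not part of CRI-O or OpenShift.

```python
def parse_cpuset(spec):
    """Expand a CPU list string such as "0-1,10-12" into sorted CPU IDs."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing comma
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"invalid range: {part!r}")
            cpus.update(range(low, high + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpuset("0,1"))
    print(parse_cpuset("0-1,10-12"))
```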
% cat epod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: twocontainers
  annotations:
    cpu-load-balancing.crio.io: disable
spec:
  containers:
  - name: sise
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aa8d1daf3432d8dedc5c56d94aeb1f25301bce6ccd7d5406fb03a00be97374ad
    command:
    - "bin/bash"
    - "-c"
    - "sleep 10000"
    resources:
      limits:
        cpu: "500m"
        memory: "500Mi"
      requests:
        cpu: "400m"
        memory: "400Mi"
% oc create -f epod.yaml
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "sise" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "sise" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "sise" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "sise" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
pod/twocontainers created
% oc get pods
NAME READY STATUS RESTARTS AGE
twocontainers 1/1 Running 0 3s
% oc describe pod twocontainers
Name: twocontainers
Namespace: default
Priority: 0
Node: ip-10-0-75-50.us-east-2.compute.internal/10.0.75.50
Start Time: Fri, 17 Jun 2022 16:12:58 +0530
Labels: <none>
Annotations: cpu-load-balancing.crio.io: disable
k8s.ovn.org/pod-networks:
{"default":{"ip_addresses":["10.128.0.46/23"],"mac_address":"0a:58:0a:80:00:2e","gateway_ips":["10.128.0.1"],"ip_address":"10.128.0.46/23"...
k8s.v1.cni.cncf.io/network-status:
[{
"name": "ovn-kubernetes",
"interface": "eth0",
"ips": [
"10.128.0.46"
],
"mac": "0a:58:0a:80:00:2e",
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "ovn-kubernetes",
"interface": "eth0",
"ips": [
"10.128.0.46"
],
"mac": "0a:58:0a:80:00:2e",
"default": true,
"dns": {}
}]
Status: Running
IP: 10.128.0.46
IPs:
IP: 10.128.0.46
Containers:
sise:
Container ID: cri-o://ef887311fad1f4d6b9d73ec399d7f7f40735b010ba241f79cd6637125d4a6fb0
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aa8d1daf3432d8dedc5c56d94aeb1f25301bce6ccd7d5406fb03a00be97374ad
Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aa8d1daf3432d8dedc5c56d94aeb1f25301bce6ccd7d5406fb03a00be97374ad
Port: <none>
Host Port: <none>
Command:
bin/bash
-c
sleep 10000
State: Running
Started: Fri, 17 Jun 2022 16:13:00 +0530
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hp9j5 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-hp9j5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional: <nil>
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12s default-scheduler Successfully assigned default/twocontainers to ip-10-0-75-50.us-east-2.compute.internal by ip-10-0-75-50
Normal AddedInterface 10s multus Add eth0 [10.128.0.46/23] from ovn-kubernetes
Normal Pulled 10s kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aa8d1daf3432d8dedc5c56d94aeb1f25301bce6ccd7d5406fb03a00be97374ad" already present on machine
Normal Created 10s kubelet Created container sise
Normal Started 10s kubelet Started container sise
Peter, can you please take a look?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:5069
@pehunt please see the prior comment: did this make it into OCP v4.9?