Description of problem:

This bug tracks the hugepages feature for sandboxed containers, including the upstream merge (Kata 2.0) and the subsequent backport downstream.

Following the example in https://docs.openshift.com/container-platform/4.5/scalability_and_performance/what-huge-pages-do-and-how-they-are-consumed-by-apps.html results in pod creation failures.

Version-Release number of selected component (if applicable):
OCP 4.8

How reproducible:
always

Steps to Reproduce:

1) Make sure the node has 100Mi of 2Mi hugepages preallocated.

$ oc describe nodes | grep huge
  hugepages-2Mi:      100Mi

$ cat <<EOF> deploy/huge.yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: hugepages-volume-
spec:
  containers:
  - securityContext:
      privileged: true
    image: rhel7:latest
    command:
    - sleep
    - inf
    name: example
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    resources:
      limits:
        hugepages-2Mi: 100Mi
        memory: "1Gi"
        cpu: "1"
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
  runtimeClassName: kata-oc
EOF

$ oc create -f huge.yaml

$ oc describe pod hugepages-volume-4x5nn
...
Events:
  Type     Reason          Age                From                                Message
  ----     ------          ----               ----                                -------
  Normal   Scheduled       115s               default-scheduler                   Successfully assigned default/hugepages-volume-4x5nn to kataqe-nmfh9-worker-rv2nh
  Normal   AddedInterface  108s               multus                              Add eth0 [14.128.2.19/23]
  Normal   Pulling         13s (x6 over 99s)  kubelet, kataqe-nmfh9-worker-rv2nh  Pulling image "rhel7:latest"
  Normal   Pulled          11s (x6 over 89s)  kubelet, kataqe-nmfh9-worker-rv2nh  Successfully pulled image "rhel7:latest"
  Warning  Failed          8s (x6 over 84s)   kubelet, kataqe-nmfh9-worker-rv2nh  Error: CreateContainer failed: Timeout reached after 3s waiting for device 0:0:0:0/block: unknown
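
As a cross-check for step 1, the preallocated pool can also be read on the worker node itself rather than through the API. The helper below is a minimal sketch, not part of the original reproduction: the function name hugepages_total_mi is hypothetical, and it assumes the standard HugePages_Total and Hugepagesize fields of /proc/meminfo, which describe only the default hugepage size.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): read meminfo-style
# text on stdin and print the size of the default hugepage pool in Mi.
# Assumes the usual /proc/meminfo lines, for example:
#   HugePages_Total:      50
#   Hugepagesize:       2048 kB
hugepages_total_mi() {
    awk '
        /^HugePages_Total:/ { pages = $2 }     # number of preallocated pages
        /^Hugepagesize:/    { size_kb = $2 }   # size of one page, in kB
        END { printf "%d\n", pages * size_kb / 1024 }
    '
}
```

On the node (for example from a debug shell) it would be called as `hugepages_total_mi < /proc/meminfo`. A result smaller than the 100Mi requested in the pod spec would point to a node configuration problem rather than to the Kata runtime.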
For the record, Pradipta mentioned that out-of-tree repos with hugepages support exist for Kata 1.x only:
https://github.com/bpradipt/agent/tree/hugepages
https://github.com/bpradipt/runtime/tree/hugepages
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira. https://issues.redhat.com/browse/OCPBUGS-8829