Description of problem:

As part of the 4.7 Epic SDN-915 (https://issues.redhat.com/browse/SDN-915), the "DownwardAPIHugePages" feature gate needs to be enabled.

To run DPDK applications within a container, the container needs to know how much hugepage memory has been requested. Until now, this information was not available to the container. The following KEP and PR allow the hugepage request and limit values to be passed to the container via the Downward API:

* K8s PR: https://github.com/kubernetes/kubernetes/pull/86102
* K8s KEP: https://github.com/kubernetes/enhancements/pull/2055/files

This feature was added as an alpha feature in K8s and needs to be enabled.
@eparis The associated Epic, SDN-915, is not a blocker Epic, so I marked this as not a blocker and dropped the priority/severity down to high. I only marked it that high because this moved past Feature Freeze while waiting on K8s 1.20 to be pulled in.
This BZ was written in the OCP 4.7 timeframe to turn on a FeatureGate after the 4.7 Feature Freeze date. BZs are required for changes after Feature Freeze, and this functionality landed after Feature Freeze because the feature depends on K8s 1.20, which went into OCP 4.7 after Feature Freeze. This ended up not making OCP 4.7. https://github.com/openshift/api/pull/821 has now been merged for OCP 4.8, so the FeatureGate is now enabled.
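For a release where the gate is not on by default, a disposable test cluster could opt in through the cluster FeatureGate resource. The following is a minimal sketch, not the method used for this BZ: it assumes the CustomNoUpgrade feature set is acceptable for the cluster, and the manifest path is a hypothetical choice for this example.

```shell
# Sketch only: write a FeatureGate manifest that opts in to the alpha gate.
# The file path is a hypothetical choice for this example.
manifest="${TMPDIR:-/tmp}/featuregate-downwardapi-hugepages.yaml"

cat > "$manifest" <<'EOF'
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: CustomNoUpgrade
  customNoUpgrade:
    enabled:
    - DownwardAPIHugePages
EOF

echo "wrote $manifest"

# On a disposable test cluster only
# (CustomNoUpgrade cannot be undone and prevents upgrades):
#   oc apply -f "$manifest"
```

The apply step is left commented out on purpose, since CustomNoUpgrade should not be set on a cluster that needs to be upgraded later.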
This BZ will be tested as a feature under EPIC https://issues.redhat.com/browse/SDN-1491.
Researched the related k8s and openshift docs about the hugepage resource and its support in the downward API. Tested in the latest 4.8.0-0.nightly-2021-03-29-000904 env:

$ cat pod-hugepages-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-example
spec:
  containers:
  - name: abc
    image: gcr.io/google_containers/busybox:1.24
    resources:
      limits:
        hugepages-2Mi: 100Mi
        hugepages-1Gi: 2Gi
        cpu: 200m
      requests:
        cpu: 20m
        hugepages-2Mi: 100Mi
    env:
    - name: MY_LIMITS_HP_1GI
      valueFrom:
        resourceFieldRef:
          resource: limits.hugepages-1Gi
    - name: MY_REQUESTS_HP_2MI
      valueFrom:
        resourceFieldRef:
          resource: requests.hugepages-2Mi
    - name: MY_REQUESTS_CPU
      valueFrom:
        resourceFieldRef:
          resource: requests.cpu

$ oc create -f pod-hugepages-example.yaml
The Pod "hugepages-example" is invalid:
* spec.containers[0].env[0].valueFrom.resourceFieldRef.resource: Unsupported value: "limits.hugepages-1Gi": supported values: "limits.cpu", "limits.ephemeral-storage", "limits.memory", "requests.cpu", "requests.ephemeral-storage", "requests.memory"
* spec.containers[0].env[1].valueFrom.resourceFieldRef.resource: Unsupported value: "requests.hugepages-2Mi": supported values: "limits.cpu", "limits.ephemeral-storage", "limits.memory", "requests.cpu", "requests.ephemeral-storage", "requests.memory"

That is, hugepages via the downward API is still not supported. Then checked whether the build includes the bumped code:

$ oc adm release info --commits registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-03-29-000904
...
  hyperkube    https://github.com/openshift/kubernetes    29a606d...
...
$ cd /data/src/github.com/openshift/kubernetes
$ git pull
$ git checkout -b 4.8.0-0.nightly-2021-03-29-000904 29a606d
$ vi vendor/github.com/openshift/api/config/v1/types_feature.go
var defaultFeatures = &FeatureGateEnabledDisabled{
...
                "SCTPSupport", // sig-network, ccallend
        },

The build in the latest env does not yet include the PR's code, since "DownwardAPIHugePages" is missing from the default features list. So moving to Assigned for the bump.
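The manual inspection with vi above can also be scripted. This is a rough sketch: `gate_in_file` is a hypothetical helper name, and it only checks that the quoted gate name appears somewhere in the file, not which feature set lists it.

```shell
# Hypothetical helper: succeed if the quoted feature gate name appears in the file.
# It does not tell which feature set (default, tech preview, ...) contains it.
gate_in_file() {
  gate="$1"
  file="$2"
  grep -q "\"$gate\"" "$file"
}

# Assumed usage from the root of the openshift/kubernetes checkout:
#   gate_in_file DownwardAPIHugePages \
#     vendor/github.com/openshift/api/config/v1/types_feature.go \
#     && echo "gate present" || echo "gate missing"
```

A match would still need a quick look at the surrounding lines to confirm the gate sits in the default enabled list.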
At the time the feature was tested (in 4.8.0-0.nightly-2021-03-29-000904), the changes were in openshift/api, but they had not yet been vendored into openshift/cluster-kube-apiserver-operator. That has now been merged, so this can now be tested.
Verified in 4.8.0-0.nightly-2021-04-13-171608. The pod using hugepages via the downward API can now be created successfully:

$ oc create -f pod-hugepages-example.yaml
pod/hugepages-example created

I wanted to oc rsh into the pod and check the env vars populated from hugepages via the downward API, but the pod is Pending due to "Insufficient hugepages-1Gi" "Insufficient hugepages-2M". So I am skipping this check and leaving it to the test in the epic from comment 4.
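On a node with enough pre-allocated hugepages for the pod to reach Running, the skipped check could look roughly like the sketch below. `filter_downward_vars` is a hypothetical helper, and the pattern assumes the MY_LIMITS_/MY_REQUESTS_ variable names from the pod spec above.

```shell
# Hypothetical helper: keep only the downward API variables from `env` output.
filter_downward_vars() {
  grep -E '^MY_(LIMITS|REQUESTS)_' | sort
}

# Assumed usage once the pod is Running:
#   oc rsh hugepages-example env | filter_downward_vars
```

The hugepages variables being present with non-empty values would confirm that the kubelet resolved the resourceFieldRef entries, not just that the API server accepted the pod.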
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438