Description of problem:

In a three-node OpenShift Container Platform 4.5.6_1505 cluster, the following pod fails to be scheduled due to insufficient ephemeral-storage:

---- Begin YAML Snippet ----
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  schedulerName: default-scheduler
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    resources:
      requests:
        ephemeral-storage: 4096M
      limits:
        ephemeral-storage: 4096M
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "echo waiting for myservice; sleep 7;"]
    resources:
      requests:
        cpu: 500m
        ephemeral-storage: 2M
        memory: 1024M
---- End YAML Snippet ----

Error:
#######
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/3 nodes are available: 3 Insufficient ephemeral-storage.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/3 nodes are available: 3 Insufficient ephemeral-storage.
#######

From describe node:
#######
Allocatable:
  cpu:                3910m
  ephemeral-storage:  100275095474
#######

It was working in OCP 4.3 and OCP 4.4, and it does work if the ephemeral-storage references are removed from the "initContainers" definition while leaving the ephemeral-storage reference on the regular container.

Version-Release number of selected component (if applicable):
OCP 4.5.6_1505

How reproducible:
In the customer environment, every time.

Steps to Reproduce:
1. Create a pod with the YAML described above (oc create -f <file>)

Actual results:
Fails to deploy: Insufficient ephemeral-storage

Expected results:
The pod should start, as the nodes have ephemeral-storage available

Additional info:
Not able to reproduce it. Tested on vanilla clusters of the following versions:
- 4.6.0-0.ci-2020-10-06-161728
- 4.5.14

```
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests          Limits
  --------                    --------          ------
  cpu                         844m (24%)        0 (0%)
  memory                      2313748480 (14%)  512Mi (3%)
  ephemeral-storage           4096M (3%)        4096M (3%)
  hugepages-1Gi               0 (0%)            0 (0%)
  hugepages-2Mi               0 (0%)            0 (0%)
  attachable-volumes-aws-ebs  0                 0
```

Can you reproduce it in your environment and share the cluster?
@jnordell Can this be reopened?

To recreate this bug you need the LocalStorageCapacityIsolation feature-gate disabled and initContainers with ephemeral-storage resource requests or limits. This problem was found on IBM Cloud Red Hat OpenShift, which is apparently out of sync with the Red Hat offering - the customer was using a feature which they thought was enabled but in fact was not. Even though the limits obviously won't be enforced, this presents a migration problem, since such pods are not schedulable in 4.5.

We have duplicated this behavior in Kubernetes 1.18 through 1.20; Kubernetes 1.17 does not have this behavior. Also opened https://github.com/kubernetes/kubernetes/issues/96083.

IBM Cloud Red Hat OpenShift Service Development
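For anyone following along, the mechanics appear to live in the scheduler's resource accounting. Below is a simplified Go sketch (paraphrased with toy types; not the verbatim upstream code) of how the two container paths diverge: regular containers are summed through a gated Add(), while init containers go through SetMaxResource(), which before the fix did not consult the gate.

```go
// Simplified sketch of the scheduler's pod resource accounting
// (paraphrased; not the verbatim upstream code). Regular containers are
// summed via Add(), which skips ephemeral storage when the
// LocalStorageCapacityIsolation gate is off (node allocatable is built
// through the same gated path, so it also reads 0). Init containers go
// through SetMaxResource(), which before the fix did not consult the
// gate, so their ephemeral-storage request (2M here) leaked into the pod
// total, and 2M > 0 allocatable failed on every node.
package sketch

type Resource struct {
	MilliCPU         int64
	Memory           int64
	EphemeralStorage int64
}

// The feature gate, disabled on the affected clusters.
var localStorageCapacityIsolation = false

// Add accumulates regular-container requests; ephemeral storage is gated.
func (r *Resource) Add(req Resource) {
	r.MilliCPU += req.MilliCPU
	r.Memory += req.Memory
	if localStorageCapacityIsolation {
		r.EphemeralStorage += req.EphemeralStorage
	}
}

// SetMaxResource takes the max over init-container requests. The pre-fix
// version had no gate check here; this line is the bug.
func (r *Resource) SetMaxResource(req Resource) {
	r.MilliCPU = maxInt64(r.MilliCPU, req.MilliCPU)
	r.Memory = maxInt64(r.Memory, req.Memory)
	r.EphemeralStorage = maxInt64(r.EphemeralStorage, req.EphemeralStorage)
}

func maxInt64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}
```

This would explain why dropping the ephemeral-storage request from the initContainers (but keeping it on the regular container) makes the pod schedulable again.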
@jnordell Following up on @jmcmeek.com's request - can we get this bugzilla reopened now that we know the exact recipe to reproduce it? A fix has already been merged into upstream K8s and backported to 1.18 and 1.19 with a PR tag of priority/critical-urgent. It will make its way into the upstream K8s patch releases 1.18.11 and 1.19.4 per this issue (https://github.com/p7t/actus/issues/265). I would like to get this reproduced and to understand how the fix can flow into the OpenShift 4.5 and 4.6 release streams once you are able to recreate the issue.
The fix gets picked into 4.7 with the next rebase of 1.20. Once done, this issue can be cloned for 4.6 and 4.5.
The 1.20.0 rebase is merged: https://github.com/openshift/kubernetes/pull/471. The fix is now visible at https://github.com/openshift/kubernetes/blob/master/pkg/scheduler/framework/types.go#L394-L397.
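For reference, the guarded logic at the linked lines looks roughly like this (a simplified paraphrase using the same toy types as the sketch in the earlier comment; not the verbatim upstream code):

```go
// Post-fix behavior: SetMaxResource now consults the
// LocalStorageCapacityIsolation gate before folding an init container's
// ephemeral-storage request into the pod's scheduling total, matching
// what Add() already did for regular containers.
func (r *Resource) SetMaxResource(req Resource) {
	r.MilliCPU = maxInt64(r.MilliCPU, req.MilliCPU)
	r.Memory = maxInt64(r.Memory, req.Memory)
	if localStorageCapacityIsolation { // the new guard
		r.EphemeralStorage = maxInt64(r.EphemeralStorage, req.EphemeralStorage)
	}
}
```

With the guard in place, a disabled gate means init-container ephemeral-storage requests are ignored consistently on both the pod side and the node side.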
Hi Jonas, I see the customer case is closed. How severe is the issue for the customer? Will it be sufficient to backport the fix to 4.6, or is 4.5 still relevant?
I am not sure. Maybe @jmcmeek.com or @joshisa.com could answer this?
@jchaloup @jnordell Since this fix enables our clients to use IBM Cloud Managed OpenShift, backports for both 4.5 and 4.6 are needed so that IBM Cloud can eventually adopt them.

Rationale: 4.5 will be supported on IBM Cloud (tentatively) until Aug 2021, and 4.6 has extended update support. This leaves a large exposure window for clients interested in IBM Cloud. Without backports, support on IBM Cloud has an awkward gap between 4.4 (which goes out of support in 1H 2021) and 4.7, which will impact solutioning and the client experience on IBM Cloud. Thanks.
Hi Jonas, I am trying to reproduce the bug on a 4.5 cluster but am unable to do so. Below are the steps I performed; do I need to perform any additional steps to be able to reproduce the issue?

Steps performed:
================
1) Install a 4.5 nightly build
2) oc create -f /tmp/ephermal.yaml

[knarra@knarra openshift-client-linux-4.5.0-0.nightly-2021-01-05-234719]$ ./oc describe node ip-10-0-217-111.us-east-2.compute.internal | grep "ephemeral-storage"
  ephemeral-storage:  125277164Ki
  ephemeral-storage:  114381692328
  ephemeral-storage   4096M (3%)  4096M (3%)

YAML definition:
================
[knarra@knarra openshift-client-linux-4.5.0-0.nightly-2021-01-05-234719]$ cat /tmp/ephermal.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  schedulerName: default-scheduler
  containers:
  - name: nginx
    image: quay.io/openshifttest/nginx@sha256:3936fb3946790d711a68c58be93628e43cbca72439079e16d154b5db216b58da
    ports:
    - containerPort: 80
    resources:
      requests:
        ephemeral-storage: 4096M
      limits:
        ephemeral-storage: 4096M
  initContainers:
  - name: init-myservice
    image: quay.io/openshifttest/busybox@sha256:afe605d272837ce1732f390966166c2afff5391208ddd57de10942748694049d
    command: ['sh', '-c', "echo waiting for myservice; sleep 7;"]
    resources:
      requests:
        cpu: 500m
        ephemeral-storage: 2M
        memory: 1024M
@knarra: You need to also make sure that you have the LocalStorageCapacityIsolation feature-gate disabled on the cluster. See comment 9: https://bugzilla.redhat.com/show_bug.cgi?id=1886294#c9. If that is disabled, then success can be validated by the successful scheduling and running of the ephermal.yaml pod that you have created. Hope this helps.
(In reply to Sanjay Joshi from comment #20)
> @knarra: You need to also make sure that you have the
> LocalStorageCapacityIsolation feature-gate disabled on the cluster. See
> comment 9: https://bugzilla.redhat.com/show_bug.cgi?id=1886294#c9. If
> that is disabled, then success can be validated by the successful
> scheduling and running of the ephermal.yaml pod that you have created.
> Hope this helps.

Thanks for the quick reply. I will try to enable this feature-gate and try again, and will clear the needinfo on Jonas as I have the required input.
(In reply to RamaKasturi from comment #21)
> (In reply to Sanjay Joshi from comment #20)
> > @knarra: You need to also make sure that you have the
> > LocalStorageCapacityIsolation feature-gate disabled on the cluster.
> > See comment 9: https://bugzilla.redhat.com/show_bug.cgi?id=1886294#c9.
> > If that is disabled, then success can be validated by the successful
> > scheduling and running of the ephermal.yaml pod that you have created.
> > Hope this helps.
>
> Thanks for the quick reply. I will try to enable this feature-gate and
> try again, and will clear the needinfo on Jonas as I have the required
> input.

:-) Cool. Just to confirm - the feature-gate needs to be "disabled" on the cluster. When the feature-gate is enabled (which I think is the default on OpenShift clusters), things work fine and the K8s bug does not surface (e.g. the example pod schedules and runs fine). IBM Cloud has chosen to disable this feature-gate for long-term update/maintenance considerations.
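To make the enabled/disabled asymmetry concrete, here is a toy walk-through in the same simplified sketch package from the earlier comment. The numbers come from the original report; the whole thing is illustrative, not real scheduler code:

```go
// Toy fit check for the reported pod against one node, evaluated once
// with the gate on and once with it off, using the pre-fix
// SetMaxResource from the sketch in the earlier comment.
func fitsEphemeralStorage(gateEnabled bool) bool {
	localStorageCapacityIsolation = gateEnabled

	var node Resource
	node.Add(Resource{EphemeralStorage: 100275095474}) // node allocatable goes through the same gated Add

	var pod Resource
	pod.Add(Resource{EphemeralStorage: 4096000000})         // nginx container: 4096M
	pod.SetMaxResource(Resource{EphemeralStorage: 2000000}) // init container: 2M (ungated, pre-fix)

	return pod.EphemeralStorage <= node.EphemeralStorage
}

// fitsEphemeralStorage(true)  == true:  ~100G allocatable easily covers 4096M.
// fitsEphemeralStorage(false) == false: allocatable reads 0, but the init
// container's 2M still leaks in, so every node reports "Insufficient
// ephemeral-storage".
```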
Verified the bug with the payload below and do not see the issue happening. Below are the steps followed to verify the bug.

Test steps followed to verify the bug:
======================================
1) Log in to one master node, edit /etc/kubernetes/manifests/kube-scheduler-pod.yaml and add the feature gate LocalStorageCapacityIsolation=false to the --feature-gates=<line> args, then wait for the kube-scheduler pod on that node to restart.
2) Repeat the step for every other master.
3) Now create a pod by running oc create -f /tmp/ephermal.yaml

[knarra@knarra openshift-client-linux-4.5.0-0.nightly-2021-01-05-234719]$ cat /tmp/ephermal.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  schedulerName: default-scheduler
  containers:
  - name: nginx
    image: quay.io/openshifttest/nginx@sha256:3936fb3946790d711a68c58be93628e43cbca72439079e16d154b5db216b58da
    ports:
    - containerPort: 80
    resources:
      requests:
        ephemeral-storage: 4096M
      limits:
        ephemeral-storage: 4096M
  initContainers:
  - name: init-myservice
    image: quay.io/openshifttest/busybox@sha256:afe605d272837ce1732f390966166c2afff5391208ddd57de10942748694049d
    command: ['sh', '-c', "echo waiting for myservice; sleep 7;"]
    resources:
      requests:
        cpu: 500m
        ephemeral-storage: 2M
        memory: 1024M

4) We can see that the pod is scheduled and running fine.

[knarra@knarra openshift-client-linux-4.7.0-0.nightly-2021-01-07-034013]$ ./oc get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE                                        NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          12s   10.129.2.61   ip-10-0-142-44.us-east-2.compute.internal   <none>           <none>

Tried steps 1 to 4 on a 4.5 cluster, and there I could reproduce the issue: the pod is stuck in the Pending state.

[knarra@knarra openshift-client-linux-4.5.0-0.nightly-2021-01-05-234719]$ ./oc describe pod nginx
Name:         nginx
Namespace:    default
Priority:     0
Node:         <none>
Labels:       name=nginx
Annotations:  <none>
Status:       Pending
IP:
IPs:          <none>
Init Containers:
  init-myservice:
    Image:      quay.io/openshifttest/busybox@sha256:afe605d272837ce1732f390966166c2afff5391208ddd57de10942748694049d
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      echo waiting for myservice; sleep 7;
    Requests:
      cpu:                500m
      ephemeral-storage:  2M
      memory:             1024M
    Environment:          <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lp7r9 (ro)
Containers:
  nginx:
    Image:      quay.io/openshifttest/nginx@sha256:3936fb3946790d711a68c58be93628e43cbca72439079e16d154b5db216b58da
    Port:       80/TCP
    Host Port:  0/TCP
    Limits:
      ephemeral-storage:  4096M
    Requests:
      ephemeral-storage:  4096M
    Environment:          <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lp7r9 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-lp7r9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lp7r9
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  4h16m  default-scheduler  0/6 nodes are available: 6 Insufficient ephemeral-storage.

[knarra@knarra openshift-client-linux-4.5.0-0.nightly-2021-01-05-234719]$ ./oc get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE     IP       NODE     NOMINATED NODE   READINESS GATES
nginx   0/1     Pending   0          4h17m   <none>   <none>   <none>           <none>

Based on the above, moving the bug to the verified state.
Jonas, the fix is not going to be backported to 4.5. The 4.5 release is in its maintenance phase now, and only urgent/high-severity bugs and CVEs are fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633