Description of problem:
OpenShift allows users to create in-memory EmptyDir volumes for their pods (by setting "medium: Memory"), which translates into a tmpfs file system mounted inside the container. However, the API does not allow limiting the size of these tmpfs filesystems, which defaults to half of the node's RAM (the usual Linux default). This could lead to memory exhaustion on the node where the pods are running.

Version-Release number of selected component (if applicable):
3.3.0

How reproducible:
Always

Steps to Reproduce:
1. Start a pod with an EmptyDir (medium: Memory) and set a memory limit.
2. Rsh into the pod and use dd to create a file on the EmptyDir bigger than the pod's memory limit.
3. The pod will be restarted.
4. Repeat steps 1-2.

Actual results:
The EmptyDir keeps all the files, and based on the kernel documentation [1][2], this could potentially lead to memory exhaustion on the host.

Expected results:
- It should be possible to set a fixed size for an in-memory EmptyDir.
- It should be possible to limit the size of all the in-memory EmptyDir volumes defined in each user pod via limits (as we can currently limit RAM and CPU).
- It should be possible to limit how many GB a user can allocate for in-memory EmptyDir volumes via quota.

Additional info:
References:
[1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
[2] https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt
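As a quick illustration of the default mentioned above (a sketch, assuming a Linux host; the actual value is node-specific): when no explicit size is given, the kernel sizes a tmpfs at half of physical RAM, which can be computed from /proc/meminfo.

```shell
# Sketch: an in-memory EmptyDir with no explicit size gets the kernel's tmpfs
# default of half of physical RAM. Compute that default from /proc/meminfo.
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "default tmpfs size for this node: $(( mem_kib / 2 )) KiB"
```

This is exactly the value reported by df for a memory-backed EmptyDir mounted without a size cap.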
To address this issue, we need the pod-level cgroup hierarchy planned for Kubernetes 1.6.
I just verified that having pod cgroups enabled on the node in 3.6 (it is enabled by default) enforces the memory limit with respect to memory-backed EmptyDirs.

[root@test ~]# cat busybox.yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    resources:
      limits:
        memory: 1Gi
        cpu: 1
    command:
    - dd
    - if=/dev/zero
    - of=/mnt/zero
    - bs=1M
    - count=2000
    volumeMounts:
    - name: myvol
      mountPath: /mnt
  terminationGracePeriodSeconds: 0
  volumes:
  - name: myvol
    emptyDir:
      medium: Memory

[root@test ~]# oc create -f busybox.yaml
pod "busybox" created

[root@test ~]# oc describe pod | grep -A 5 "Last State"
    Last State:   Terminated
      Reason:     OOMKilled
      Exit Code:  137
      Started:    Fri, 05 May 2017 17:10:42 +0000
      Finished:   Fri, 05 May 2017 17:10:42 +0000
    Ready:        False

# mount | grep myvol
tmpfs on /var/lib/origin/openshift.local.volumes/pods/c296d664-31b5-11e7-a96c-fa163e71bc65/volumes/kubernetes.io~empty-dir/myvol type tmpfs (rw,relatime,seclabel)

[root@test ~]# cd /var/lib/origin/openshift.local.volumes/pods/c296d664-31b5-11e7-a96c-fa163e71bc65/volumes/kubernetes.io~empty-dir/myvol
[root@test myvol]# ls -alh
total 1023M
drwxrwsrwt. 2 root       1000040000    60 May  5 17:10 .
drwxr-xr-x. 3 root       root          19 May  5 17:10 ..
-rw-r--r--. 1 1000040000 1000040000 1023M May  5 17:11 zero

Even though the pod tries to write a 2Gi file, it is OOMKilled when the file reaches 1Gi in size, i.e. the memory limit set on the container.
Upstream PR: https://github.com/kubernetes/kubernetes/pull/41349
Included in Origin 1.6.1 rebase: https://github.com/openshift/origin/pull/13653
Tested on OCP 3.6 (openshift v3.6.79, kubernetes v1.6.1+5115d708d7, etcd 3.1.0); the EmptyDir no longer exhausts memory. Moving the bug to VERIFIED, thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188
You can use sizeLimit; I have verified it already. Even though df -h inside the container shows the EmptyDir as 128G, you can only use the space under the sizeLimit.

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: gcr.io/google_containers/busybox:1.24
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        memory: 1Gi
        cpu: 1
    command: ['sh', '-c', 'echo Hello Kubernetes!>/test-pd/mfltest.txt && sleep 3600']
    ports:
    - containerPort: 80
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    emptyDir:
      medium: Memory
      sizeLimit: "1M"

After entering the container, you can verify by typing:

dd if=/dev/zero of=/test-pd/zero bs=1M count=10

The container exits. If you have further questions, please let me know.

fanlong_meng